Body and Framing
HTTP/1.1 messages optionally carry a message body after the header section. The critical question for any parser is: where does the body end? Getting this wrong is the root cause of HTTP request smuggling.
When Is a Body Present?
- Requests: a body is present if Content-Length or Transfer-Encoding is set. GET, HEAD, DELETE, and OPTIONS typically have no body (though the spec doesn’t forbid it).
- Responses: all responses to HEAD requests and all 1xx, 204, and 304 responses have no body. Everything else may have a body.
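The response rule above can be sketched as a small predicate. This is only an illustration; the function name `response_has_body` is hypothetical and not taken from any library.

```python
def response_has_body(request_method: str, status_code: int) -> bool:
    """Return False when the response is defined to carry no body."""
    # Responses to HEAD never carry a body, whatever the status code.
    if request_method == "HEAD":
        return False
    # 1xx (informational), 204 (No Content) and 304 (Not Modified) have no body.
    if 100 <= status_code < 200 or status_code in (204, 304):
        return False
    # Everything else may carry a body; framing headers decide its length.
    return True
```

A parser would typically call this before looking at Content-Length or Transfer-Encoding, so that a HEAD response advertising a length is not mistaken for one that actually carries bytes.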
Content-Length
The Content-Length header declares the exact size of the body in bytes as a decimal integer:
POST /data HTTP/1.1
Host: example.com
Content-Type: text/plain
Content-Length: 13

Hello, World!

The parser reads exactly 13 bytes after the empty line. On a persistent connection, the bytes that follow begin the next message; otherwise the connection ends.
Rules
- The value MUST be a non-negative decimal integer.
- No leading zeros: strict parsers reject Content-Length: 007. The RFC 9110 grammar (1*DIGIT) technically admits leading zeros, so this is a hardening choice rather than a spec requirement.
- No signs: Content-Length: +13 and Content-Length: -1 are invalid.
- No whitespace within the value: Content-Length: 1 3 is invalid.
- If Content-Length doesn’t match the actual body size, the message is malformed. The server SHOULD close the connection.
- Multiple Content-Length headers are allowed only if all values are identical. If they differ, the message is malformed and MUST be rejected.
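A minimal sketch of a validator that applies the rules listed above. The name `parse_content_length` is hypothetical, and the sketch assumes the caller has already collected every Content-Length field value of the message into a list. It is deliberately stricter than necessary in one respect: a comma-separated list inside a single field value is rejected outright.

```python
import re

# ASCII digits only; "\d" would also match non-ASCII Unicode digits.
_DIGITS = re.compile(r"[0-9]+")


def parse_content_length(values: list[str]) -> int:
    """Validate all Content-Length values of one message and return the size.

    Raises ValueError for anything the rules above call invalid.
    """
    if not values:
        raise ValueError("no Content-Length field")
    sizes = set()
    for raw in values:
        # Optional whitespace around the field value is allowed.
        value = raw.strip(" \t")
        if not _DIGITS.fullmatch(value):
            raise ValueError(f"not a decimal integer: {raw!r}")
        if len(value) > 1 and value[0] == "0":
            raise ValueError(f"leading zero: {raw!r}")
        sizes.add(int(value))
    if len(sizes) != 1:
        raise ValueError("conflicting Content-Length values")
    return sizes.pop()
```

For example, `parse_content_length(["13", "13"])` returns 13, while `["13", "14"]`, `["+13"]`, `["1 3"]`, and `["007"]` all raise `ValueError`.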
Why Strictness Matters
Lenient parsing of Content-Length is a common source of vulnerabilities:
- Content-Length: 0x0d is 13 bytes if parsed as hex and invalid if parsed as decimal. A parser mismatch between front-end and back-end enables smuggling.
- Content-Length: 13, 14 is a list of two differing values. One parser might take the first, another the last.
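To illustrate how easily leniency creeps in, the sketch below compares three hypothetical length parsers built on Python's `int()`. The function names are made up for this example; the point is only that general-purpose integer conversion accepts far more than a run of decimal digits.

```python
def lenient_length(value: str) -> int:
    # int() tolerates signs, surrounding whitespace and digit-group underscores.
    return int(value)


def hex_tolerant_length(value: str) -> int:
    # Base 0 additionally honours prefixes such as "0x".
    return int(value, 0)


def strict_length(value: str) -> int:
    # Only a non-empty run of ASCII digits is accepted.
    if not (value.isascii() and value.isdigit()):
        raise ValueError(f"invalid Content-Length: {value!r}")
    return int(value)
```

Here `lenient_length("+13")`, `lenient_length(" 13 ")`, and `lenient_length("1_3")` all return 13, and `hex_tolerant_length("0x0d")` returns 13 as well, whereas `strict_length` raises `ValueError` for each of those inputs. Two components that differ this way will disagree about where a body ends.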
Chunked Transfer Encoding
When the total body size is unknown at the time headers are sent (streaming, server-generated content, compression), HTTP/1.1 uses chunked transfer encoding.
Format
chunk-size (hex) CRLF
chunk-data CRLF
...
0 CRLF
[ trailer-section ]
CRLF

Each chunk starts with the chunk size in hexadecimal, followed by CRLF, then exactly that many bytes of data, followed by CRLF. A zero-length chunk signals the end of the body.
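The format above can be produced with a few lines of code. The following is a minimal sketch of an encoder; `encode_chunked` and its parameters are hypothetical names chosen for this example, and trailer fields are written without any validation.

```python
from collections.abc import Iterable
from typing import Optional


def encode_chunked(pieces: Iterable[bytes],
                   trailers: Optional[dict[str, str]] = None) -> bytes:
    """Encode body pieces as a chunked message body."""
    out = bytearray()
    for piece in pieces:
        if not piece:
            # An empty piece would be read as the terminating chunk, so skip it.
            continue
        out += b"%x\r\n" % len(piece)  # chunk size in hexadecimal
        out += piece + b"\r\n"         # chunk data followed by CRLF
    out += b"0\r\n"                    # zero-length chunk ends the body
    for name, value in (trailers or {}).items():
        out += f"{name}: {value}\r\n".encode("ascii")
    out += b"\r\n"                     # final CRLF closes the message
    return bytes(out)
```

Because each piece is framed independently, the sender never needs to know the total body size in advance.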
Full Example
HTTP/1.1 200 OK
Transfer-Encoding: chunked

4\r\n
Wiki\r\n
7\r\n
pedia i\r\n
9\r\n
n chunks.\r\n
0\r\n
\r\n

Decoded body: Wikipedia in chunks.
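A minimal decoder sketch for a chunked body that is already fully buffered in memory. `decode_chunked` is a hypothetical name; the sketch ignores chunk extensions, skips over trailer fields without interpreting them, and raises `ValueError` on truncated or malformed input.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"


def decode_chunked(data: bytes) -> tuple[bytes, bytes]:
    """Decode a chunked body; return (body, bytes left after the message)."""
    body = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)  # ValueError if the line is incomplete
        size_field = data[pos:eol].split(b";", 1)[0]  # drop any extensions
        if not size_field or any(c not in HEX_DIGITS for c in size_field):
            raise ValueError("invalid chunk size")
        size = int(size_field, 16)
        pos = eol + 2
        if size == 0:
            break  # the zero-length chunk ends the body
        end = pos + size
        if len(data) < end + 2 or data[end:end + 2] != b"\r\n":
            raise ValueError("chunk data not followed by CRLF")
        body += data[pos:end]
        pos = end + 2
    # Skip trailer fields until the empty line that closes the message.
    while True:
        eol = data.index(b"\r\n", pos)
        line, pos = data[pos:eol], eol + 2
        if not line:
            return bytes(body), data[pos:]
```

Fed the chunks Wiki, pedia i, and n chunks. with matching hex sizes, it returns the body Wikipedia in chunks. and an empty remainder. A size that does not match the data makes the CRLF check fail instead of silently misframing the stream.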
Chunk Extensions
A chunk-size may be followed by semicolon-separated extensions:
a;ext-name=ext-value\r\n
0123456789\r\n

Most servers and proxies ignore chunk extensions. They exist for potential use cases like per-chunk checksums or metadata, but are rarely used in practice. Some security tools test whether servers handle unexpected extensions safely.
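As a rough illustration of how a parser might separate the size from its extensions, the sketch below splits one chunk header line (without the trailing CRLF). `parse_chunk_header` is a hypothetical helper. It is a simplification: quoted extension values containing a semicolon are not handled, and whitespace next to the size is rejected rather than tolerated.

```python
def parse_chunk_header(line: bytes) -> tuple[int, dict[str, str]]:
    """Split a chunk header line into (chunk size, extensions)."""
    size_field, *ext_fields = line.decode("ascii").split(";")
    if not size_field or any(c not in "0123456789abcdefABCDEF" for c in size_field):
        raise ValueError(f"invalid chunk size: {size_field!r}")
    extensions = {}
    for field in ext_fields:
        # An extension is "name" or "name=value".
        name, _, value = field.strip(" \t").partition("=")
        if not name:
            raise ValueError("empty chunk extension name")
        extensions[name] = value
    return int(size_field, 16), extensions
```

A server that ignores extensions would simply discard the second element of the result.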
Trailers
After the final zero-length chunk, trailer fields may appear — headers sent after the body:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Trailer: Checksum

4\r\n
data\r\n
0\r\n
Checksum: abc123\r\n
\r\n

Trailers are useful for:
- Checksums/signatures — computed as the body streams.
- Processing status — whether the server completed successfully.
- Metadata — anything that can’t be determined until after the body is generated.
The Trailer header in the response declares which trailer fields to expect (though this is advisory, not enforced).
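Since the Trailer header is only advisory, a recipient that cares can compare what was announced with what actually arrived. The sketch below assumes the trailer section has already been cut out of the stream (the bytes between the last chunk and the final empty line); `parse_trailers` and `undeclared_trailers` are hypothetical names.

```python
def parse_trailers(section: bytes) -> dict[str, str]:
    """Parse trailer field lines into a dict keyed by lowercase field name."""
    trailers = {}
    for line in section.split(b"\r\n"):
        if not line:
            continue
        name, sep, value = line.decode("ascii").partition(":")
        # Reject lines without a colon or with whitespace around the name.
        if not sep or not name or name != name.strip():
            raise ValueError(f"malformed trailer line: {line!r}")
        trailers[name.lower()] = value.strip(" \t")
    return trailers


def undeclared_trailers(trailer_header: str, trailers: dict[str, str]) -> set[str]:
    """Return trailer names that the Trailer header did not announce."""
    declared = {n.strip().lower() for n in trailer_header.split(",") if n.strip()}
    return set(trailers) - declared
```

For the example above, `parse_trailers(b"Checksum: abc123\r\n")` gives `{"checksum": "abc123"}` and `undeclared_trailers("Checksum", ...)` returns an empty set.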
Rules
- Chunk sizes MUST be hexadecimal, case-insensitive (a and A are both valid).
- A zero-length chunk MUST be present to terminate the body.
- After the zero-length chunk, the trailer section and final CRLF complete the message.
Content-Length vs Transfer-Encoding
A sender MUST NOT include a Content-Length header in a message that contains Transfer-Encoding.
RFC 9112 §6.3 describes how a recipient should treat such a message:
If a message is received with both a Transfer-Encoding and a Content-Length header field, the Transfer-Encoding overrides the Content-Length. Such a message might indicate an attempt to perform request smuggling or response splitting and ought to be handled as an error.
The Request Smuggling Problem
This ambiguity is the root cause of HTTP request smuggling. Consider a message with both headers:
POST / HTTP/1.1
Host: example.com
Content-Length: 6
Transfer-Encoding: chunked

0\r\n
\r\n
GPOST

- A parser that uses Transfer-Encoding sees a zero-length chunk, so the body ends immediately. The remaining bytes (GPOST) are the start of the next request.
- A parser that uses Content-Length reads 6 bytes (0\r\n\r\nG) as the body. The leftover POST becomes the start of the next request, which now has a different method.
If a front-end proxy uses one interpretation and a back-end server uses another, the attacker controls where one request ends and the next begins. This can:
- Bypass access controls — smuggle a request to an internal endpoint.
- Poison caches — make the cache store an attacker-controlled response for a victim’s URL.
- Hijack connections — capture another user’s request.
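The disagreement can be reproduced with two deliberately naive framers run over the same bytes. Both functions are hypothetical stand-ins for a front-end and a back-end, not real server code; each trusts exactly one of the two headers.

```python
RAW = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 6\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"0\r\n\r\nGPOST"
)


def split_by_content_length(raw: bytes) -> tuple[bytes, bytes]:
    """Frame the body using Content-Length only; return (body, leftover)."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    length = 0
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.lower() == b"content-length":
            length = int(value.strip())  # naive on purpose
    return rest[:length], rest[length:]


def split_by_chunked(raw: bytes) -> tuple[bytes, bytes]:
    """Frame the body using chunked encoding only; return (body, leftover)."""
    _, _, rest = raw.partition(b"\r\n\r\n")
    body, pos = b"", 0
    while True:
        eol = rest.index(b"\r\n", pos)
        size = int(rest[pos:eol], 16)
        pos = eol + 2
        if size == 0:
            pos = rest.index(b"\r\n", pos) + 2  # assumes no trailer fields
            return body, rest[pos:]
        body += rest[pos:pos + size]
        pos += size + 2
```

`split_by_content_length(RAW)` yields the body `b"0\r\n\r\nG"` with `b"POST"` left over, while `split_by_chunked(RAW)` yields an empty body with `b"GPOST"` left over. Whatever is left over is what each component treats as the beginning of the next request.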
How Servers Should Handle It
Strict servers should:
- Reject messages with both Content-Length and Transfer-Encoding with a 400 response.
- If not rejecting, always prioritize Transfer-Encoding and ignore Content-Length.
- Never trust Content-Length when Transfer-Encoding is present.
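The policy above can be sketched as a single decision function for requests. `choose_request_framing` and its string results are hypothetical; the sketch assumes headers arrive as already-parsed (name, value) pairs and that any Transfer-Encoding present has been validated as chunked elsewhere.

```python
def choose_request_framing(headers: list[tuple[str, str]],
                           strict: bool = True) -> str:
    """Decide how a request body is framed, or that it must be rejected."""
    names = {name.lower() for name, _ in headers}
    has_te = "transfer-encoding" in names
    has_cl = "content-length" in names
    if has_te and has_cl:
        if strict:
            return "reject-400"  # both headers present: refuse the request
        return "chunked"         # Transfer-Encoding wins; Content-Length is ignored
    if has_te:
        return "chunked"
    if has_cl:
        return "content-length"
    return "no-body"             # a request with neither header has no body
```

The `strict` flag only selects between the first two bullets; in neither mode is Content-Length consulted once Transfer-Encoding is present.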
This is one of the most critical compliance checks that Http11Probe performs.
Transfer-Encoding Obfuscation
Attackers may try to hide Transfer-Encoding from one parser while making another recognize it:
Transfer-Encoding: chunked
Transfer-Encoding : chunked
Transfer-Encoding: xchunked
Transfer-Encoding: chunked\r\n (extra space)
Transfer-Encoding:
  chunked

Each of these variants exploits differences in how parsers handle:
- Whitespace before the colon (forbidden by RFC 9112 §5.1).
- Unknown transfer coding names.
- Obs-fold (deprecated line folding).
- Leading/trailing whitespace in the value.
Strict, RFC-compliant parsing eliminates these attack surfaces.
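As a sketch of what strict parsing can look like for these cases, the code below validates one header line at a time (without its CRLF). `parse_field_line` and `is_chunked` are hypothetical helpers. Accepting only the exact value chunked is an assumption made for brevity; a real server may also need to handle a list of codings that ends in chunked.

```python
import re

# Field names are tokens: no spaces, colons or control characters.
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def parse_field_line(line: str) -> tuple[str, str]:
    """Strictly parse 'name: value' into (lowercase name, trimmed value)."""
    if line[:1] in (" ", "\t"):
        # A line starting with whitespace is an obs-fold continuation.
        raise ValueError("obs-fold is not accepted")
    name, sep, value = line.partition(":")
    if not sep or not _TOKEN.fullmatch(name):
        # Also catches whitespace between the field name and the colon.
        raise ValueError(f"invalid field name: {name!r}")
    return name.lower(), value.strip(" \t")


def is_chunked(value: str) -> bool:
    """Accept only the exact coding name, compared case-insensitively."""
    return value.lower() == "chunked"
```

With this, `Transfer-Encoding : chunked` and a folded continuation line are rejected outright, `xchunked` is not recognized as chunked, and surrounding whitespace in the value is trimmed in one well-defined place.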