TRANSFER_ENCODING
| Test ID | SMUG-TRANSFER_ENCODING |
| Category | Smuggling |
| RFC | RFC 9112 Section 6.1 |
| Requirement | Unscored |
| Expected | 400 or 2xx |
What it sends
Transfer_Encoding: chunked (underscore instead of hyphen) with Content-Length: 5.
POST / HTTP/1.1\r\n
Host: localhost:8080\r\n
Transfer_Encoding: chunked\r\n
Content-Length: 5\r\n
\r\n
hello
Note Transfer_Encoding with an underscore instead of a hyphen.
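The request above can be reproduced byte-for-byte with a small helper. This is a minimal sketch, not the test suite's own code: the function name `build_probe` and its parameters are hypothetical, and the host value is simply the one shown in the example.

```python
def build_probe(host: str = "localhost:8080", body: bytes = b"hello") -> bytes:
    """Assemble the raw request bytes for the underscore probe (hypothetical helper)."""
    lines = [
        "POST / HTTP/1.1",
        f"Host: {host}",
        "Transfer_Encoding: chunked",       # underscore, not hyphen
        f"Content-Length: {len(body)}",     # 5 for the default body
        "",                                 # the two empty entries produce the
        "",                                 # CRLF CRLF that ends the header section
    ]
    return "\r\n".join(lines).encode("ascii") + body


if __name__ == "__main__":
    print(build_probe())
```

The bytes could then be written to a plain TCP socket; an HTTP client library is deliberately avoided here because it might rewrite or reject the header.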
What the RFC says
“field-name = token” – RFC 9110 Section 5.1
“token = 1*tchar” where “tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA” – RFC 9110 Section 5.6.2
The underscore character (_) is explicitly included in the tchar production, making Transfer_Encoding a syntactically valid field name (token). However, it is not the registered Transfer-Encoding header field and has no defined semantics. A server receiving this header should treat it as an unknown custom header, not as Transfer-Encoding.
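The distinction between "valid token" and "the Transfer-Encoding field" can be checked mechanically. The sketch below assumes nothing beyond the tchar production quoted above; the helper names `is_token` and `is_transfer_encoding` are illustrative, not part of any library.

```python
import string

# tchar per RFC 9110 Section 5.6.2
TCHAR = set("!#$%&'*+-.^_`|~" + string.digits + string.ascii_letters)


def is_token(name: str) -> bool:
    """True if name matches token = 1*tchar."""
    return len(name) > 0 and all(ch in TCHAR for ch in name)


def is_transfer_encoding(name: str) -> bool:
    """Field names are case-insensitive, but '_' and '-' are different characters."""
    return name.lower() == "transfer-encoding"


if __name__ == "__main__":
    for candidate in ("Transfer-Encoding", "Transfer_Encoding", "Transfer Encoding"):
        print(candidate, is_token(candidate), is_transfer_encoding(candidate))
```

Transfer_Encoding passes the first check and fails the second, which is exactly the situation this test probes.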
Why this test is unscored
Transfer_Encoding is a valid token but not the Transfer-Encoding header. The server is correct to ignore it as an unknown header (resulting in 2xx using Content-Length for framing) or to reject it with 400 (strict policy). The test is unscored because neither response is wrong per the RFC.
Pass: Server rejects with 400 (strict, safe).
Warn: Server accepts and responds 2xx (treats it as unknown header, uses CL).
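The Pass/Warn mapping can be written as a small classifier. This is a sketch only: the function name `classify` is hypothetical, and the "unexpected" label for any other status is an assumption, since the page only defines outcomes for 400 and 2xx.

```python
def classify(status: int) -> str:
    """Map a response status code to the outcome described above (hypothetical helper)."""
    if status == 400:
        return "pass"        # strict rejection
    if 200 <= status <= 299:
        return "warn"        # unknown header ignored, Content-Length used
    return "unexpected"      # assumption: anything else is outside the documented outcomes


if __name__ == "__main__":
    for code in (400, 200, 204, 500):
        print(code, classify(code))
```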
Why it matters
Some server implementations normalize underscores to hyphens (notably certain Python/Ruby servers such as Gunicorn and WEBrick), which makes this a known smuggling vector. If a front-end passes Transfer_Encoding through as-is but a back-end normalizes the underscore to a hyphen and processes it as Transfer-Encoding: chunked, the two parsers disagree on message framing.
Deep Analysis
ABNF
field-line = field-name ":" OWS field-value OWS ; RFC 9112 §5
field-name = token ; RFC 9110 §5.1
token = 1*tchar ; RFC 9110 §5.6.2
tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*"
/ "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
/ DIGIT / ALPHA
The underscore (_) is explicitly listed as a valid tchar character. Therefore, Transfer_Encoding is a syntactically valid token and a valid field-name. However, HTTP header field names are matched by their registered names, and the registered name is Transfer-Encoding (with a hyphen). Transfer_Encoding is a completely different, unregistered header field name.
RFC Evidence
“Each field line consists of a case-insensitive field name followed by a colon (":"), optional leading whitespace, the field line value, and optional trailing whitespace.” – RFC 9112 §5
“A server MAY reject a request that contains both Content-Length and Transfer-Encoding or process such a request in accordance with the Transfer-Encoding alone. Regardless, the server MUST close the connection after responding to such a request to avoid the potential attacks.” – RFC 9112 §6.1
“If a message is received with both a Transfer-Encoding and a Content-Length header field, the Transfer-Encoding overrides the Content-Length. Such a message might indicate an attempt to perform request smuggling or response splitting and ought to be handled as an error.” – RFC 9112 §6.3
Chain of Reasoning
- The test sends Transfer_Encoding: chunked (underscore) alongside Content-Length: 5.
- The header field name Transfer_Encoding is syntactically valid – the underscore is a permitted tchar. However, it is not the registered Transfer-Encoding header.
- A compliant server should treat Transfer_Encoding as an unknown/custom header with no defined semantics. Since the actual Transfer-Encoding header is absent, the server should use Content-Length: 5 for framing and respond normally with 2xx.
- The danger arises from server or proxy implementations that normalize header field names. Some frameworks (notably Python’s WSGI/CGI interface, Ruby’s WEBrick, and Gunicorn) convert header names to uppercase with underscores replacing hyphens (HTTP_TRANSFER_ENCODING). If a back-end framework reverses this normalization and converts underscores back to hyphens, Transfer_Encoding becomes Transfer-Encoding.
- If the back-end sees Transfer-Encoding: chunked (after normalization) while the front-end saw Transfer_Encoding (an unknown header) and used Content-Length, the two parsers disagree on message framing.
- Since Transfer_Encoding is not Transfer-Encoding, the CL/TE dual-header rules in RFC 9112 section 6.3 do not technically apply to a compliant server. The server simply has Content-Length: 5 and an unknown header.
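The normalization step in this chain is lossy, which a few lines of Python make concrete. The sketch assumes the CGI-style mapping described above (uppercase, hyphens to underscores, HTTP_ prefix); `cgi_env_key` and `naive_reverse` are hypothetical helpers written for illustration, not functions from any WSGI server.

```python
def cgi_env_key(field_name: str) -> str:
    """CGI/WSGI-style mapping: uppercase, '-' becomes '_', 'HTTP_' prefix."""
    return "HTTP_" + field_name.upper().replace("-", "_")


def naive_reverse(env_key: str) -> str:
    """A naive reverse mapping that turns every '_' back into '-' (lossy by design)."""
    return env_key[len("HTTP_"):].replace("_", "-").title()


if __name__ == "__main__":
    for name in ("Transfer-Encoding", "Transfer_Encoding"):
        key = cgi_env_key(name)
        print(f"{name!r} -> {key!r} -> {naive_reverse(key)!r}")
```

Both spellings land on the same environment key, so any component that only sees the key can no longer tell which header the client actually sent.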
Scored / Unscored Justification
This test is unscored because Transfer_Encoding is a syntactically valid but unregistered header name. The RFC does not mandate any specific behavior for unknown headers beyond ignoring them. A server that treats it as unknown and uses Content-Length is correct. A server that rejects with 400 is being defensively strict. Neither behavior violates the specification.
- Pass (400): Strict rejection – the server flags the suspicious header name.
- Warn (2xx): Correct behavior – the server treated Transfer_Encoding as an unknown header and used Content-Length.
Smuggling Attack Scenarios
- Underscore-to-Hyphen Normalization Desync: A front-end proxy passes Transfer_Encoding: chunked through as-is (an unknown header). The back-end, running a framework that normalizes underscores to hyphens (e.g., Gunicorn behind Nginx), sees Transfer-Encoding: chunked and uses chunked framing. The front-end used Content-Length; the back-end uses chunked encoding. The attacker injects a second request in the body that only the back-end parses.
- CGI/WSGI Environment Variable Poisoning: In CGI and WSGI environments, headers are converted to environment variables like HTTP_TRANSFER_ENCODING. Some reverse mappings do not distinguish between Transfer-Encoding (original) and Transfer_Encoding (underscore variant) because both map to the same environment variable. An attacker can use the underscore variant to inject a Transfer-Encoding header that the front-end never intended to forward.
- Double Header Injection: An attacker sends both Transfer-Encoding: chunked and Transfer_Encoding: identity. A front-end proxy processes the legitimate Transfer-Encoding: chunked. A back-end that normalizes underscores ends up with two Transfer-Encoding headers with conflicting values, creating an additional layer of ambiguity in transfer coding selection.
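The framing disagreement behind the first scenario can be simulated with a toy framing selector. This is a simplified sketch under stated assumptions: `choose_framing` is a hypothetical function, it only looks at which header names are present (Transfer-Encoding taking precedence over Content-Length, as in RFC 9112 §6.3), and the `normalize_underscores` flag stands in for a back-end that rewrites underscores to hyphens.

```python
from typing import Dict, List, Tuple


def choose_framing(headers: List[Tuple[str, str]], normalize_underscores: bool = False) -> str:
    """Return which header a simplified parser would use to frame the body."""
    fields: Dict[str, List[str]] = {}
    for name, value in headers:
        key = name.lower()                      # field names are case-insensitive
        if normalize_underscores:
            key = key.replace("_", "-")         # the risky normalization
        fields.setdefault(key, []).append(value.strip())
    if "transfer-encoding" in fields:
        return "transfer-encoding"              # overrides Content-Length
    if "content-length" in fields:
        return "content-length"
    return "none"


if __name__ == "__main__":
    request_headers = [
        ("Host", "localhost:8080"),
        ("Transfer_Encoding", "chunked"),
        ("Content-Length", "5"),
    ]
    print("front-end:", choose_framing(request_headers))
    print("back-end: ", choose_framing(request_headers, normalize_underscores=True))
```

With the same header list, the strict parser frames by Content-Length while the normalizing parser frames by Transfer-Encoding, which is the desync an attacker relies on.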