HTTP Header value allowed characters

Time:10-14

I have checked the RFC for the characters allowed in an HTTP header value, but I am unable to understand the whole thing. Can anyone please explain it in more detail? RFC reference: https://datatracker.ietf.org/doc/html/rfc7230#section-3.2

 field-value    = *( field-content / obs-fold )
 field-content  = field-vchar [ 1*( SP / HTAB ) field-vchar ]
 field-vchar    = VCHAR / obs-text

 obs-fold       = CRLF 1*( SP / HTAB )
                ; obsolete line folding
                ; see Section 3.2.4

CodePudding user response:

It takes a bit of getting used to, reading the Augmented BNF (ABNF) syntax used in these RFCs, but let’s step through it:

field-value = *( field-content / obs-fold )

This says a value consists of zero or more repetitions of either field-content or obs-fold (a continuation line - we’ll get to that).

field-content is defined as:

field-content = field-vchar [ 1*( SP / HTAB ) field-vchar ]

This is a field-vchar, optionally followed by one or more spaces or tabs, and then a field-vchar.

I’ll spoil the next bit and tell you field-vchar is basically any visible text character (i.e. not including spaces, tabs, newlines or control characters). So what the above tells you is that a value is made up of visible characters BUT can also include spaces and tabs, just not at the beginning or end.

field-vchar is defined as:

field-vchar = VCHAR / obs-text

That is a VCHAR (described further up the RFC as “any visible [USASCII] character”, i.e. %x21-7E) or obs-text (octets in the %x80-FF range - basically extra printable characters outside of the basic ASCII characters).
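To make the two ranges concrete, here is a minimal Python sketch of that rule for a single octet. The function name is_field_vchar is made up for illustration and is not part of any library:

```python
def is_field_vchar(octet: int) -> bool:
    """Return True if one octet matches field-vchar = VCHAR / obs-text."""
    is_vchar = 0x21 <= octet <= 0x7E      # visible US-ASCII, "!" through "~"
    is_obs_text = 0x80 <= octet <= 0xFF   # any octet above the ASCII range
    return is_vchar or is_obs_text
```

Note that space (0x20), tab (0x09) and DEL (0x7F) all fall outside both ranges, which is why whitespace needs its own place in the field-content rule.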

Finally, we go all the way back up to the first rule for the last undefined item, obs-fold:

obs-fold       = CRLF 1*( SP / HTAB )
                ; obsolete line folding
                ; see Section 3.2.4

Basically a newline followed by one or more spaces or tabs. This was a historical way of splitting a header over multiple lines for readability (though the newline and whitespace were effectively ignored and treated as a space, so the value did not really contain a newline). Support for this was always a bit flaky and many HTTP processors assumed headers would not have a newline in them (even if technically it was allowed). It is strongly advised not to use this, especially as it has now been formally deprecated. (Note that some servers - for example Apache - allow newlines in their config for readability but do not send these, so that’s a separate matter.)
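If you do have to read old folded headers, a receiver that accepts obs-fold is expected to treat each fold as whitespace. Below is a rough Python sketch of that unfolding step, assuming you already have the raw value as bytes. The names OBS_FOLD and unfold are illustrative only, and replacing each fold with exactly one space is one choice among several:

```python
import re

# obs-fold = CRLF 1*( SP / HTAB )
OBS_FOLD = re.compile(rb"\r\n[ \t]+")

def unfold(raw_value: bytes) -> bytes:
    """Replace every obs-fold in a raw header value with a single space."""
    return OBS_FOLD.sub(b" ", raw_value)
```

A bare CRLF that is not followed by a space or tab is not a fold, so this sketch leaves it untouched. That case is normally the end of the header line and should be handled by the header parser itself.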

Bringing that all together, a field value can basically contain any visible characters, plus tabs or spaces (but it cannot start or end with these). It cannot contain newlines or any other non-visible characters (i.e. ASCII characters 0-31 - except for tab as discussed - or 127).
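As a rough illustration of that summary, here is a small Python check. It follows the plain-English summary above rather than the literal ABNF, it deliberately rejects the deprecated obs-fold, and the names FIELD_VALUE and is_valid_field_value are invented for this sketch:

```python
import re

# A visible octet (VCHAR or obs-text), optionally followed by any mix of
# spaces, tabs and visible octets that ends in another visible octet.
# The outer group is optional, so an empty value is accepted as well.
FIELD_VALUE = re.compile(
    rb"(?:[\x21-\x7E\x80-\xFF](?:[ \t\x21-\x7E\x80-\xFF]*[\x21-\x7E\x80-\xFF])?)?"
)

def is_valid_field_value(value: bytes) -> bool:
    """Check a header value (bytes, surrounding whitespace already stripped)."""
    return FIELD_VALUE.fullmatch(value) is not None
```

For example, is_valid_field_value(b"text/html; charset=utf-8") is True, while values with a leading space, an embedded CRLF or a NUL byte are rejected. The check works on bytes rather than str because obs-text is defined in terms of octets, not decoded characters.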
