I have checked the RFC for the characters allowed in an HTTP header value, but I am unable to understand the grammar. Can anyone please explain it in more detail? RFC reference: https://datatracker.ietf.org/doc/html/rfc7230#section-3.2
field-value = *( field-content / obs-fold )
field-content = field-vchar [ 1*( SP / HTAB ) field-vchar ]
field-vchar = VCHAR / obs-text
obs-fold = CRLF 1*( SP / HTAB )
; obsolete line folding
; see Section 3.2.4
CodePudding user response:
It takes a bit of getting used to, reading the Augmented BNF (ABNF) syntax in these RFCs, but let’s step through it:
field-value = *( field-content / obs-fold )
This says a value can contain any number of field-content elements or obs-fold elements (an obs-fold is a continuation line - we’ll get to that).
field-content is defined as:
field-content = field-vchar [ 1*( SP / HTAB ) field-vchar ]
This is a field-vchar, optionally followed by one or more spaces or tabs and then another field-vchar.
I’ll spoil the next bit and tell you field-vchar is basically any visible text character (i.e. not including spaces, tabs, new lines or control characters). So what the above tells you is that a value is basically made of visible characters and can also include spaces and tabs, but not at the beginning or end.
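As a rough illustration of that rule, here is a minimal Python sketch. It follows the intent described above (visible characters, with spaces and tabs allowed only in the middle) rather than transcribing the ABNF literally. The names `VCHARS`, `FIELD_VALUE_RE` and `looks_like_field_value` are made up for this example, and it assumes any obs-fold has already been removed.

```python
import re

# Octet ranges for "visible" characters: ASCII 0x21-0x7E plus 0x80-0xFF.
# (Hypothetical helper names; not taken from any library.)
VCHARS = rb"\x21-\x7e\x80-\xff"

# Empty value, or: a visible octet, optionally followed by any mix of
# visible octets, spaces and tabs, as long as the last octet is visible.
FIELD_VALUE_RE = re.compile(
    rb"(?:[" + VCHARS + rb"](?:[ \t" + VCHARS + rb"]*[" + VCHARS + rb"])?)?"
)

def looks_like_field_value(value: bytes) -> bool:
    """Return True if the whole value fits the pattern above."""
    return FIELD_VALUE_RE.fullmatch(value) is not None

print(looks_like_field_value(b"text/html; charset=utf-8"))  # True
print(looks_like_field_value(b" leading space"))            # False
```

The check works on bytes rather than str because the grammar is defined in terms of octets.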
field-vchar is defined as:
field-vchar = VCHAR / obs-text
That is a VCHAR (defined further up the RFC as “any visible [USASCII] character”) or obs-text (characters in the %x80-FF range - basically extra printable characters outside of the basic ASCII characters).
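To make the two ranges concrete, a small Python sketch follows. It assumes VCHAR means the octets %x21-7E (the core ABNF definition in RFC 5234); `is_field_vchar` is a name invented for this example.

```python
def is_field_vchar(octet: int) -> bool:
    """True for VCHAR (0x21-0x7E) or obs-text (0x80-0xFF)."""
    is_vchar = 0x21 <= octet <= 0x7E      # visible US-ASCII
    is_obs_text = 0x80 <= octet <= 0xFF   # octets above the ASCII range
    return is_vchar or is_obs_text

print(is_field_vchar(ord("A")))  # True
print(is_field_vchar(0x20))      # False: space is not a field-vchar
print(is_field_vchar(0x7F))      # False: DEL is a control character
```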
Then we go all the way back up to the first statement to define the last item, obs-fold:
obs-fold = CRLF 1*( SP / HTAB ) ; obsolete line folding ; see Section 3.2.4
Basically a newline followed by one or more spaces or tabs. This was a historical way of splitting a header over multiple lines for readability (though the newline and whitespace were effectively ignored and treated as a space, so the value did not really contain a newline). Support for this was always a bit flaky, and many HTTP processors assumed headers would not have a newline in them (even if technically it was allowed). It is strongly advised not to use this, especially as it’s now been formally deprecated. (Note that some servers - for example Apache - allow new lines in their config for readability but do not send these, so that’s a separate matter.)
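If you do have to read folded headers from an old sender, the treatment described above (fold becomes a space) can be sketched in Python like this. `OBS_FOLD_RE` and `unfold` are hypothetical names for this example, and the sketch assumes the raw value is available as bytes with CRLF line endings intact.

```python
import re

# obs-fold = CRLF 1*( SP / HTAB )
OBS_FOLD_RE = re.compile(rb"\r\n[ \t]+")

def unfold(raw_value: bytes) -> bytes:
    """Replace each obs-fold with a single space."""
    return OBS_FOLD_RE.sub(b" ", raw_value)

print(unfold(b"first part,\r\n    second part"))  # b'first part, second part'
```

A CRLF that is not followed by a space or tab is not a fold and is left alone here; in a real parser that would mark the end of the header line.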
Bringing that all together, a field value can basically contain any visible characters, plus tabs or spaces (but it cannot start or end with these). It cannot contain newlines or any other non-visible characters (i.e. ASCII characters 0-31 - except for tab, as discussed - or 127).
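As a rough way to apply that summary, here is a Python sketch that reports why a value would be rejected. `check_field_value` and its messages are invented for this example, and it assumes an unfolded value passed as bytes.

```python
from typing import List

def check_field_value(value: bytes) -> List[str]:
    """Return a list of problems; an empty list means the value looks fine."""
    problems = []
    # No space or tab at the very start or very end.
    if value[:1] in (b" ", b"\t"):
        problems.append("value starts with whitespace")
    if value[-1:] in (b" ", b"\t"):
        problems.append("value ends with whitespace")
    # No control characters: 0-31 (except tab, 9) and 127.
    for index, octet in enumerate(value):
        if (octet < 0x20 and octet != 0x09) or octet == 0x7F:
            problems.append(f"octet 0x{octet:02X} at index {index} is not allowed")
    return problems

print(check_field_value(b"text/html; charset=utf-8"))  # []
print(check_field_value(b"bad\x00value"))
```

Returning messages instead of a plain True/False makes it easier to see which rule a given header broke.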