I have this HTTP request that came to my C server, including the request line and headers:
"GET / HTTP/1.1\r\n" /
"Host: localhost:5000\r\n" /
"User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:93.0) Gecko/20100101 Firefox/93.0\r\n" /
"Accept: text/html,application/xhtml xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8\r\n" /
"Accept-Language: en-US,en;q=0.5\r\n" /
"Accept-Encoding: gzip, deflate\r\n" /
"Connection: keep-alive\r\n" /
"Upgrade-Insecure-Requests: 1\r\n" /
"Sec-Fetch-Dest: document\r\n" /
"Sec-Fetch-Mode: navigate\r\n" /
"Sec-Fetch-Site: none\r\n" /
"Sec-Fetch-User: ?1\r\n" /
I'd like to know the maximum size of each line (up to the \n) that is valid
according to the HTTP protocol, so that I can parse it more easily.
"GET / HTTP/1.1\r\n" /
"Host: localhost:5000\r\n" /
"User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:93.0) Gecko/20100101 Firefox/93.0\r\n" /
"Sec-Fetch-User: ?1"
CodePudding user response:
The short answer is that there is no such restriction. The web server may have a limit on the total size of the HTTP request headers it will accept.
- Apache: 8K
- IIS: 8K - 16K
- nginx: 4K - 8K
- Tomcat: 8K - 16K
You could test the limit of your server by sending requests with junk headers and checking the response status code. A 400, 413 or 431 response would be an indication that you reached the limit.
Further investigation:
I checked out the response codes from big name domains and there is no consistency in the response code received.
For example:
- www.apple.com, Header size: 9K, Response Code: 400
- www.google.com, Header size: 15K, Response Code: 413
- www.ibm.com, Header size: 9K, Response Code: 404
- www.netflix.com, Header size: 14K, Response Code: 500
Here's the Python script I wrote for the task:
#!/usr/bin/env python3
import os
import random
import requests
import string
import sys

def header_size_test(url, limit=64):
    print(f'Checking "{url}".')
    # Generate a large enough string of ASCII characters
    size_limit = limit * 1000
    junk_string = ''.join(random.choice(string.ascii_letters + string.digits) for _ in range(size_limit))
    for size in range(4, limit + 1):
        N = size * 1000
        # Set the junk header of the required size
        headers = {'junk': junk_string[:N]}
        # Send the request
        response = requests.get(url, headers=headers)
        # Display the response
        print(f'Header size: {size}K, Response Code: {response.status_code}')
        # No point in continuing if the server response is not OK (200)
        if response.status_code != 200:
            break

if __name__ == '__main__':
    if len(sys.argv) > 1:
        header_size_test(sys.argv[1])
    else:
        print(f'Please provide a url string.\n\tExample: {os.path.basename(__file__)} "http://www.google.com"')
Implementing in your server
The headers of an HTTP request end with \r\n\r\n,
so you can search your buffer for this substring. My knowledge of C is a bit rusty, so please forgive any syntax errors.
#define MAX_HEADER_LENGTH 4000

char *marker = strstr(buffer, "\r\n\r\n");
// end of headers found?
if (marker != NULL) {
    // length of the headers within buffer
    size_t headers_length = marker - buffer;
    if (headers_length > MAX_HEADER_LENGTH) {
        // send a nice response
    }
}
If you need to parse the content of the headers, refer to RFC 7230 (the successor to RFC 2616, itself since superseded by RFC 9112).
CodePudding user response:
There seems to be no protocol-level limit; the existing limits are implementation-specific. That means you parse as much as you can, and when you can't anymore, you respond with 431 Request Header Fields Too Large.
Quote:
431 can be used when the total size of request headers is too large, or when a single header field is too large