I have this HTTP request that came to my C server, including the request line and headers:
"GET / HTTP/1.1\r\n" /
"Host: localhost:5000\r\n" /
"User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:93.0) Gecko/20100101 Firefox/93.0\r\n" /
"Accept: text/html,application/xhtml xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8\r\n" /
"Accept-Language: en-US,en;q=0.5\r\n" /
"Accept-Encoding: gzip, deflate\r\n" /
"Connection: keep-alive\r\n" /
"Upgrade-Insecure-Requests: 1\r\n" /
"Sec-Fetch-Dest: document\r\n" /
"Sec-Fetch-Mode: navigate\r\n" /
"Sec-Fetch-Site: none\r\n" /
"Sec-Fetch-User: ?1\r\n" /
I'd like to know the maximum size of each line (up to the \n) that is valid
according to the HTTP protocol, so that I can parse it more easily.
"GET / HTTP/1.1\r\n" /
"Host: localhost:5000\r\n" /
"User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:93.0) Gecko/20100101 Firefox/93.0\r\n" /
"Sec-Fetch-User: ?1"
CodePudding user response:
The short answer is that there is no such restriction. The web server may have a limit on the total size of the HTTP request headers it will accept.
- Apache: 8K
- IIS: 8K - 16K
- nginx: 4K - 8K
- Tomcat: 8K - 16K
You could test the limit of your server by sending requests with junk headers and checking the response status code. A 400, 413 or 431 response would be an indication that you reached the limit.
Further investigation:
I checked out the response codes from big name domains and there is no consistency in the response code received.
For example:
- www.apple.com, Header size: 9K, Response Code: 400
- www.google.com, Header size: 15K, Response Code: 413
- www.ibm.com, Header size: 9K, Response Code: 404
- www.netflix.com, Header size: 14K, Response Code: 500
Here's the Python script I wrote for the task:
#!/usr/bin/env python3
import os
import random
import requests
import string
import sys

def header_size_test(url, limit=64):
    print(f'Checking "{url}".')
    # Generate a large enough string of ASCII characters
    size_limit = limit * 1000
    junk_string = ''.join(random.choice(string.ascii_letters + string.digits) for _ in range(size_limit))
    for size in range(4, limit + 1):
        N = size * 1000
        # Set the junk header of the required size
        headers = {'junk': junk_string[:N]}
        # Send the request
        response = requests.get(url, headers=headers)
        # Display the response
        print(f'Header size: {size}K, Response Code: {response.status_code}')
        # No point in continuing if the server response is not OK (200)
        if response.status_code != 200:
            break

if __name__ == '__main__':
    if len(sys.argv) > 1:
        header_size_test(sys.argv[1])
    else:
        print(f'Please provide a url string.\n\tExample: {os.path.basename(__file__)} "http://www.google.com"')
Implementing in your server
The headers of an HTTP request end with \r\n\r\n,
so you can search your buffer for this substring. My knowledge of C is a bit rusty, so please forgive any syntax errors.
#define MAX_HEADER_LENGTH 4000

char *marker = strstr(buffer, "\r\n\r\n");
// end of headers found?
if (marker != NULL) {
    // length of the headers within buffer
    size_t headers_length = marker - buffer;
    if (headers_length > MAX_HEADER_LENGTH) {
        // send a nice response
    }
}
If you need to parse the content of the headers, refer to RFC 7230 (the successor to RFC 2616, itself since superseded by RFC 9112).
CodePudding user response:
There seems to be no protocol-level limit; the existing limits are implementation-specific. That means you parse as much as you can, and when you can't anymore, you respond with 431 Request Header Fields Too Large.
Quote:
431 can be used when the total size of request headers is too large, or when a single header field is too large