Get the header of an HTTP payload

I have a piece of code that is supposed to decode pcap files and write the images they contain to a directory. I capture packets with Wireshark while browsing HTTP websites to get images, and the script should place those images in a directory.


def get_header(payload):
    try:
        header_raw = payload[:payload.index(b'\r\n\r\n') + 2]
    except ValueError:
        sys.stdout.write('-')
        sys.stdout.flush()
        return None

    header = dict(re.findall(r'(?P<name>.*?): (?P<value>.*?)\r\n', header_raw.decode())) 
    # This line of code is supposed to split out the headers

    if 'Content-Type' not in header:
        return None
    return header

When I try to run it, I get this:

Traceback (most recent call last):
  File "/home/kali/Documents/Programs/Python/recapper.py", line 79, in <module>
    recapper.get_responses()
  File "/home/kali/Documents/Programs/Python/recapper.py", line 62, in get_responses
    header = get_header(payload)
  File "/home/kali/Documents/Programs/Python/recapper.py", line 24, in get_header
    header = dict(re.findall(r"(?P<name>.*?): (?P<value>.*?)\r\n", header_raw.decode()))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc6 in position 1: invalid continuation byte

I tried different things, but can't get it right. Can anyone more experienced than me tell me what the problem is, or how I can split out the headers if I'm doing it wrong?

CodePudding user response：

Update: I found that the encoding I had to use wasn't utf-8, but ISO-8859-1.

Like this: header = dict(re.findall(r'(?P<name>.*?): (?P<value>.*?)\r\n', header_raw.decode('ISO-8859-1'))), and it works!
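
For reference, here is a minimal sketch of the whole function with that change applied. ISO-8859-1 maps every byte value from 0x00 to 0xFF to exactly one character, so decode() can no longer raise UnicodeDecodeError, whatever bytes the capture contains. The sample payload and the X-Note header are made up for illustration, and the progress output from the original function is left out to keep the example short.

```python
import re

# Same pattern as in the question, compiled once.
HEADER_RE = re.compile(r'(?P<name>.*?): (?P<value>.*?)\r\n')


def get_header(payload):
    """Return the HTTP headers of a raw payload as a dict, or None."""
    try:
        # Keep the first CRLF of the blank line so the last header
        # still ends in \r\n and is matched by the regex.
        header_raw = payload[:payload.index(b'\r\n\r\n') + 2]
    except ValueError:
        # No blank line found, so this is not a complete header block.
        return None

    # ISO-8859-1 assigns a character to every possible byte,
    # so this decode() cannot fail on non-UTF-8 data.
    header = dict(HEADER_RE.findall(header_raw.decode('ISO-8859-1')))

    if 'Content-Type' not in header:
        return None
    return header


# Hypothetical payload containing the byte 0xc6 from the traceback.
sample = (b'HTTP/1.1 200 OK\r\n'
          b'Content-Type: image/png\r\n'
          b'X-Note: caf\xc6\r\n'
          b'\r\n'
          b'\x89PNG')
print(get_header(sample))
```

Note that the error in the question was reported at position 1, which suggests that particular payload may not have started with text headers at all. With ISO-8859-1 such a payload decodes without error, so the 'Content-Type' check is what filters it out.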