TypeError: a bytes-like object is required, not 'str' when splitting lines from .gz file

Time:12-22

I have a gzip file, sample.gz, with the following contents:

This is first line of sample gz file.
This is second line of sample gz file.

I read this .gz file line by line. Once I have an individual line, I split it further into parts, using whitespace as the separator.

import gzip
logfile = "sample.gz"
with gzip.open(logfile) as page:
    for line in page:
        string = line.split(" ")
        print(*string, sep = ',')

I expect output like this:

This,is,first,line,of,sample,gz,file.
This,is,second,line,of,sample,gz,file.

But instead of the above result, I get a TypeError:

TypeError: a bytes-like object is required, not 'str'

Why is the split function not working as it is supposed to?

CodePudding user response:

As the comments above point out, there are a couple of approaches that can be used. I followed "Python read csv line from gzipped file", suggested by mkrieger1, and came up with the solution below.

import gzip
logfile = "sample.gz"
with gzip.open(logfile) as page:
    for line in page:
        string = line.decode('utf-8').split(' ')
        print(*string, sep = ',')

Thanks for the quick responses.

CodePudding user response:

By default, gzip.open opens files in binary mode. This means that reading returns bytes objects, and bytes objects can only be split on other bytes objects, not on strings.
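
To see the rule in isolation, here is a minimal sketch that does not touch gzip at all. The literal `line` is just a stand-in for what binary-mode iteration would yield, and `parts` and `error_message` are names made up for this illustration.

```python
# A bytes object like the ones binary-mode iteration returns.
line = b"This is first line of sample gz file.\n"

# Splitting bytes on a bytes separator works and yields a list of bytes.
parts = line.split(b" ")

# Splitting bytes on a str separator raises the TypeError from the question.
error_message = None
try:
    line.split(" ")
except TypeError as exc:
    error_message = str(exc)

print(parts[0])
print(error_message)
```

So another possible fix is to keep binary mode and pass `b" "` as the separator, at the cost of working with bytes everywhere downstream.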

If you want strings, use the mode and encoding arguments to gzip.open:

with gzip.open(logfile, 'rt', encoding='utf-8') as page:
    ...
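
Putting it together, a self-contained sketch might look like the following. It first writes a throwaway sample.gz into a temporary directory so that it can run on its own; that setup step, and the `rows` list used to collect the results, are assumptions added for illustration and are not part of the original question.

```python
import gzip
import os
import tempfile

# Build a throwaway sample file so the sketch runs on its own.
tmpdir = tempfile.mkdtemp()
logfile = os.path.join(tmpdir, "sample.gz")
with gzip.open(logfile, "wt", encoding="utf-8") as out:
    out.write("This is first line of sample gz file.\n")
    out.write("This is second line of sample gz file.\n")

rows = []
# 'rt' opens the file in text mode, so each line is a str, not bytes.
with gzip.open(logfile, "rt", encoding="utf-8") as page:
    for line in page:
        # split() with no argument splits on any run of whitespace
        # and discards the trailing newline.
        fields = line.split()
        rows.append(",".join(fields))
        print(*fields, sep=",")
```

Note that this uses `split()` rather than `split(' ')`. With an explicit `' '` separator, the newline stays attached to the last field, so each printed line would be followed by an extra blank line.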