I have a gzip file, sample.gz, with the following contents:
This is first line of sample gz file.
This is second line of sample gz file.
I read this .gz file line by line, then split each line into parts using whitespace as the separator.
import gzip
logfile = "sample.gz"
with gzip.open(logfile) as page:
    for line in page:
        string = line.split(" ")
        print(*string, sep=',')
I am expecting output like
This,is,first,line,of,sample,gz,file.
This,is,second,line,of,sample,gz,file.
But instead of the above result, I get a TypeError:
TypeError: a bytes-like object is required, not 'str'
Why is the split function not working as it is supposed to?
CodePudding user response:
As the comments above show, there are a couple of approaches that can be used. I followed "Python read csv line from gzipped file", suggested by mkrieger1, and came up with the solution below, which decodes each line from bytes to str before splitting it.
import gzip
logfile = "sample.gz"
with gzip.open(logfile) as page:
    for line in page:
        # Each line is a bytes object, so decode it to str before splitting.
        string = line.decode('utf-8').split(' ')
        print(*string, sep=',')
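For comparison, here is a minimal sketch of what the decode step changes. A file opened with gzip.open in its default mode yields bytes, which can only be split on a bytes separator, while the decoded str can be split on a str separator. The in-memory payload stands in for sample.gz and is an assumption made for illustration, as are the variable names.

```python
import gzip
import io

# Hypothetical stand-in for sample.gz, built in memory for illustration.
payload = gzip.compress(b"This is first line of sample gz file.\n")

with gzip.open(io.BytesIO(payload)) as page:  # default mode is 'rb'
    raw_line = page.readline()

# Binary mode yields bytes, so a str separator is rejected.
try:
    raw_line.split(" ")
    split_failed = False
except TypeError:
    split_failed = True

# Two ways around it: decode first, or split on a bytes separator.
decoded_parts = raw_line.decode("utf-8").split()
bytes_parts = raw_line.split(b" ")
```

Decoding is usually the more convenient option when the parts are going to be printed, since print shows bytes objects with a b'...' prefix.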
Thanks for the quick responses.
CodePudding user response:
By default, gzip.open opens files in binary mode. This means that reading returns bytes objects, and bytes objects can only be split on other bytes objects, not on strings.
If you want strings, use the mode and encoding arguments to gzip.open:
with gzip.open(logfile, 'rt', encoding='utf-8') as page:
    ...
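Putting it together, here is a rough end-to-end sketch of the text-mode approach. The helper name split_gz_lines and the temporary sample file are assumptions made so the example runs on its own; they are not part of the question. It also uses split() with no argument, which splits on any whitespace and drops the trailing newline that split(' ') would leave attached to the last word.

```python
import gzip
import os
import tempfile


def split_gz_lines(path, encoding="utf-8"):
    """Return each line of a gzip file as a list of whitespace-separated words."""
    rows = []
    # 'rt' opens the file in text mode, so each line is a str, not bytes.
    with gzip.open(path, "rt", encoding=encoding) as page:
        for line in page:
            # split() with no argument splits on any run of whitespace
            # and discards the trailing newline.
            rows.append(line.split())
    return rows


# Build a throwaway sample file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    logfile = os.path.join(tmp, "sample.gz")
    with gzip.open(logfile, "wt", encoding="utf-8") as out:
        out.write("This is first line of sample gz file.\n")
        out.write("This is second line of sample gz file.\n")

    for words in split_gz_lines(logfile):
        print(",".join(words))
```

With the two sample lines from the question, this prints the comma-separated output the question was expecting, without blank lines in between.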