Pandas to_records keeps turning my bytes column into str

Time:05-21

I am reading a CSV into a pandas DataFrame and turning it into a list of tuples, which I currently do with to_records(). However, one of my columns, which holds bytes, keeps getting turned into a string: for example, b'\x00\x01\x02\x03\x04' becomes "b'\x00\x01\x02\x03\x04'". I want pandas to keep the original bytes values. Is there any way to achieve this?

This is what the dataframe looks like:

col1 col2
True b'\\x00\\x01\\x02\\x03\\x04'
False b'\\x05\\x06\\x07\\x08\\x09'

But when I turn it into a list of tuples using to_records, it looks like this: [(True, "b'\\x00\\x01\\x02\\x03\\x04'"), (False, "b'\\x05\\x06\\x07\\x08\\x09'")]

^ As you can see, the bytes get turned into a string.

CodePudding user response:

As @mozway suggests, your values are already strings that merely look like bytes. Try using pd.eval to parse them:

>>> df.assign(col2=pd.eval(df['col2'])).to_records()
rec.array([(0,  True, b'\\x00\\x01\\x02\\x03\\x04'),
           (1, False, b'\\x05\\x06\\x07\\x08\\x09')],
          dtype=[('index', '<i8'), ('col1', '?'), ('col2', 'O')])


>>> df.to_records()
rec.array([(0,  True, "b'\\\\x00\\\\x01\\\\x02\\\\x03\\\\x04'"),
           (1, False, "b'\\\\x05\\\\x06\\\\x07\\\\x08\\\\x09'")],
          dtype=[('index', '<i8'), ('col1', '?'), ('col2', 'O')])
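
An alternative sketch, not taken from the answer above: the standard library's ast.literal_eval can parse each string into a real bytes object, and it only accepts Python literals. The CSV text and the names csv_text and records are made up for illustration. The sample assumes single backslashes in the file; if the file contains doubled backslashes, as the output above suggests, the parsed bytes will contain literal backslash characters instead.

```python
import ast
import io

import pandas as pd

# Hypothetical CSV standing in for the asker's file (single backslashes on disk).
csv_text = (
    "col1,col2\n"
    "True,b'\\x00\\x01\\x02\\x03\\x04'\n"
    "False,b'\\x05\\x06\\x07\\x08\\x09'\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# read_csv does not produce bytes values, so col2 arrives as plain strings.
print(type(df.loc[0, "col2"]))  # <class 'str'>

# Parse each string as a Python literal, which turns "b'...'" into bytes.
df["col2"] = df["col2"].map(ast.literal_eval)

# index=False drops the index field; tolist() gives a plain list of tuples.
records = df.to_records(index=False).tolist()
print(records)
```

Passing index=False matters if you want tuples of only (col1, col2); without it, to_records adds the index as the first field, as in the outputs above.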