How to Convert Python List with Bytes to Pandas DataFrame?

Time:02-02

The input is given as:

rec = [b'1674278797,14.33681', b'1674278798,6.03617', b'1674278799,12.78418']

I want to get a DataFrame like this:

df
    timestamp       val
0  1674278797  14.33681
1  1674278798   6.03617
2  1674278799  12.78418

What is the most efficient way? Thanks!

If I could convert rec to something like [[1674278797,14.33681], [1674278798,6.03617], [1674278799,12.78418]], it would be easy: I could just call df = pd.DataFrame(rec, columns=['timestamp','val']). But I don't know how to do that conversion quickly.

By the way, I got rec from a Redis list. I can modify the format of each element (for example, b'1674278797,14.33681' is one element) if necessary.

CodePudding user response:

If you can't directly handle the original input, you can use:

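import pandas as pd

(pd.Series([x.decode('utf-8') for x in rec])
   .str.split(',', expand=True)
   .set_axis(['timestamp', 'val'], axis=1)
   # str.split yields strings, so cast to numeric dtypes explicitly
   .astype({'timestamp': 'int64', 'val': 'float64'})
)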

Or:

import io

import pandas as pd

pd.read_csv(io.StringIO('\n'.join([x.decode('utf-8') for x in rec])),
            header=None, names=['timestamp', 'val'])

Output:

    timestamp       val
0  1674278797  14.33681
1  1674278798   6.03617
2  1674278799  12.78418
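
A variation on the read_csv approach, as a minimal sketch: assuming rec holds the three elements from the question, the bytes can be joined with newlines and wrapped in io.BytesIO, so no per-element decode is needed. The name df_csv is only illustrative, and whether this is faster on real data is untested here.

```python
import io

import pandas as pd

# Sample input copied from the question.
rec = [b'1674278797,14.33681', b'1674278798,6.03617', b'1674278799,12.78418']

# Join the raw bytes into one CSV payload; read_csv accepts a binary
# file-like object and infers numeric dtypes for both columns.
df_csv = pd.read_csv(io.BytesIO(b'\n'.join(rec)),
                     header=None, names=['timestamp', 'val'])

print(df_csv.dtypes)
```

Because read_csv does the parsing, timestamp comes back as an integer column and val as a float column without any extra cast.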

CodePudding user response:

You can do this in one line:

pd.DataFrame([x.decode().split(",") for x in rec], columns=["timestamp","val"])

Returns

    timestamp       val
0  1674278797  14.33681
1  1674278798   6.03617
2  1674278799  12.78418

Note that both columns hold strings at this point, because split() returns strings. To get numeric columns, add .astype({"timestamp": "int64", "val": "float64"}) to the end of the line.
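
Put together, the one-liner plus the dtype conversion might look like the sketch below. It assumes rec is the sample list from the question; the variable names df and typed are only for illustration.

```python
import pandas as pd

# Sample input copied from the question.
rec = [b'1674278797,14.33681', b'1674278798,6.03617', b'1674278799,12.78418']

# Decode each element and split on the comma; every cell is still a string here.
df = pd.DataFrame([x.decode().split(",") for x in rec],
                  columns=["timestamp", "val"])

# Cast the string columns to numeric dtypes.
typed = df.astype({"timestamp": "int64", "val": "float64"})

print(typed.dtypes)
```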
