Parquet Converts DataFrame Tuples to Lists

Time:12-30

Parquet seems unable to round-trip a DataFrame containing a tuple: the tuple comes back as a list. Is this by design or a bug? Lists and dicts are restored as expected, and pickle saves and reads a tuple as expected. The example below saves a DataFrame consisting of a single tuple. When read back, it's a list.

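import pandas as pd

# One row, one column, whose single cell holds the tuple (0, 1)
df = pd.DataFrame([[(0,1)]], columns=['tuple'])
print(df)

# Write to Parquet, then read it back
df.to_parquet('t')
df2 = pd.read_parquet('t', engine='pyarrow')
print(df2)  # the cell no longer holds a tuple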
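For comparison, the pickle behaviour mentioned above can be checked without pandas at all. This is only a small sketch using the standard library; the variable names are made up for illustration.

```python
import pickle

original = (0, 1)

# Serialize to bytes and load back; pickle records the Python type itself
restored = pickle.loads(pickle.dumps(original))

print(type(restored).__name__, restored == original)
```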

CodePudding user response:

I have used Parquet files for some time now, but for some reason I never had a DataFrame with tuples.

I tested this with a small DataFrame (I think that's what you experienced as well). At the time of saving, column1 holds a tuple.

When I read the file back, though, column1 comes back as a list.
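As far as I can tell, this is expected rather than a bug: the Arrow/Parquet type system has list and struct types but no tuple type, so the tuple is stored as a list and the original Python type is not recorded. If tuples are needed after reading, one option is to convert the values back by hand. Below is a minimal sketch of that idea; as_tuple is a hypothetical helper, not part of pandas or pyarrow, and it assumes each cell comes back as a list or a NumPy array (which of the two may depend on the engine and version).

```python
def as_tuple(value):
    """Convert a list-like value read back from Parquet into a tuple.

    Hypothetical helper for illustration; not a pandas or pyarrow API.
    """
    # NumPy arrays expose tolist(); use it so we work with plain Python objects
    if hasattr(value, "tolist"):
        value = value.tolist()
    if isinstance(value, (list, tuple)):
        # Recurse so nested sequences become tuples as well
        return tuple(as_tuple(item) for item in value)
    # Scalars, None, strings and dicts are returned unchanged
    return value


restored = as_tuple([0, 1])
print(restored)
```

Applied to the example in the question, something like df2['tuple'] = df2['tuple'].map(as_tuple) should turn each cell back into a tuple. Note that this also converts columns that were genuinely lists before saving, so it only makes sense for columns known to have held tuples.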
