Parquet seems unable to round-trip a DataFrame containing a tuple: the tuple comes back as a list. Is this by design or a bug? Lists and dicts are restored as expected, and pickle saves and reads a tuple as expected. The example below saves a DataFrame consisting of a single tuple. When read back, it's a list.
import pandas as pd
df = pd.DataFrame([[(0,1)]], columns=['tuple'])
print(df)
df.to_parquet('t')
df2 = pd.read_parquet('t', engine='pyarrow')
print(df2)
CodePudding user response:
I have used parquet files for some time now, but for some reason I never had a df with tuples.
I tested this (I think it matches what you experienced). At the time of saving df, column1 holds a tuple.
When I read the file back, though, column1 comes back as a list.