I have a column in pandas dataframe that has the following structure (see example). I think I have a nested dictionary in a single column, and I want each key to have it's own column. I want all the matching keys to be the same column. Run the examples for more details
import pandas as pd
import numpy as np
data = ["{'age': 59, 'gender': 'Female','pain': 'No', 'date': '2022-09-29'}",
"{'gender': 'Male', 'date': '2022-10-11'}",
"{'age': 18, 'date': '2022-10-11', 'pain': 'No'}",
]
original_df = pd.DataFrame(data, columns=['test'])
original_df.head()
I want to explode the dataframe so that it has the following structure:
data = [[59,'Female','No','2022-09-29'], [np.nan,'Male',np.nan,'2022-10-11'], [18,np.nan,'2022-10-11','No']]
# Create the pandas DataFrame
desired_df = pd.DataFrame(data, columns=['age','gender','date','pain'])
desired_df.head()
CodePudding user response:
Using json.normalize
from ast import literal_eval
import pandas as pd
original_df["test"] = original_df["test"].apply(literal_eval)
df = original_df.join(pd.json_normalize(original_df.pop("test")))
print(df)
age gender pain date
0 59.0 Female No 2022-09-29
1 NaN Male NaN 2022-10-11
2 18.0 NaN No 2022-10-11