In my pandas dataframe, I have a column formatted like a dictionary:
What I want to do is extract data from this column and add two columns like this:
In other words, I want to split each entry at the ":" so that the key goes into one new column and the value into another.
Is there an easy way to do this?
CodePudding user response:
Assuming all values in df.column1 are JSON-formatted strings containing a single key:
- Parse each row as JSON.
- Convert each row to a (key, value) tuple.
- Create a new DataFrame with 2 columns from the list of 2-tuples.
import json
import pandas as pd
df = pd.DataFrame(
['{"x": 1}', '{"y": 2}', '{"z": 3}'],
columns=["column1"],
)
df2 = pd.DataFrame(
df.column1.map(lambda x: json.loads(x).popitem()).tolist(),
columns=["column2", "column3"],
)
print(df)
print(df2)
Output
column1
0 {"x": 1}
1 {"y": 2}
2 {"z": 3}
column2 column3
0 x 1
1 y 2
2 z 3
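If you would rather keep the new columns on the original DataFrame and fail loudly when a row breaks the single-key assumption, something like the following might work. This is only a sketch: the helper name split_single_key is made up for illustration, and it assumes every row is a JSON object string.

```python
import json
import pandas as pd


def split_single_key(raw):
    """Parse a JSON object string and return its only (key, value) pair.

    Hypothetical helper, not part of pandas. Raises ValueError when the
    parsed value is not a dict with exactly one key.
    """
    parsed = json.loads(raw)
    if not isinstance(parsed, dict) or len(parsed) != 1:
        raise ValueError(f"expected a JSON object with exactly one key, got: {raw!r}")
    return next(iter(parsed.items()))


df = pd.DataFrame(
    ['{"x": 1}', '{"y": 2}', '{"z": 3}'],
    columns=["column1"],
)

# Build the two new columns on the same index, then join them onto df.
pairs = df["column1"].map(split_single_key).tolist()
new_cols = pd.DataFrame(pairs, columns=["column2", "column3"], index=df.index)
df = df.join(new_cols)
print(df)
```

Passing index=df.index keeps the rows aligned even if df does not use the default 0..n-1 index.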
CodePudding user response:
If each entry in 'column1' is an actual dict (not a string) holding only one key: value pair, then you can add the new columns by calling the items method and taking the first tuple:
df[['column2', 'column3']] = pd.DataFrame(df['column1'].apply(lambda x: list(x.items())[0]).tolist(), index=df.index)
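For reference, here is a small self-contained run of that one-liner. The sample data is assumed (it mirrors the values used in the other answer), and the column holds real dict objects rather than strings; with strings, x.items() would raise an AttributeError.

```python
import pandas as pd

# Assumed sample data: each cell is a dict with exactly one key.
df = pd.DataFrame({"column1": [{"x": 1}, {"y": 2}, {"z": 3}]})

# Take the first (and only) (key, value) pair of each dict.
pairs = df["column1"].apply(lambda x: list(x.items())[0]).tolist()

# Expand the list of 2-tuples into two new columns, aligned on df's index.
df[["column2", "column3"]] = pd.DataFrame(pairs, index=df.index)
print(df)
```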