I have a pandas series containing a list of dictionaries. I'd like to parse the contents of the dicts with some condition and store the results into new columns.
Here's some data to work with:
import pandas as pd
df = pd.DataFrame({'d': [[{'br': 1, 'ba': 1, 'r': 100},
{'ba': 1, 'r': 80},
{'br': 2, 'ba': 1, 'r': 150},
{'br': 1, 'ba': 1, 'r': 90}],
[{'br': 1, 'ba': 1, 'r': 100},
{'ba': 1, 'r': 80},
{'br': 2, 'ba': 1, 'r': 150}]],
'id': ['xxas', 'yxas'],
'name': [A, B]
})
I'd like to parse the contents of each dictionary with some conditional logic. Check for each dicts in the list and name columns as keys br
and ba
. Get the value
of r
key
as assign as column value. If key is not found, br
in this example, assign 0
as the value. Expected output:
id br ba r name
xxas 1 0 100 A
xxas 0 1 80 A
xxas 2 1 150 A
xxas 1 1 90 A
yxas 1 1 100 B
yxas 0 1 80 B
yxas 2 1 150 B
CodePudding user response:
Try:
df = pd.concat([df.pop("d").explode().apply(pd.Series).fillna(0), df], axis=1)
print(df[["id", "br", "ba", "r", "name"]].astype(int, errors="ignore"))
Prints:
id br ba r name
0 xxas 1 1 100 A
0 xxas 0 1 80 A
0 xxas 2 1 150 A
0 xxas 1 1 90 A
1 yxas 1 1 100 B
1 yxas 0 1 80 B
1 yxas 2 1 150 B
CodePudding user response:
explode
approach as a pipeline:
df2 = (df
.explode('d', ignore_index=True)
.pipe(lambda d: d.drop(columns='d')
.join(d['d'].apply(pd.Series)
.fillna(0, downcast='infer')))
)
Output:
id name br ba r
0 xxas A 1 1 100
1 xxas A 0 1 80
2 xxas A 2 1 150
3 xxas A 1 1 90
4 yxas B 1 1 100
5 yxas B 0 1 80
6 yxas B 2 1 150
Alternatively using concat
and a dictionary comprehension:
df2 = (pd
.concat({k: pd.DataFrame(v)
for k,v in df['d'].iteritems()})
.fillna(0, downcast='infer')
.droplevel(1)
.join(df.drop(columns='d'))
)
Output:
br ba r id name
0 1 1 100 xxas A
0 0 1 80 xxas A
0 2 1 150 xxas A
0 1 1 90 xxas A
1 1 1 100 yxas B
1 0 1 80 yxas B
1 2 1 150 yxas B