Sum multiple rows of dictionaries in a dataframe, based on condition

Time:03-21

How can I sum the values of multiple dictionaries, key by key, across rows that share the same isolate name?

Example dataframe:

Isolate dictionary
VM20030364 {'L': 200, 'V': 500, 'T': 300, 'A': 400, 'S': 1}
VM20030364 {'L': 200, 'V': 600, 'T': 300, 'A': 450}
VM20030364 {'L': 100, 'V': 400, 'T': 300, 'A': 400, 'S': 1}
UNKNOWN-UW-1773 {'L': 500, 'V': 360, 'T': 340, 'A': 300, 'S': 1}
UNKNOWN-UW-1773 {'L': 500, 'V': 340, 'T': 340, 'A': 300, 'S': 2}
UNKNOWN-UW-1773 {'L': 500, 'V': 200, 'T': 350, 'A': 310}

Output dataframe:

Isolate dictionary
VM20030364 {'L': 500, 'V': 1500, 'T': 900, 'A': 1250, 'S': 2}
UNKNOWN-UW-1773 {'L': 1500, 'V': 900, 'T': 1030, 'A': 910, 'S': 3}

Answer:

Map the dictionary column to Counter objects, then group by Isolate and aggregate with sum. Counter objects support addition, so summing a group adds the values key by key:

from collections import Counter

df['dictionary'].map(Counter).groupby(df['Isolate']).sum().reset_index()

           Isolate                                          dictionary
0  UNKNOWN-UW-1773  {'L': 1500, 'V': 900, 'T': 1030, 'A': 910, 'S': 3}
1       VM20030364  {'L': 500, 'V': 1500, 'T': 900, 'A': 1250, 'S': 2}
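
To see what the aggregation does for each group, here is a minimal sketch of the same key-by-key sum using only the standard library. The names `rows` and `sum_by_isolate` are made up for illustration and are not part of pandas or of the answer above; the data is copied from the example dataframe.

```python
from collections import Counter

# Rows copied from the example dataframe as (Isolate, dictionary) pairs.
rows = [
    ("VM20030364", {"L": 200, "V": 500, "T": 300, "A": 400, "S": 1}),
    ("VM20030364", {"L": 200, "V": 600, "T": 300, "A": 450}),
    ("VM20030364", {"L": 100, "V": 400, "T": 300, "A": 400, "S": 1}),
    ("UNKNOWN-UW-1773", {"L": 500, "V": 360, "T": 340, "A": 300, "S": 1}),
    ("UNKNOWN-UW-1773", {"L": 500, "V": 340, "T": 340, "A": 300, "S": 2}),
    ("UNKNOWN-UW-1773", {"L": 500, "V": 200, "T": 350, "A": 310}),
]


def sum_by_isolate(pairs):
    """Hypothetical helper: add up the dictionaries key by key per isolate."""
    totals = {}
    for isolate, counts in pairs:
        # Counter.update adds to existing keys and creates missing ones,
        # so a row without 'S' simply contributes nothing to 'S'.
        totals.setdefault(isolate, Counter()).update(counts)
    # Convert back to plain dicts, matching the requested output.
    return {isolate: dict(counter) for isolate, counter in totals.items()}


totals = sum_by_isolate(rows)
print(totals["VM20030364"])
print(totals["UNKNOWN-UW-1773"])
```

With the dataframe from the question, the pairs could presumably be supplied as zip(df['Isolate'], df['dictionary']). One difference worth knowing: adding Counters with + drops any key whose total is zero or negative, whereas update keeps it, which only matters if the dictionaries can contain non-positive values.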