Home > database >  Group rows of dataframe according to dictionary in pandas and sum corresponding numerators
Group rows of dataframe according to dictionary in pandas and sum corresponding numerators

Time:11-06

I have a dataframe of error codes, numerators, and their denominators as well as a dictionary of each possible error code and its overarching error type. I'd like to group the error codes in the dataframe by the error type, sum their numerators, and keep the column of denominators as is. I've tried

grouped = day1_xione.set_index('error_code').groupby(by=map_values,axis=1).sum().reset_index()
grouped

but it only keeps the error codes and doesn't group them by type. This is what a sample of the data looks like:

Dataframe:

error_code Numerator Denominator
POWER CYCLE ON 5 10
POWER OFF 1 10
PANIC - PANIC -PANIC_1 5 10
TEMP_HIGH-TEMP 0 10
UNKNOWN-Reason_Unknown 9 10

for as many error codes that show up, 56 errors. Each error code is a string that directly matches a code in the dictionary.

Dictionary:

map_values = 
{'POWER CYCLE ON' : 'POWER RESET',
'POWER OFF': 'POWER RESET',
'PANIC - PANIC -PANIC_1': 'DEVICE PANIC',
'TEMP_HIGH-TEMP': 'OTHER',
'UNKNOWN-Reason_Unknown': 'OTHER'}

for 57 codes and 10 types.

Desired Output:

error_code Numerator Denominator
POWER RESET 6 10
DEVICE PANIC 5 10
OTHER 9 10

for all of the error codes that show up

CodePudding user response:

You were close, you need to groupby the mapped values using map and use agg for aggregation with different functions (sum and first)

(df.groupby(df['error_code'].map(map_values))
   .agg({'Numerator': 'sum', 'Denominator': 'first'})
   .reset_index()
)

output:

     error_code  Numerator  Denominator
0  DEVICE PANIC          5           10
1         OTHER          9           10
2   POWER RESET          6           10
  • Related