I have a dataframe of error codes, numerators, and their denominators as well as a dictionary of each possible error code and its overarching error type. I'd like to group the error codes in the dataframe by the error type, sum their numerators, and keep the column of denominators as is. I've tried
grouped = day1_xione.set_index('error_code').groupby(by=map_values,axis=1).sum().reset_index()
grouped
but it only keeps the error codes and doesn't group them by type. This is what a sample of the data looks like:
Dataframe:
error_code | Numerator | Denominator |
---|---|---|
POWER CYCLE ON | 5 | 10 |
POWER OFF | 1 | 10 |
PANIC - PANIC -PANIC_1 | 5 | 10 |
TEMP_HIGH-TEMP | 0 | 10 |
UNKNOWN-Reason_Unknown | 9 | 10 |
for as many error codes that show up, 56 errors. Each error code is a string that directly matches a code in the dictionary.
Dictionary:
map_values =
{'POWER CYCLE ON' : 'POWER RESET',
'POWER OFF': 'POWER RESET',
'PANIC - PANIC -PANIC_1': 'DEVICE PANIC',
'TEMP_HIGH-TEMP': 'OTHER',
'UNKNOWN-Reason_Unknown': 'OTHER'}
for 57 codes and 10 types.
Desired Output:
error_code | Numerator | Denominator |
---|---|---|
POWER RESET | 6 | 10 |
DEVICE PANIC | 5 | 10 |
OTHER | 9 | 10 |
for all of the error codes that show up
CodePudding user response:
You were close, you need to groupby
the mapped values using map
and use agg
for aggregation with different functions (sum
and first
)
(df.groupby(df['error_code'].map(map_values))
.agg({'Numerator': 'sum', 'Denominator': 'first'})
.reset_index()
)
output:
error_code Numerator Denominator
0 DEVICE PANIC 5 10
1 OTHER 9 10
2 POWER RESET 6 10