I have a DataFrame that looks something like this:
date mon tue wed thu fri sat sun
01-01-2022 2 3 5 7 8 1 0
02-01-2022 3 4 7 6 3 0 4
03-01-2022 4 8 7 9 1 2 5
04-01-2022 5 2 1 1 8 1 2
05-01-2022 6 1 9 3 7 1 1
My task is to find the sum of each column, compare the sums, and return the names of the columns with the highest and lowest sum. In this case, comparing the column sums shows that wed has the highest sum (29) and sat has the lowest (5). So my expected output is a printed line like this:
max number is seen on wed and min number is seen on sat.
Can someone please help me with an efficient way of doing this? Much appreciated.
CodePudding user response:
You can use set_index, sum and agg:
df.set_index('date').sum().agg(['idxmin', 'idxmax'])
output:
idxmin sat
idxmax wed
dtype: object
As a string:
s = df.set_index('date').sum().agg(['idxmin', 'idxmax'])
print(f"max number is seen on {s['idxmax']} and min number is seen on {s['idxmin']}.")
output:
max number is seen on wed and min number is seen on sat.
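For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question by hand (the real data presumably comes from somewhere else) and calls idxmax and idxmin directly on the column sums instead of going through agg. The names totals, max_day, min_day and message are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; in practice df would already exist.
df = pd.DataFrame({
    "date": ["01-01-2022", "02-01-2022", "03-01-2022", "04-01-2022", "05-01-2022"],
    "mon": [2, 3, 4, 5, 6],
    "tue": [3, 4, 8, 2, 1],
    "wed": [5, 7, 7, 1, 9],
    "thu": [7, 6, 9, 1, 3],
    "fri": [8, 3, 1, 8, 7],
    "sat": [1, 0, 2, 1, 1],
    "sun": [0, 4, 5, 2, 1],
})

# Move 'date' into the index so only the weekday columns are summed.
totals = df.set_index("date").sum()

# idxmax / idxmin return the label (column name) of the largest / smallest sum.
max_day = totals.idxmax()
min_day = totals.idxmin()

message = f"max number is seen on {max_day} and min number is seen on {min_day}."
print(message)
```

If two columns tie for the highest or lowest sum, idxmax and idxmin return the first matching label, so only one column name is reported.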