I am currently in the process of learning Pandas and I am stuck at an exercise.
My problem is as follows:
I need to calculate the number of total purchases by a cardholder in a month of more than $25000. For this I have been given a hint that I first must create a DataFrame that includes the total purchases by each cardholder in each month (using the groupby()
and sum()
methods), and then use this DataFrame to do the calculation.)
df_3= df[['Calendar Month','Cardholder Name']].groupby(df['Amount']).sum()
df_3
Which gives the following output
Amount Calendar Month Cardholder Name
-5924.00 5 JEFFRIES, T
-5522.21 11 LAIR, M
-4800.00 11 KENT, D
-4444.23 6 LAIR, M
-4364.50 2 FOISY, J
... ... ...
20876.68 3 JEFFRIES, T
27087.50 12 COLE, J
29585.16 2 JEFFRIES, T
57510.00 1 LACEY, L
62181.77 4 LACEY, L
I think that this initial Dataframe (df_3) is right, but I don't know how the actual calculation must be done to calculate the number of total purchases by a cardholder in a month of more than $25000.
CodePudding user response:
here is one way to do it. you store the groupby result in a var and then filter it on your condition.
gb=df.groupby(['Month','Cardholder Name'])['amount'].sum().to_frame()
gb[gb['amount'] > 25000]
amount
Month Cardholder Name
1 LACEY, L 57510.00
2 JEFFRIES, T 29585.16
4 LACEY, L 62181.77
12 COLE, J 27087.50