Home > Net >  check specific values in a dataframe and make the sum
check specific values in a dataframe and make the sum

Time:02-22

I want to make the sum of each 'Group' which have at least one 'Customer' with an 'Active' Bail.

Sample Input :


    Customer ID  Group     Bail       Amount
0    23453       NAFNAF    Active     200
1    23849       LINDT     Active     350
2    23847       NAFNAF    Inactive   100
3    84759       CARROUF   Inactive   20

For example 'NAFNAF' has 2 customers, including one with an active bail.

Output expected :

NAFNAF : 300
LINDT : 350
TOTAL ACTIVE: 650

I don't wanna change the original dataframe

CodePudding user response:

You can use:

(df.assign(Bail=df.Bail.eq('Active'))
   .groupby('Group')[['Bail', 'Amount']].agg('sum')
   .loc[lambda d: d['Bail'].ge(1), ['Amount']]
)

output:

        Amount
Group         
LINDT      350
NAFNAF     300

Full output with total:

df2 = (
 df.assign(Bail=df.Bail.eq('Active'))
   .groupby('Group')[['Bail', 'Amount']].agg('sum')
   .loc[lambda d: d['Bail'].ge(1), ['Amount']]
)

df2 = pd.concat([df2, df2.sum().to_frame('TOTAL').T])

output:

        Amount
LINDT      350
NAFNAF     300
TOTAL      650

CodePudding user response:

Create a boolean mask of Group with at least one active lease:

m = df['Group'].isin(df.loc[df['Bail'].eq('Active'), 'Group'])
out = df[m]

At this point, your filtered dataframe looks like:

>>> out
   Customer ID   Group      Bail  Amount
0        23453  NAFNAF    Active     200
1        23849   LINDT    Active     350
2        23847  NAFNAF  Inactive     100

Now you can use groupby and sum:

out = df[m].groupby('Group')['Amount'].sum()
out = pd.concat([out, pd.Series(out.sum(), index=['TOTAL ACTIVE'])])

# Output
LINDT           350
NAFNAF          300
TOTAL ACTIVE    650
dtype: int64
  • Related