Home > front end >  How to sum a single column based on multiple conditions in Python?
How to sum a single column based on multiple conditions in Python?

Time:12-08

The end goal is to sum the values in the "Price" column based on multiple conditions in other columns. I am unsure how to do it though.

import pandas as pd

#creating random dataset
rnd = pd.DataFrame({'Status':['Accepted', 'Denied', 'Accepted', 'Accepted', 'Denied'],
                    'Letter':['A 02-02-19', 'AB 10-31-21', 'A 03-07-18', 'CD 11-13-21', 'A 04-05-21'], 
                    'Fruit':['Apple', 'Orange', 'Blueberry', 'Orange', 'Apple'],
                    'Price':[10,20,14,15,29]})

#output of dataframe
rnd
  1. Needs to have "Accepted" as a value in the "Status" Column. I know this can be done by doing

    ''' net = rnd.loc(rnd["Status"] == "Accepted", "Price"].sum() '''

However, I also needed it to be summed based on what occurs in the "Letter" Column. I don't care about the random dates in the value, only the characters in the beginning of the string. AB would be grouped in a different Group than A, which would also be grouped differently than CD. I am trying to figure out how to sum "Price" if I only wanted those who had "A" in the Letter column and "Accepted" in the Status column.

CodePudding user response:

Try:

rnd.query('Status == "Accepted"')\
   .groupby(rnd['Letter'].str.split(' ').str[0])['Price'].sum()

Output:

Letter
A     24
CD    15
Name: Price, dtype: int64

CodePudding user response:

In general:

foo = self.db[(self.db['COL_NAME'] == val1) & (self.db['OTHER_COL'] != 0)]['COL_TO_SUM'].sum()
  • Related