Create function to loop through a pandas dataframe-CodePudding

I have a dataframe that looks like this.

Now i want to create a function to get the sum of the 'Total' column for each of the name 'ABC','XYZ'... with a pre-defined condition of the Currency. In other words, i want my output to be the same as the following

Name_list = list(df['Name'])
Currency_list = list(df['Currency'])

def Calculation(a,b):
    for a in Name_list:
        for b in Currency_list:
            return df['Total'].sum()

result = Calculation('ABC','VND')
print(result)

So far what I have got is just the sum of the whole column 'Total'. Anyone has ideas how to get the desired result for this?

CodePudding user response：

I think pandas' groupby method may help you here:

df.groupby(['Currency', 'Name'])['Total'].sum()

Link to official docs: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html

CodePudding user response：

In your code there is actually nothing that force the function to use only the lines you want to select in your dataframe. This can be done by using the loc function to locate the lines you want to sum on.

In your example, if you replace le the line return df['Total'].sum() by this one : return df.loc[(df['Name'] == a) & (df['Currency'] == b), 'Total').sum() It should return what you want.