Home > Back-end >  Calculate sum of donation money based on value in other column in pandas
Calculate sum of donation money based on value in other column in pandas

Time:10-27

I'm trying to calculate the monetary sum of campaign contribution data from the real estate industry.

realestate_counter = 0
realestate_donations = 0
for row in range(df.shape[0]): #for each row in the dataframe
    if 'realestate' in df.iloc[row]['Occupation']:
        realestate_donations = realestate_donations   int(row)
        print(df.iloc[row]['Amount'])
        realestate_counter = realestate_counter   1
#         print('-------')
print('$'   str(realestate_donations)  ' was doanted from the real estate industry.')
print('There are '   str(realestate_counter)  ' Real Estate donors.')

My dataframe has a variety of occupations, but I'm only interested in real estate. The code block prints the donation value where the occupation is realestate and attempts to sum it.

I take the numbers printed by the print(df.iloc[row]['Amount]) and paste them into a Google sheet and sum there.

That amount vastly differes from the amount this line of code calculates ealestate_donations = realestate_donations int(row)

Google sheets: 72200

My code: 23973

I suspect my code is failing somewhere and the Google sheets is accurate.

I've copied the entire printed output below:

500.0
200.0
1000.0
1000.0
1000.0
1000.0
100.0
50.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
-1000.0
1000.0
-1000.0
-1000.0
1000.0
1000.0
1000.0
1000.0
500.0
300.0
100.0
1000.0
1000.0
1000.0
1000.0
100.0
250.0
100.0
250.0
250.0
500.0
1000.0
$23973 was doanted from the real estate industry.
There are 88 Real Estate donors.

CodePudding user response:

This is a slimmed down example but should get you what you want

df = pd.DataFrame({'occupation':['real estate','real estate','real estate','banker','baker'], 'donation':range(11,16)})

    occupation  donation
0  real estate        11
1  real estate        12
2  real estate        13
3       banker        14
4        baker        15

# filter on occupation
df[df['occupation']=='real estate']

    occupation  donation
0  real estate        11
1  real estate        12
2  real estate        13

# sum of all donations
df[df['occupation']=='real estate']['donation'].sum()

36

# count of all donations
df[df['occupation']=='real estate']['donation'].count()

3

Groupby comparisons

df.groupby('occupation').sum()

             donation
occupation
baker              15
banker             14
real estate        36

df.groupby('occupation').count()

             donation
occupation
baker               1
banker              1
real estate         3
  • Related