I'm trying to calculate the monetary sum of campaign contribution data from the real estate industry.
realestate_counter = 0
realestate_donations = 0
for row in range(df.shape[0]): #for each row in the dataframe
if 'realestate' in df.iloc[row]['Occupation']:
realestate_donations = realestate_donations int(row)
print(df.iloc[row]['Amount'])
realestate_counter = realestate_counter 1
# print('-------')
print('$' str(realestate_donations) ' was doanted from the real estate industry.')
print('There are ' str(realestate_counter) ' Real Estate donors.')
My dataframe has a variety of occupations, but I'm only interested in real estate. The code block prints the donation value where the occupation is realestate
and attempts to sum it.
I take the numbers printed by the print(df.iloc[row]['Amount])
and paste them into a Google sheet and sum there.
That amount vastly differes from the amount this line of code calculates ealestate_donations = realestate_donations int(row)
Google sheets: 72200
My code: 23973
I suspect my code is failing somewhere and the Google sheets is accurate.
I've copied the entire printed output below:
500.0
200.0
1000.0
1000.0
1000.0
1000.0
100.0
50.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
1000.0
-1000.0
1000.0
-1000.0
-1000.0
1000.0
1000.0
1000.0
1000.0
500.0
300.0
100.0
1000.0
1000.0
1000.0
1000.0
100.0
250.0
100.0
250.0
250.0
500.0
1000.0
$23973 was doanted from the real estate industry.
There are 88 Real Estate donors.
CodePudding user response:
This is a slimmed down example but should get you what you want
df = pd.DataFrame({'occupation':['real estate','real estate','real estate','banker','baker'], 'donation':range(11,16)})
occupation donation
0 real estate 11
1 real estate 12
2 real estate 13
3 banker 14
4 baker 15
# filter on occupation
df[df['occupation']=='real estate']
occupation donation
0 real estate 11
1 real estate 12
2 real estate 13
# sum of all donations
df[df['occupation']=='real estate']['donation'].sum()
36
# count of all donations
df[df['occupation']=='real estate']['donation'].count()
3
Groupby comparisons
df.groupby('occupation').sum()
donation
occupation
baker 15
banker 14
real estate 36
df.groupby('occupation').count()
donation
occupation
baker 1
banker 1
real estate 3