How can I get the sum of one column based on year, which is stored in another column?

Time:12-15

I have this code.

cheese_sums = []

for year in milk_products.groupby(milk_products['Date']):
    total = milk_products[milk_products['Date'] == year]['Cheddar Cheese Production (Thousand Tonnes)'].sum()
    cheese_sums.append(total)
    
print(cheese_sums)

I am trying to sum the Cheddar Cheese Production values for each year. They are stored as floats in the milk_products data frame. The Date column is a datetime column that holds only the year, so each year appears 12 times, once per month. As written, the code only prints a list of six 0.0's.

CodePudding user response:

I got it. It should be:

cheese_sums = []

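# Loop over each distinct year once. De-duplicating on the total instead
# would drop a year whose sum happens to equal another year's sum.
for year in milk_products['Date'].unique():
    total = milk_products[milk_products['Date'] == year]['Cheddar Cheese Production (Thousand Tonnes)'].sum()
    cheese_sums.append(total)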
    
print(cheese_sums)
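
For what it's worth, the likely reason the loop in the question never matched anything is that iterating over a groupby object yields (key, group) pairs, so `year` there was a pair rather than a single date. Below is a minimal sketch of unpacking that pair. The `sample` frame and its numbers are made up for illustration; only the column names come from the question.

```python
import pandas as pd

col = 'Cheddar Cheese Production (Thousand Tonnes)'

# Hypothetical sample data: two years, two rows each (the real data has 12 per year)
sample = pd.DataFrame({
    'Date': pd.to_datetime(['2015-01-01', '2015-01-01', '2016-01-01', '2016-01-01']),
    col: [1.0, 2.0, 3.0, 4.5],
})

cheese_sums = []
# Each iteration yields a (key, group) pair, so unpack it into two names
for year, group in sample.groupby('Date'):
    # group already holds only the rows for this year
    cheese_sums.append(float(group[col].sum()))

print(cheese_sums)  # [3.0, 7.5]
```

With the pair unpacked, no boolean filtering on the Date column is needed, because each `group` is already the subset of rows for that year.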

CodePudding user response:

You're overcomplicating this. Try groupby(...).sum():

df = milk_products.groupby('Date').sum()
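
Note that groupby('Date').sum() sums every numeric column. If only the cheese totals are wanted, the column can be selected before summing. Here is a small sketch with made-up data (`milk_sample` and its values are assumptions, not the asker's data). Grouping by `.dt.year` is also an assumption, in case the dates in the real frame differ by month within a year.

```python
import pandas as pd

col = 'Cheddar Cheese Production (Thousand Tonnes)'

# Hypothetical sample data standing in for milk_products
milk_sample = pd.DataFrame({
    'Date': pd.to_datetime(['2015-01-01', '2015-02-01', '2016-01-01', '2016-02-01']),
    col: [1.0, 2.0, 3.0, 4.5],
})

# Group by the year part of the date, then sum only the cheese column
yearly = milk_sample.groupby(milk_sample['Date'].dt.year)[col].sum()

# Plain Python list of per-year totals, in ascending year order
cheese_sums = yearly.tolist()

print(yearly)
print(cheese_sums)  # [3.0, 7.5]
```

The result `yearly` is a Series indexed by year, so a single year's total can be looked up with `yearly[2015]`.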