I have a dataframe similar to this:
import pandas as pd
data = {'product_name': ['2000', '2001', '2000', '2001', '2002', '2002', '2001', '2000'],
'price': [1200, 150, 300, 450, 200, 300, 450, 200],
'quantity': [50, 15, 30, 450, 20, 30, 40, 27]
}
df = pd.DataFrame(data)
print(df)
For example, I want to calculate the average price and quantity for each year in the 'product_name' column (2000, 2001, 2002) and save the results to use later. I have tried several approaches, but none of them worked. I can do it one year at a time, but that is not ideal. Is there a way to write a loop for this? Roughly: take column 'product_name', pick a year (say 2000), take column 'price', calculate the average, save it, then repeat for 2001, 2002, and so on.
Any help is appreciated.
CodePudding user response:
import pandas as pd
data = {'product_name': ['2000', '2001', '2000', '2001', '2002', '2002', '2001', '2000'],
'price': [1200, 150, 300, 450, 200, 300, 450, 200],
'quantity': [50, 15, 30, 450, 20, 30, 40, 27]
}
df = pd.DataFrame(data)
# Group the rows by year (stored as strings in 'product_name') and average each column
average_price = df.groupby(['product_name'])['price'].mean()
average_quantity = df.groupby(['product_name'])['quantity'].mean()
average_price:
product_name | price |
---|---|
2000 | 566.67 |
2001 | 350 |
2002 | 250 |
average_quantity:
product_name | quantity |
---|---|
2000 | 35.67 |
2001 | 168.33 |
2002 | 25 |
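Both averages can also be computed in one groupby call and kept in a single object for later use. This is a minimal sketch using the same data as the question; the names `averages` and `price_2001` are only illustrative.

```python
import pandas as pd

data = {'product_name': ['2000', '2001', '2000', '2001', '2002', '2002', '2001', '2000'],
        'price': [1200, 150, 300, 450, 200, 300, 450, 200],
        'quantity': [50, 15, 30, 450, 20, 30, 40, 27]}
df = pd.DataFrame(data)

# One groupby: mean of both numeric columns for each year in 'product_name'
averages = df.groupby('product_name')[['price', 'quantity']].mean()

# The result is indexed by year (as strings here), so a saved value
# can be looked up later by row label and column name
price_2001 = averages.loc['2001', 'price']
print(averages)
```

Because the years are stored as strings in the question's data, the lookup uses '2001' rather than the integer 2001.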
CodePudding user response:
df.groupby('product_name').mean()
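This one-liner works here because every column other than 'product_name' is numeric. To "save them to use later" as the question asks, the result can be turned into a plain dictionary. The sketch below also shows the explicit loop the question describes, for comparison; `saved` and `loop_saved` are hypothetical names, not part of the original answer.

```python
import pandas as pd

data = {'product_name': ['2000', '2001', '2000', '2001', '2002', '2002', '2001', '2000'],
        'price': [1200, 150, 300, 450, 200, 300, 450, 200],
        'quantity': [50, 15, 30, 450, 20, 30, 40, 27]}
df = pd.DataFrame(data)

# Vectorized version: one row of means per year
result = df.groupby('product_name').mean()

# Nested dict keyed by year, e.g. saved['2000']['price']
saved = result.to_dict(orient='index')

# Equivalent explicit loop: iterating a groupby yields (key, sub-frame) pairs
loop_saved = {}
for year, group in df.groupby('product_name'):
    loop_saved[year] = {'price': group['price'].mean(),
                        'quantity': group['quantity'].mean()}

print(saved['2001']['price'])
```

The loop gives the same numbers as the groupby result, so the vectorized form is usually the simpler choice.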