Select specific rows and columns - loop - dataframe

Time:07-06

I have a dataframe similar to this:

import pandas as pd

data = {'product_name': ['2000', '2001', '2000', '2001', '2002', '2002', '2001', '2000'],
        'price': [1200, 150, 300, 450, 200, 300, 450, 200],
        'quantity': [50, 15, 30, 450, 20, 30, 40, 27]
        }

df = pd.DataFrame(data)

print(df)

I want to calculate the average price and quantity for each year (2000, 2001, 2002) and save the results for later use. I have tried several approaches, but none of them worked. I can do it one year at a time, but that is not ideal. Is there a way to write a loop that does this: take column 'product_name', pick a year (say 2000), compute the average of column 'price', save it, then repeat for 2001, 2002, and so on?

Any help is appreciated.

CodePudding user response:

import pandas as pd

data = {'product_name': ['2000', '2001', '2000', '2001', '2002', '2002', '2001', '2000'],
        'price': [1200, 150, 300, 450, 200, 300, 450, 200],
        'quantity': [50, 15, 30, 450, 20, 30, 40, 27]
        }
df = pd.DataFrame(data)

average_price = df.groupby(['product_name'])['price'].mean()

average_quantity = df.groupby(['product_name'])['quantity'].mean()

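average_price:

product_name
2000    566.666667
2001    350.000000
2002    250.000000
Name: price, dtype: float64

average_quantity:

product_name
2000     35.666667
2001    168.333333
2002     25.000000
Name: quantity, dtype: float64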
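
If it helps, both averages can also be computed in one call and kept in a single DataFrame for later lookups. This is only a minimal sketch using the same data as the question; the names `averages` and `price_2001` are illustrative and not from the original post. Note that the year labels are strings here, so lookups use '2001' rather than 2001.

```python
import pandas as pd

data = {'product_name': ['2000', '2001', '2000', '2001', '2002', '2002', '2001', '2000'],
        'price': [1200, 150, 300, 450, 200, 300, 450, 200],
        'quantity': [50, 15, 30, 450, 20, 30, 40, 27]
        }
df = pd.DataFrame(data)

# One groupby, two columns: the result is a DataFrame indexed by product_name
averages = df.groupby('product_name')[['price', 'quantity']].mean()

# Retrieve a single saved value later by year label and column name
price_2001 = averages.loc['2001', 'price']
print(averages)
print(price_2001)
```

Keeping the result as one DataFrame means the values stay aligned by year, which tends to be easier to reuse than two separate Series.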

CodePudding user response:

df.groupby('product_name').mean()
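
Since the question asks specifically for a loop, here is a rough sketch of the same calculation written out explicitly, assuming the df from the question. The dictionary name `results` is made up for illustration. Iterating over a groupby object yields one (key, sub-DataFrame) pair per year, so each pass matches the "pick year, calculate, save, repeat" steps described above.

```python
import pandas as pd

data = {'product_name': ['2000', '2001', '2000', '2001', '2002', '2002', '2001', '2000'],
        'price': [1200, 150, 300, 450, 200, 300, 450, 200],
        'quantity': [50, 15, 30, 450, 20, 30, 40, 27]
        }
df = pd.DataFrame(data)

# Collect the per-year averages in a plain dict keyed by year label
results = {}
for year, group in df.groupby('product_name'):
    # 'group' holds only the rows for this year
    results[year] = {
        'price': group['price'].mean(),
        'quantity': group['quantity'].mean(),
    }

print(results['2002'])
```

The loop gives the same numbers as the one-line groupby call; the one-liner is usually preferable unless each year needs extra custom processing.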