Home > front end >  Getting mean value with multiple conditions(by columns) on pandas Dataframe
Getting mean value with multiple conditions(by columns) on pandas Dataframe

Time:10-21

I recently start using pandas but couldn't find the answer for this. (Maybe just because I don't know the query words to search)

Here's the example code

test1 = {'col1' : ["A","A","A","A","B","B","B","B"],
         'col2' : ["C","D","C","D","C","D","C","D"],
         'year' :["2012","2012","2013","2013","2012","2012","2013","2013"],
         'value' : [3,4,25,1,3,2,1,3]}
df_t = pd.DataFrame(data=test1)
df_t

In this DataFrame, I want to calculate mean of values with some condition. col1 and col2 should be coupled(cuz it's trade data), so what I want to get is

Mean value of years of 'A(col1)', 'C(col2)'
Mean value of years of 'A', 'D'
Mean value of years of 'B', 'C'
...
etc.

As a result, I want a new DataFrame with 'col1', 'col2', 'mean_value' columns. But I think I can handle it if anyone teaches me how to get a mean value with those conditions above.

If anyone can enlighten me, that would be much appreciated.

Thank you for your interests!

Have a wonderful day.

CodePudding user response:

try df_t.groupby(['col1', 'col2'])['value'].mean().reset_index()

  col1 col2  value
0    A    C   14.0
1    A    D    2.5
2    B    C    2.0
3    B    D    2.5
  • Related