I recently start using pandas but couldn't find the answer for this. (Maybe just because I don't know the query words to search)
Here's the example code
test1 = {'col1' : ["A","A","A","A","B","B","B","B"],
'col2' : ["C","D","C","D","C","D","C","D"],
'year' :["2012","2012","2013","2013","2012","2012","2013","2013"],
'value' : [3,4,25,1,3,2,1,3]}
df_t = pd.DataFrame(data=test1)
df_t
In this DataFrame, I want to calculate mean of values with some condition. col1 and col2 should be coupled(cuz it's trade data), so what I want to get is
Mean value of years of 'A(col1)', 'C(col2)'
Mean value of years of 'A', 'D'
Mean value of years of 'B', 'C'
...
etc.
As a result, I want a new DataFrame with 'col1', 'col2', 'mean_value' columns. But I think I can handle it if anyone teaches me how to get a mean value with those conditions above.
If anyone can enlighten me, that would be much appreciated.
Thank you for your interests!
Have a wonderful day.
CodePudding user response:
try df_t.groupby(['col1', 'col2'])['value'].mean().reset_index()
col1 col2 value
0 A C 14.0
1 A D 2.5
2 B C 2.0
3 B D 2.5