How can I create a new data frame from existing columns? It should contain the average of column 'a' for each distinct value of x. For example, for x=1, a_new is the sum of the 'a' values where x=1 divided by 3, and likewise for x=2, x=3, and so on.
import pandas as pd
data = {'x': [ 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4], 'a': [0.4, 0.88, 0.2, 0.1, 0.75, 0.98, 0.33, 0.22, 0.15, 0.14, 0.73, 0.25], 'year': [2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002]}
df = pd.DataFrame(data)
df
x a year
0 1 0.40 2000
1 2 0.88 2000
2 3 0.20 2000
3 4 0.10 2000
4 1 0.75 2001
5 2 0.98 2001
6 3 0.33 2001
7 4 0.22 2001
8 1 0.15 2002
9 2 0.14 2002
10 3 0.73 2002
11 4 0.25 2002
Expected Output:
x a_new
0 1 0.43
1 2 0.66
2 3 0.42
3 4 0.19
Answer:
Group by x and take the mean of 'a'. Selecting the column before calling mean() avoids averaging 'year' as well:
df.groupby('x')['a'].mean()
x
1 0.433333
2 0.666667
3 0.420000
4 0.190000
Name: a, dtype: float64
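The result above is a Series indexed by x, not a data frame with an a_new column. If the layout from the expected output is needed, one possible sketch uses named aggregation with as_index=False. The variable name result and the rounding to two decimals are assumptions made here to match the question's format, not part of the original answer.

```python
import pandas as pd

data = {
    'x': [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
    'a': [0.4, 0.88, 0.2, 0.1, 0.75, 0.98, 0.33, 0.22, 0.15, 0.14, 0.73, 0.25],
    'year': [2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002],
}
df = pd.DataFrame(data)

# Named aggregation: the keyword (a_new) becomes the output column name,
# and as_index=False keeps x as a regular column instead of the index.
result = df.groupby('x', as_index=False).agg(a_new=('a', 'mean'))

# Assumption: round to two decimals to mirror the question's expected output.
result['a_new'] = result['a_new'].round(2)

print(result)
```

Named aggregation requires pandas 0.25 or later. On older versions, df.groupby('x')['a'].mean().reset_index(name='a_new') should give the same shape.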