How can I create a new data frame based on the existing columns?

Time:04-13

How can I create a new data frame based on existing columns? It should contain the average of column 'a' for each distinct value of x. For example, for x=1, a_new is the sum of the 'a' values where x=1, divided by 3 (the number of such rows), and likewise for x=2, x=3, and so on.

import pandas as pd
data = {
    'x': [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
    'a': [0.4, 0.88, 0.2, 0.1, 0.75, 0.98, 0.33, 0.22, 0.15, 0.14, 0.73, 0.25],
    'year': [2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002],
}
df = pd.DataFrame(data)
df

    x    a      year
0   1   0.40    2000
1   2   0.88    2000
2   3   0.20    2000
3   4   0.10    2000
4   1   0.75    2001
5   2   0.98    2001
6   3   0.33    2001
7   4   0.22    2001
8   1   0.15    2002
9   2   0.14    2002
10  3   0.73    2002
11  4   0.25    2002

Expected Output:

    x   a_new
0   1   0.43
1   2   0.66
2   3   0.42
3   4   0.19

CodePudding user response:

Group the rows by x and take the mean of column 'a' within each group:

df.groupby('x')['a'].mean()
x
1    0.433333
2    0.666667
3    0.420000
4    0.190000
Name: a, dtype: float64
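
The result above is a Series indexed by x. If you want a data frame with x as a regular column and the average named a_new, as in the expected output, one possible approach is pandas named aggregation. This is a minimal sketch that assumes a pandas version supporting named aggregation (0.25 or later); the variable name `result` and the rounding to two decimals are illustrative choices, not part of the original question.

```python
import pandas as pd

data = {
    'x': [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
    'a': [0.4, 0.88, 0.2, 0.1, 0.75, 0.98, 0.33, 0.22, 0.15, 0.14, 0.73, 0.25],
    'year': [2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001, 2002, 2002, 2002, 2002],
}
df = pd.DataFrame(data)

# Group by x and average column 'a', naming the output column a_new.
# as_index=False keeps x as an ordinary column instead of the index.
result = df.groupby('x', as_index=False).agg(a_new=('a', 'mean'))

# Optional (assumption): round to two decimals for display.
result['a_new'] = result['a_new'].round(2)

print(result)
```

The values are the same group means as in the Series above, only rounded and placed in a data frame with columns x and a_new.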
