This is what my dataset looks like:
    Temp (C)  Rel Hum (%)  Wind Spd (km/h)   St
0        8.1           81                4  0.0
1        8.1           79                4  2.0
2        8.1           78                4  3.0
3        8.1           80                4  3.0
4        8.1           78                4  2.0
5        8.1           78                4  3.0
6        8.1           81                4  3.0
7        8.1           78                4  2.0
8        8.1           80                4  3.0
9        8.1           78                4  2.0
10       8.1           77                4  3.0
11       8.1           81                4  3.0
12       8.1           82                4  2.0
13       8.1           78                4  3.0
14       8.1           79                4  3.0
I want to sum the "St" values over each block of n consecutive rows and replace those n rows with a single row that holds the sum. That single row should also hold the average of the other columns over the same n rows, for example "Rel Hum (%)".
With n=5, the result should look something like this:
   Temp (C)  Rel Hum (%)  Wind Spd (km/h)    St
0       8.1         79.2                4  10.0
1       8.1         79.0                4  13.0
2       8.1         79.4                4  14.0
I have tried different approaches, such as this one:
df['St'] = df['St'].groupby(df.index // N).sum()
How can I use groupby(), or any other approach, to achieve this?
CodePudding user response:
Use GroupBy.agg instead of sum, and assign the result to a new DataFrame rather than to a single column:
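import pandas as pd

# Convert every column to numeric; values that cannot be parsed become NaN
df = df.apply(pd.to_numeric, errors='coerce')
N = 5
# Integer-divide the integer index by N to label blocks of N rows,
# then average the measurements and sum 'St' within each block
df = df.groupby(df.index // N).agg({'Temp (C)': 'mean',
                                    'Rel Hum (%)': 'mean',
                                    'Wind Spd (km/h)': 'mean',
                                    'St': 'sum'})
print(df)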
   Temp (C)  Rel Hum (%)  Wind Spd (km/h)    St
0       8.1         79.2              4.0  10.0
1       8.1         79.0              4.0  13.0
2       8.1         79.4              4.0  14.0
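For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same idea. The helper name `aggregate_blocks` and its `sum_cols` parameter are my own and purely illustrative, not part of pandas. It assumes every column not listed in `sum_cols` should be averaged, and it groups by row position (`np.arange(len(frame)) // n`) instead of `df.index // N`, so it should also work when the index is not a plain 0, 1, 2, ... range.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Temp (C)": [8.1] * 15,
    "Rel Hum (%)": [81, 79, 78, 80, 78, 78, 81, 78, 80, 78, 77, 81, 82, 78, 79],
    "Wind Spd (km/h)": [4] * 15,
    "St": [0.0, 2.0, 3.0, 3.0, 2.0, 3.0, 3.0, 2.0, 3.0, 2.0, 3.0, 3.0, 2.0, 3.0, 3.0],
})

N = 5


def aggregate_blocks(frame, n, sum_cols=("St",)):
    """Hypothetical helper: collapse every n consecutive rows into one row.

    Columns named in sum_cols are summed; all other columns are averaged
    (an assumption made for this sketch).
    """
    # Block label per row, based on row position rather than index values
    labels = np.arange(len(frame)) // n
    # Build the column -> aggregation mapping instead of typing it by hand
    how = {col: ("sum" if col in sum_cols else "mean") for col in frame.columns}
    return frame.groupby(labels).agg(how)


result = aggregate_blocks(df, N)
print(result)
```

This also shows why the attempt in the question did not work: `df['St'].groupby(df.index // N).sum()` returns only one value per block (three values here), and assigning that back to `df['St']` aligns on the index, so only rows 0 to 2 receive a value and the remaining rows become NaN. Keeping the aggregated result as its own DataFrame avoids that.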