I have a data frame with many columns (30 to 40) holding a time series that runs continuously from minute 1 to minute 1440.
df
time colA colB colC.....
1 5 4 3
2 1 2 3
3 5 4 3
4 6 7 3
5 9 0 3
6 4 4 0
..
Now I want to sum every two consecutive rows into one, while keeping the 'time' value of the first row in each pair, so that the time index advances by the number of rows I am combining. The resulting data frame is:
df
time colA colB colC.......
1 6 6 6
3 11 11 6
5 13 4 3
..
Here each pair of rows is summed into one, and the time index also steps by 2 rows: 1, 3, 5, ... Is it possible to achieve that?
CodePudding user response:
One way is to add the odd and even rows for all columns (time included) and then fix time:
df_new = df[1::2].reset_index(drop=True) + df[::2].reset_index(drop=True)
df_new['time'] = df[::2]['time'].values
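A minimal runnable sketch of this approach, using the six sample rows from the question. The frame is built by hand here, and an even number of rows is assumed (with an odd count, the last row has no partner and its sums would come out as NaN):

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "time": [1, 2, 3, 4, 5, 6],
    "colA": [5, 1, 5, 6, 9, 4],
    "colB": [4, 2, 4, 7, 0, 4],
    "colC": [3, 3, 3, 3, 3, 0],
})

# Add every second row to the row before it; resetting the index
# makes the two halves line up position by position.
df_new = df[1::2].reset_index(drop=True) + df[::2].reset_index(drop=True)

# The time column was summed as well, so overwrite it with the
# time of the first row in each pair.
df_new["time"] = df[::2]["time"].values

print(df_new)
```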
CodePudding user response:
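Another way would be to group your data set every two rows and aggregate using sum on your 'colX' columns and mean on your time column. The mean of each pair of consecutive times ends in .5 (e.g. 1.5), so chaining astype(int) truncates it to the first time of the pair. Note that df.index // 2 relies on the default 0, 1, 2, ... index: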
d = {col: 'sum' for col in [c for c in df.columns if c.startswith('col')]}
df.groupby(df.index // 2).agg({**d,'time': 'mean'}).astype(int)
prints back:
colA colB colC time
0 6 6 6 1
1 11 11 6 3
2 13 4 3 5
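As a variation on the groupby idea (not what the answer above does), the time column could be aggregated with 'first' instead of mean plus astype(int), and the block size made a parameter. This is only a sketch: the function name combine_rows and the parameters n and time_col are hypothetical, and it groups by row position rather than by df.index so it does not depend on the index values.

```python
import numpy as np
import pandas as pd

def combine_rows(df, n=2, time_col="time"):
    """Sum every n consecutive rows, keeping the first time value of each block."""
    # Block label per row position: 0, 0, ..., 1, 1, ...
    groups = np.arange(len(df)) // n
    # Sum every column except the time column, which keeps its first value
    agg = {col: "sum" for col in df.columns if col != time_col}
    agg[time_col] = "first"
    out = df.groupby(groups).agg(agg)
    # Restore the original column order and a clean 0..k-1 index
    return out[df.columns.tolist()].reset_index(drop=True)

# Sample rows copied from the question
df = pd.DataFrame({
    "time": [1, 2, 3, 4, 5, 6],
    "colA": [5, 1, 5, 6, 9, 4],
    "colB": [4, 2, 4, 7, 0, 4],
    "colC": [3, 3, 3, 3, 3, 0],
})

pairs = combine_rows(df, n=2)
triples = combine_rows(df, n=3)
print(pairs)
print(triples)
```

With n=3 the same call combines three rows at a time, so the time column becomes 1, 4, ...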