Home > Software design >  How to merge rows with a specific id with the previous row in pandas?
How to merge rows with a specific id with the previous row in pandas?

Time:02-23

I have a Timeseries dataframe with different values:

ID  TimeString            value1  value2   StampDif
0   2021-02-10 17:30:39   0.5     5.2      NaT
1   2021-02-10 17:33:39   0.7     5.5      0 days 00:03:00
2   2021-02-10 17:36:40   0.9     5.5      0 days 00:03:01
3   2021-02-10 17:39:40   0.6     5.4      0 days 00:03:00
4   2021-02-10 17:42:40   0.8     5.0      0 days 00:00:01
.
.
.

Now I want to merge all rows that have a Stampdif of 1 second with the previous row using the mean. I have tried:

secdf = df[df["StampDif"] <= pd.Timedelta(1, "sec")]

for idx, row in secdf.iterrows():
    df.iloc[idx-1, dfnanpv.columns != ["TimeString", "StampDif"]] = df.iloc[idx-1:idx 1].mean(axis=0)

But it throws and error: 'Shapes must match', (25,), (2,)

As a result, I want the following for example in row 3:

ID  TimeString            value1  value2   StampDif
3   2021-02-10 17:39:40   0.7     5.2      0 days 00:03:00

CodePudding user response:

Try this:

exclude_columns = ['col1', 'col2']
new_df = df.groupby((df['StampDif'] > pd.Timedelta(1, 'second')).cumsum()).agg({col: 'mean' for col in df.columns.difference(exclude_columns)}).reset_index(drop=True)

Output:

>>> new_df
           TimeString  value1  value2
0 2021-02-10 17:30:39     0.5     5.2
1 2021-02-10 17:33:39     0.7     5.5
2 2021-02-10 17:36:40     0.9     5.5
3 2021-02-10 17:41:10     0.7     5.2
  • Related