Home > OS >  Select rows together with their previous rows given a condition for a pandas dataframe
Select rows together with their previous rows given a condition for a pandas dataframe

Time:04-07

Given a DataFrame, where the "time" attribute is a Python timedelta object. The index attribute is internal to the pandas dataframe, I simply show it to make it easier to understand the problem.

index ID time country
0 0 NaT ITL
1 0 0 days 01:00:00 USA
2 0 NaT ITL
3 0 NaT ESP
4 0 2 days 00:05:00 USA
5 1 NaT ITL
6 1 0 days 03:35:00 USA
7 1 NaT ESP
8 2 NaT USA
9 3 NaT FRA
10 3 2 days 04:14:10 ESP
11 3 1 days 03:35:00 ITL

I divide the functionality in 2 steps:

  1. Select the rows whose value of the attribute "time" is different from NaT.
index ID time country
1 0 0 days 01:00:00 USA
4 0 2 days 00:05:00 USA
6 1 0 days 03:35:00 USA
10 3 2 days 04:14:10 ESP
11 3 1 days 03:35:00 ITL
  1. Select for each row of the above subset, its previous index row as long as it belongs to the same ID.
index ID time country
0 0 NaT ITL
1 0 0 days 01:00:00 USA
3 0 NaT ESP
4 0 2 days 00:05:00 USA
5 1 NaT ITL
6 1 0 days 03:35:00 USA
9 3 NaT FRA
10 3 2 days 04:14:10 ESP
11 3 1 days 03:35:00 ITL

The DataFrame I want is the one above (without the index column, which is internal to pandas). I have split the functionality into 2 steps to make the problem understandable, however I need to get to step 2, it is not necessary that the functionality of step 1 comes implicit.

Thanks for your help in advance.

CodePudding user response:

Chain masks for testing non missing values with shifted values per groups by DataFrameGroupBy.shift with | for bitwise OR in boolean indexing:

m = df['time'].notna()
df = df[m | m.groupby(df['ID']).shift(-1,fill_value=False)]
print (df)
    ID            time country
0    0             NaT     ITL
1    0 0 days 01:00:00     USA
3    0             NaT     ESP
4    0 2 days 00:05:00     USA
5    1             NaT     ITL
6    1 0 days 03:35:00     USA
9    3             NaT     FRA
10   3 2 days 04:14:10     ESP
11   3 1 days 03:35:00     ITL
  • Related