Home > Mobile >  Select Rows Based on Time Difference [Before or After] In Columns
Select Rows Based on Time Difference [Before or After] In Columns

Time:08-18

I have the following dataset of students taking 2 different exams:

df = pd.DataFrame({'student': 'A B C D E'.split(),
                  'sat_date': [datetime.datetime(2013,4,1),datetime.datetime(2013,5,1),
                               datetime.datetime(2013,5,2),datetime.datetime(2013,7,15),
                               datetime.datetime(2013,8,1)],
                   'act_date': [datetime.datetime(2013,4,12),datetime.datetime(2013,5,2),
                               datetime.datetime(2013,4,12), datetime.datetime(2013,7,1),
                               datetime.datetime(2013,8,2)]})

print(df)

student   sat_date   act_date
0       A 2013-04-01 2013-04-12
1       B 2013-05-01 2013-05-02
2       C 2013-05-02 2013-04-12
3       D 2013-07-15 2013-07-01
4       E 2013-08-01 2013-08-02


I want to select those students whose two exams are 10 days apart from each other in either ways (any exam can be taken before or after).

I am trying Timedelta, but I'm not sure if it's optimal.

df[(df['sat_date'] >= df['act_date']   pd.Timedelta(days=10)) | (df['sat_date'] <= df['act_date'] - pd.Timedelta(days=10))]

Desired Output:

student sat_date    act_date
0   A   2013-04-01  2013-04-12
2   C   2013-05-02  2013-04-12
3   D   2013-07-15  2013-07-01

Is there any better way of getting the desired output? Any suggestions would be appreciated. Thanks!

CodePudding user response:

I would probably look at the absolute value of the difference between the two dates is greater to or equal than 10.

df.loc[abs((df['sat_date']-df['act_date']).dt.days).ge(10)]

CodePudding user response:

Try as follows:

result = df.loc[abs(df.sat_date - df.act_date).dt.days>=10]
print(result)

  student   sat_date   act_date
0       A 2013-04-01 2013-04-12
2       C 2013-05-02 2013-04-12
3       D 2013-07-15 2013-07-01

Or maybe nicer:

df.loc[abs(df.sat_date - df.act_date).ge(pd.Timedelta(days=10))]

CodePudding user response:

You can filter like this:

df[abs((df['sat_date'] - df['act_date']).days) == 10]
  • Related