Home > other >  pandas: add row if column respects condition
pandas: add row if column respects condition

Time:11-05

dataframe

id            beg                       end     
guest1     2021-10-21 17:00:00       2021-10-21 18:00:00            
guest2     2021-10-21 10:00:00       2021-10-22 10:00:00       

I want to test

if the client appears exactly one time and beg and end are not the same day then do like (guest1). col1 gets 2021-10-21 00:00:00 and col2 gets beg

if the client appears exactly one time and beg and end are the same day then do like (guest2).

id               beg                   end                        col1                col2
guest1     2021-10-21 17:00:00   2021-10-21 18:00:00        2021-10-21 00:00:00   2021-10-21 17:00:00     
guest1     2021-10-21 17:00:00   2021-10-21 18:00:00        2021-10-21 18:00:00   2021-10-21 23:59:59
guest2     2021-10-21 10:00:00    2021-10-22 11:00:00      2021-10-21 00:00:00    2021-10-21 10:00:00

CodePudding user response:

You can add another column to see the difference in days and then filter out what you need.

data = [('guest1', '2021-10-21 17:00:00', '2021-10-21 18:00:00' ), ('guest2', '2021-10-21 17:00:00', '2021-10-22 18:00:00' )]
cols = ['id', 'beg', 'end' ]
pdf = pd.DataFrame(data, columns=cols)
pdf[['beg', 'end']] = pdf[['beg', 'end']].apply(pd.to_datetime)
pdf['n_days'] = (pdf['end'] - pdf['beg']).dt.days
       id                 beg                 end  n_days
0  guest1 2021-10-21 17:00:00 2021-10-21 18:00:00       0
1  guest2 2021-10-21 17:00:00 2021-10-22 18:00:00       1

CodePudding user response:

You can use value_counts, callable indexing and isin to filter your data to only include ones with a single row and use another callable index to filter your data according to your greater than 1 day condition.

df[df['id'].isin(df['id'].value_counts().loc[lambda x: x==1].index)].loc[lambda x: x['end'] - x['beg'] > pd.Timedelta('1 days')]
  • Related