Home > Blockchain >  Pandas create new column based on previous and after row values
Pandas create new column based on previous and after row values

Time:02-16

I have a dataframe with time series data, as follows:

time                user   location   reference
2/15/2022 22:09      u1       1         no
2/15/2022 23:09      u1       2         yes
2/16/2022 06:09      u1       2         yes
2/16/2022 07:09      u1       1         no
2/16/2022 22:08      u2       1         no
2/16/2022 23:08      u2       4         yes
2/16/2022 23:58      u2       2         no

And what i want to do is create 2 new columns to extract the location visited before and after visiting the reference location, for each user, for all days Output:

time                user   location   reference   before_reference   after_reference 
2/15/2022 22:09      u1       1         no            yes               no
2/15/2022 23:09      u1       2         yes           no               no
2/16/2022 06:09      u1       2         yes           no               no
2/16/2022 07:09      u1       1         no            no              yes
2/16/2022 22:08      u2       1         no            yes              no
2/16/2022 23:08      u2       4         yes           no               no
2/16/2022 23:58      u2       2         no            no              yes

Note that the users may be mixed and not followed!

CodePudding user response:

IIUC, you could perform a groupby shift:

# first ensure values are sorted by time
df['time'] = pd.to_datetime(df['time'])
df = df.sort_values(by=['user', 'time'])

# use booleans
s = df['reference'].eq('yes')

# group by user
g = s.groupby(df['user'])

# compute the shifts and exclude reference == yes, then remap to yes/no
d = {True: 'yes', False: 'no'}
df['before_reference'] = (~s & g.shift(-1, fill_value=False)).map(d)
df['after_reference'] = (~s & g.shift(fill_value=False)).map(d)

output:

              time user  location reference before_reference after_reference
0  2/15/2022 22:09   u1         1        no              yes              no
1  2/15/2022 23:09   u1         2       yes               no              no
2  2/16/2022 06:09   u1         2       yes               no              no
3  2/16/2022 07:09   u1         1        no               no             yes
4  2/16/2022 22:08   u2         1        no              yes              no
5  2/16/2022 23:08   u2         4       yes               no              no
6  2/16/2022 23:58   u2         2        no               no             yes
  • Related