Modify time column to exclusive and inclusive time in Pandas DataFrame

Time:11-24

I have the following DataFrame of individuals and the time of an event.

id    time
1     0
2     0
3     0
4     0
2     1
3     1
1     2
4     2
1     3
2     3
1     4
2     4
3     4
4     4

I want to add a column of left-exclusive time points (start), holding the time of the same individual's previous event, or 0 if there is none. The right-inclusive time points (stop) are simply the existing time column.

id    start     stop
1     0         0
2     0         0
3     0         0
4     0         0
2     0         1
3     0         1
1     0         2
4     0         2
1     2         3
2     1         3
1     3         4
2     3         4
3     1         4
4     2         4

Is there a straightforward function that accomplishes this?

CodePudding user response:

Use DataFrameGroupBy.shift inside DataFrame.insert to add the new column in the second position, then rename the time column:

df.insert(1, 'start', df.groupby('id')['time'].shift(fill_value=0))
df = df.rename(columns={'time':'stop'})
print(df)
    id  start  stop
0    1      0     0
1    2      0     0
2    3      0     0
3    4      0     0
4    2      0     1
5    3      0     1
6    1      0     2
7    4      0     2
8    1      2     3
9    2      1     3
10   1      3     4
11   2      3     4
12   3      1     4
13   4      2     4
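
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question; the intermediate variable name start is only for illustration.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "id":   [1, 2, 3, 4, 2, 3, 1, 4, 1, 2, 1, 2, 3, 4],
    "time": [0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 4, 4],
})

# Previous event time within each id; the first event of an id gets 0
start = df.groupby("id")["time"].shift(fill_value=0)

# Place start in the second position, then rename time to stop
df.insert(1, "start", start)
df = df.rename(columns={"time": "stop"})
print(df)
```

Note that DataFrame.insert modifies df in place, while rename returns a new DataFrame, which is why only the second call is assigned back.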

CodePudding user response:

To get the previous value of every id, you want to group by 'id' and retrieve the previous value by using shift as your new column 'start':

df['start'] = df.groupby('id').time.shift(1, fill_value=0)

    id  time  start
0    1     0      0
1    2     0      0
2    3     0      0
3    4     0      0
4    2     1      0
5    3     1      0
6    1     2      0
7    4     2      0
8    1     3      2
9    2     3      1
10   1     4      3
11   2     4      3
12   3     4      1
13   4     4      2

Then you might want to rename your 'time' column to 'end':

df.rename({'time':'end'}, axis=1, inplace=True)

If you want start to come before end, reorder the columns like this:

df = df[['id', 'start', 'end']]
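
One caveat: shift takes the previous row within each group, so it only gives the previous event if the rows are already ordered by time for each id, as they are in the question. If that is not guaranteed, sorting first should help. A rough sketch on made-up unsorted data (the events frame below is hypothetical, not from the question):

```python
import pandas as pd

# Hypothetical unsorted input: rows are not in time order
events = pd.DataFrame({
    "id":   [1, 2, 1, 2, 1],
    "time": [3, 1, 0, 0, 2],
})

# Sort by id and time so that "previous row" means "previous event"
events = events.sort_values(["id", "time"]).reset_index(drop=True)

# Previous event time per id, 0 for the first event of each id
events["start"] = events.groupby("id")["time"].shift(fill_value=0)

# Rename and reorder to id, start, stop
events = events.rename(columns={"time": "stop"})[["id", "start", "stop"]]
print(events)
```

This changes the row order to be grouped by id. If the original order matters, sort the result back by whatever key defines it.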
