I have a pandas DataFrame like the following:
Date time LifeTime1 LifeTime2 LifeTime3 LifeTime4 LifeTime5
2020-02-11 17:30:00 6 7 NaN NaN 3
2020-02-11 17:30:00 NaN NaN 3 3 NaN
2020-02-12 15:30:00 2 2 NaN NaN 3
2020-02-16 14:30:00 4 NaN NaN NaN 1
2020-02-16 14:30:00 NaN 7 NaN NaN NaN
2020-02-16 14:30:00 NaN NaN 8 2 NaN
Some rows share the same timestamp. Is it possible to add 1, 2, or 3 seconds to the 2nd, 3rd, and 4th occurrence of a duplicated timestamp? If a timestamp is unique, it should be left as is. If there are two identical timestamps, the first stays unchanged and the second gets 1 second added. If there are three, the first stays unchanged, the second gets 1 second added, and the third gets 2 seconds added. Is there an easy way to do this in pandas?
CodePudding user response:
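You can use groupby.cumcount combined with pandas.to_timedelta (with unit='s') to add incremental seconds to the duplicated rows. cumcount numbers the rows within each group of identical timestamps starting from 0, so the first occurrence is left unchanged: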
import pandas as pd
s = pd.to_datetime(df['Date time'])
# cumcount gives 0, 1, 2, ... within each group of equal timestamps
df['Date time'] = s + pd.to_timedelta(s.groupby(s).cumcount(), unit='s')
As a one-liner using the walrus operator (Python 3.8+):
df['Date time'] = ((s:=pd.to_datetime(df['Date time']))
                   + pd.to_timedelta(s.groupby(s).cumcount(), unit='s')
                  )
Output:
Date time LifeTime1 LifeTime2 LifeTime3 LifeTime4 LifeTime5
0 2020-02-11 17:30:00 6.0 7.0 NaN NaN 3.0
1 2020-02-11 17:30:01 NaN NaN 3.0 3.0 NaN
2 2020-02-12 15:30:00 2.0 2.0 NaN NaN 3.0
3 2020-02-16 14:30:00 4.0 NaN NaN NaN 1.0
4 2020-02-16 14:30:01 NaN 7.0 NaN NaN NaN
5 2020-02-16 14:30:02 NaN NaN 8.0 2.0 NaN
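To try this end to end, here is a minimal self-contained sketch. It rebuilds part of the sample frame from the question (only two of the LifeTime columns, to keep it short) and stores the per-row counter in a variable called offsets, a name introduced here purely for illustration, so the intermediate step can be inspected:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; LifeTime3..5 are omitted for brevity.
df = pd.DataFrame({
    "Date time": [
        "2020-02-11 17:30:00", "2020-02-11 17:30:00",
        "2020-02-12 15:30:00",
        "2020-02-16 14:30:00", "2020-02-16 14:30:00", "2020-02-16 14:30:00",
    ],
    "LifeTime1": [6, np.nan, 2, 4, np.nan, np.nan],
    "LifeTime2": [7, np.nan, 2, np.nan, 7, np.nan],
})

# Parse the strings into real timestamps.
s = pd.to_datetime(df["Date time"])

# Number each row within its group of identical timestamps, starting at 0.
offsets = s.groupby(s).cumcount()

# Turn the counter into seconds and add it to the original timestamps.
df["Date time"] = s + pd.to_timedelta(offsets, unit="s")

print(offsets.tolist())
print(df)
```

Because the grouping is done by value, the duplicated timestamps do not need to be adjacent; each occurrence is numbered in the order it appears in the frame.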