Home > OS >  Adding date/time using Pandas for burst datasets
Adding date/time using Pandas for burst datasets

Time:12-15

I have a logged dataset that is in bursts. It is set to 16Hz, so records 960 records a minute, then pauses for 300 seconds, then bursts again. So in my dataframe it is recorded as:

Sample data image

1 1 data data data

1 2 data data data

1 3 data data data

~~

1 960 data data data

2 1 data data data

2 2 data data data

And so on...

If I know the start time and date, can I add a new column and calculate a time/date for each event recorded in Pandas? Hope that all makes sense...

Successfully read file in Pandas using:

dataframe = pd.read_csv('filename.dat', sep='\s ', header=None)

CodePudding user response:

Sample

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'cycle': [i for l in [[i for _ in range(0,960)] for i in range(1,21)] for i in l],
    'record': [i for i in range(1,961)]*20
})

print(df)
Output:
       cycle  record
0          1       1
1          1       2
2          1       3
3          1       4
4          1       5
...      ...     ...
19195     20     956
19196     20     957
19197     20     958
19198     20     959
19199     20     960

[19200 rows x 2 columns]

Solution

You can use pd.date_range and pd.Timedelta to achieve your goals:

startdate = pd.to_datetime('2020-12-13 23:30:02')
period = lambda x: pd.date_range(x, x   pd.Timedelta(1,'min'), periods=960)

dr = period(startdate)
while(len(dr) < len(df)):
    followdate = dr[-1]   pd.Timedelta(300,'sec')
    dr = dr.append(period(followdate))

df['time'] = dr[0:len(df)]
print(df)
Output:
       cycle  record                          time
0          1       1 2020-12-13 23:30:00.000000000
1          1       2 2020-12-13 23:30:00.062565172
2          1       3 2020-12-13 23:30:00.125130344
3          1       4 2020-12-13 23:30:00.187695516
4          1       5 2020-12-13 23:30:00.250260688
...      ...     ...                           ...
19195     20     956 2020-12-14 01:24:59.749739311
19196     20     957 2020-12-14 01:24:59.812304483
19197     20     958 2020-12-14 01:24:59.874869655
19198     20     959 2020-12-14 01:24:59.937434827
19199     20     960 2020-12-14 01:25:00.000000000

[19200 rows x 3 columns]
  • Related