Convert a Python dataframe date column in seconds

Time:12-22

I am reading a .csv data file using pd.read_csv and I get these first 5 rows from my global dataframe (containing thousands of rows):

    time                   id   time_offset
0   2017-12-01 21:00:00     0   -60
1   2017-12-01 21:01:00     0   -59
2   2017-12-01 21:02:00     0   -58
3   2017-12-01 21:03:00     0   -57
4   2017-12-01 21:04:00     0   -56

I'm not very good at manipulating dates in Python and I haven't found how to do this manipulation:

  1. create in my dataframe a new hour column from the existing time column, containing only the hours:minutes:seconds data, which should be: 21:00:00, 21:01:00, 21:02:00, etc...
  2. then create another column seconds from the newly created hour column, containing the number of seconds elapsed since time 0, which should be: 75600 (calculated as 21x3600), 75601 (calculated as 21x3600 + 1), etc...

Any help in sorting this out would be much appreciated.

CodePudding user response:

You can try:

# convert `time` column to datetime (if necessary):
df["time"] = pd.to_datetime(df["time"])

df["hour"] = df["time"].dt.time
# seconds since midnight = hours * 3600 + minutes * 60 + seconds
df["seconds"] = (
    df["time"].dt.hour * 60 * 60
    + df["time"].dt.minute * 60
    + df["time"].dt.second
)
print(df)

Prints:

                 time  id  time_offset      hour  seconds
0 2017-12-01 21:00:00   0          -60  21:00:00    75600
1 2017-12-01 21:01:00   0          -59  21:01:00    75660
2 2017-12-01 21:02:00   0          -58  21:02:00    75720
3 2017-12-01 21:03:00   0          -57  21:03:00    75780
4 2017-12-01 21:04:00   0          -56  21:04:00    75840
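
As an alternative sketch (not part of the answer above), the same seconds-since-midnight value can be obtained by subtracting each row's midnight from its timestamp. The sample rows are copied from the question and loaded from an in-memory string here; a real file would be read with pd.read_csv on its path.

```python
import io

import pandas as pd

# Sample rows copied from the question; the in-memory CSV is only for illustration.
csv_text = """time,id,time_offset
2017-12-01 21:00:00,0,-60
2017-12-01 21:01:00,0,-59
2017-12-01 21:02:00,0,-58
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["time"])

# Midnight of each row's own day, then the elapsed time since that midnight.
midnight = df["time"].dt.normalize()
df["seconds"] = (df["time"] - midnight).dt.total_seconds().astype(int)

print(df["seconds"].tolist())  # [75600, 75660, 75720]
```

This avoids spelling out the hour/minute/second arithmetic, at the cost of an intermediate timedelta column.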

CodePudding user response:

Setting the datetime series as the index is often useful. pd.to_datetime() converts the column to a proper datetime type first.

df.index = pd.to_datetime(df['time'])
df = df.drop('time', axis=1)  # drop returns a new frame, so assign it back

  1. You can use the strftime function (format codes: https://strftime.org/):

df['time'] = df.index.strftime("%H:%M:%S")

  2. Since df.index[0] is the very first time, you can subtract it and use the .seconds attribute (this counts from the first row rather than from midnight):

df['seconds since'] = (df.index - df.index[0]).seconds
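
One caveat when subtracting the first timestamp: .seconds holds only the seconds component of each timedelta and drops whole days, while .total_seconds() gives the full elapsed time. A minimal sketch, using hypothetical timestamps that span more than one day (not taken from the question's data):

```python
import pandas as pd

# Hypothetical timestamps; the last one falls on the following day.
idx = pd.to_datetime(
    ["2017-12-01 21:00:00", "2017-12-01 21:01:00", "2017-12-02 21:02:00"]
)
elapsed = idx - idx[0]  # TimedeltaIndex of offsets from the first row

print(list(elapsed.seconds))          # [0, 60, 120] -> the whole day is dropped
print(list(elapsed.total_seconds()))  # [0.0, 60.0, 86520.0] -> full elapsed seconds
```

For data confined to a single day, as in the question, both give the same numbers.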