I am reading a .csv data file using pd.read_csv, and these are the first 5 rows of my global dataframe (which contains thousands of rows):
                 time  id  time_offset
0 2017-12-01 21:00:00   0          -60
1 2017-12-01 21:01:00   0          -59
2 2017-12-01 21:02:00   0          -58
3 2017-12-01 21:03:00   0          -57
4 2017-12-01 21:04:00   0          -56
I'm not very good at manipulating dates in Python and I haven't found how to do this manipulation:
- create in my dataframe a new hour column from the existing time column, containing only the hours:minutes:seconds data, which should be: 21:00:00, 21:01:00, 21:02:00, etc...
- then create another column seconds from the newly created hour, containing the number of seconds elapsed since time 0, which should be: 75600 (calculated as 21x3600), 75601 (calculated as 21x3600 + 1), etc...
Any help in sorting this out would be much appreciated.
CodePudding user response:
You can try:
# convert `time` column to datetime (if necessary):
df["time"] = pd.to_datetime(df["time"])
df["hour"] = df["time"].dt.time
df["seconds"] = (
    df["time"].dt.hour * 60 * 60
    + df["time"].dt.minute * 60
    + df["time"].dt.second
)
print(df)
Prints:
                 time  id  time_offset      hour  seconds
0 2017-12-01 21:00:00   0          -60  21:00:00    75600
1 2017-12-01 21:01:00   0          -59  21:01:00    75660
2 2017-12-01 21:02:00   0          -58  21:02:00    75720
3 2017-12-01 21:03:00   0          -57  21:03:00    75780
4 2017-12-01 21:04:00   0          -56  21:04:00    75840
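The same result can probably also be reached with timedelta arithmetic instead of combining the hour, minute and second fields by hand. The following is only a minimal sketch: the three sample rows are copied from the question (the real data would come from pd.read_csv), and the helper name since_midnight is made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question; real data would come from pd.read_csv.
df = pd.DataFrame(
    {
        "time": ["2017-12-01 21:00:00", "2017-12-01 21:01:00", "2017-12-01 21:02:00"],
        "id": [0, 0, 0],
        "time_offset": [-60, -59, -58],
    }
)
df["time"] = pd.to_datetime(df["time"])

# Hypothetical helper: each timestamp minus midnight of its own date.
since_midnight = df["time"] - df["time"].dt.normalize()

# Whole seconds elapsed since midnight.
df["seconds"] = since_midnight.dt.total_seconds().astype(int)

# Time of day as a string such as "21:00:00".
df["hour"] = df["time"].dt.strftime("%H:%M:%S")

print(df)
```

Here the hour column holds strings rather than datetime.time objects as .dt.time would give. Which is preferable depends on what is done with the column afterwards.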
CodePudding user response:
Setting the datetime series as the index is typically useful. pd.to_datetime() converts it to a usable format.
df.index = pd.to_datetime(df['time'])
df = df.drop('time', axis=1)  # drop() returns a new dataframe, so assign the result
- You can use the strftime function (format codes are listed at https://strftime.org/):
df['time'] = df.index.strftime("%H:%M:%S")
- Since df.index[0] is the very first time, you can subtract it and use the .seconds attribute:
df['seconds since'] = (df.index - df.index[0]).seconds
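Two details of this subtraction approach are worth checking against your data. The result is measured from the first row, not from midnight. Also, .seconds returns only the seconds component of each timedelta (0 to 86399), so it wraps once the data spans more than a day. The sketch below uses made-up timestamps, with the third one deliberately placed on the next day, to compare .seconds with total_seconds().

```python
import pandas as pd

# Made-up timestamps; the last one falls on the following day on purpose.
idx = pd.to_datetime(
    ["2017-12-01 21:00:00", "2017-12-01 21:01:00", "2017-12-02 21:02:00"]
)

# Elapsed time since the first timestamp, as a TimedeltaIndex.
elapsed = idx - idx[0]

# Seconds component only: the whole day in the last entry is not counted.
seconds_component = elapsed.seconds.tolist()

# Full elapsed duration in seconds, including whole days.
total_seconds = elapsed.total_seconds().tolist()

# Seconds since midnight of each timestamp's own date, as the question asks.
since_midnight = (idx - idx.normalize()).total_seconds().astype(int).tolist()

print(seconds_component)
print(total_seconds)
print(since_midnight)
```

If every row falls within a single day of the first one, .seconds and total_seconds() agree apart from the integer versus float type.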