remove timezone from timestamp column of pandas dataframe

Time:09-20

I load the data and have pandas parse the timestamp column as datetime:

df = pd.read_csv('test.csv', parse_dates=['timestamp'])
df


   user                  timestamp  speed
0     2  2016-04-01 01:06:26+01:00   9.76
1     2  2016-04-01 01:06:26+01:00   5.27
2     2  2016-04-01 01:06:26+01:00   8.12
3     2  2016-04-01 01:07:53+01:00   8.81

I want to remove the time zone information from the timestamp column, but my attempt raises an error:


df['timestamp'].tz_convert(None)

TypeError: index is not a valid DatetimeIndex or PeriodIndex

CodePudding user response:

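Series.tz_convert operates on the index, not on the values, which is why you get that error. Use the .dt accessor instead and assign the result back. For this to work, the column must already have a datetime dtype:

df['timestamp'] = df['timestamp'].dt.tz_localize(None)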
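
A minimal sketch of how the two ways of dropping the time zone differ. The sample values are hypothetical and only mirror the question's data: tz_localize(None) keeps the local wall-clock time, whereas tz_convert(None) converts to UTC first and then drops the time zone.

```python
import pandas as pd

# Hypothetical sample data mirroring the timestamps in the question.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2016-04-01 01:06:26+01:00",
                                 "2016-04-01 01:07:53+01:00"]),
    "speed": [9.76, 8.81],
})

# Drop the offset and keep the local wall-clock time (01:06:26).
local_naive = df["timestamp"].dt.tz_localize(None)

# Convert to UTC first, then drop the offset (00:06:26).
utc_naive = df["timestamp"].dt.tz_convert(None)

print(local_naive.iloc[0])
print(utc_naive.iloc[0])
```

Which of the two is appropriate depends on whether the rest of your data is in local time or in UTC.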

CodePudding user response:

Given strings in your csv like "2016-04-01 01:06:26+01:00", I can think of the following options:

import pandas as pd

# Option 1: parse with the offset, then drop it.
# Will only work if *all* your timestamps contain "+hh:mm".
df = pd.read_csv('test.csv', parse_dates=['timestamp'])
df['timestamp'] = df.timestamp.dt.tz_localize(None)

print(df.timestamp.dtype)
datetime64[ns]

# Option 2: read as strings and cut off the offset before parsing.
df = pd.read_csv('test.csv')
df['timestamp'] = pd.to_datetime(df.timestamp.str.split('+', expand=True)[0])

print(df.timestamp.dtype)
datetime64[ns]

# Option 3: cut off the offset while reading the file.
# Note: date_parser is deprecated as of pandas 2.0.
df = pd.read_csv('test.csv', parse_dates=['timestamp'],
                 date_parser=lambda x: pd.to_datetime(x.split('+')[0]))

print(df.timestamp.dtype)
datetime64[ns]
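
The string-based options above split on one fixed separator, so they would not handle negative offsets. As a hedged variation (not part of the original answer), the trailing offset can be stripped with a regular expression before parsing. The inline CSV text and the "-05:00" row are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content; the second row uses a negative offset on purpose.
csv_text = """user,timestamp,speed
2,2016-04-01 01:06:26+01:00,9.76
2,2016-04-01 01:07:53-05:00,8.81
"""

df = pd.read_csv(io.StringIO(csv_text))

# Remove a trailing "+hh:mm" or "-hh:mm" and parse what is left
# as naive local wall-clock time.
stripped = df["timestamp"].str.replace(r"[+-]\d{2}:\d{2}$", "", regex=True)
df["timestamp"] = pd.to_datetime(stripped)

print(df["timestamp"].dt.tz)
```

This keeps each row's local time and discards the offset entirely, so rows from different time zones are no longer comparable as instants.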