Here is my data structure:
ids = 1
timezones = ['America/Los_Angeles', 'Europe/Paris', 'America/New_York']
utc_time = ['2020-09-03 19:36:18.534', '2020-09-03 19:36:18.534', '2020-09-03 19:36:18.534']
df = pd.DataFrame({'id':ids,
'timezone':timezones,
'utc_time':utc_time})
print(df)
id timezone utc_time
0 1 America/Los_Angeles 2020-09-03 19:36:18.534
1 1 Europe/Paris 2020-09-03 19:36:18.534
2 1 America/New_York 2020-09-03 19:36:18.534
How can I generate a new column added to this dataframe that converts the UTC time to the local time based on the timezone? I have tried a few solutions I've found but haven't been able to get them to work in this format whereby the timezone changes for each row.
Thanks in advance.
CodePudding user response:
This is an instance where I have to use .apply
through the rows to get what you need.
The basic steps are:
- convert your datestring to timestamps
- localize to UTC
- convert row-by-row
import pandas
ids = 1
timezones = ['America/Los_Angeles', 'Europe/Paris', 'America/New_York']
datestrings = ['2020-09-03 19:36:18.534', '2020-09-03 19:36:18.534', '2020-09-03 19:36:18.534']
df = pandas.DataFrame({
'id': ids,
'timezone': timezones,
'datestring': datestrings
}).assign(
# steps 1 & 2:
utc_time=lambda df: pandas.to_datetime(df["datestring"]).dt.tz_localize("UTC"),
# step 3:
local_time=lambda df: df.apply(lambda r: r["utc_time"].tz_convert(r["timezone"]), axis=1)
)
id timezone datestring utc_time local_time
1 America/Los_Angeles 2020-09-03 19:36:18.534 2020-09-03 19:36:18.534000 00:00 2020-09-03 12:36:18.534000-07:00
1 Europe/Paris 2020-09-03 19:36:18.534 2020-09-03 19:36:18.534000 00:00 2020-09-03 21:36:18.534000 02:00
1 America/New_York 2020-09-03 19:36:18.534 2020-09-03 19:36:18.534000 00:00 2020-09-03 15:36:18.534000-04:00
CodePudding user response:
from dateutil import tz
from datetime import datetime
import pandas as pd
def convert(row):
from_zone = tz.gettz('UTC')
to_zone = tz.gettz(row['timezone'])
utc = datetime.strptime(row['utc_time'], '%Y-%m-%d %H:%M:%S')
utc = utc.replace(tzinfo=from_zone)
target = utc.astimezone(to_zone)
return target.strftime('%m/%d/%Y %H:%M:%S %Z')
data = {'id': [1, 1, 1], 'timezone': ['America/Los_Angeles', 'Europe/Paris', 'America/New_York'], 'utc_time': ['2020-09-03 19:36:18', '2020-09-03 19:36:18', '2020-09-03 19:36:18']}
df = pd.DataFrame(data=data)
df['local_time'] = df.apply(lambda row: convert(row), axis=1)
print(df)
Result is as follows.
id timezone utc_time local_time
0 1 America/Los_Angeles 2020-09-03 19:36:18 09/03/2020 12:36:18
1 1 Europe/Paris 2020-09-03 19:36:18 09/03/2020 21:36:18
2 1 America/New_York 2020-09-03 19:36:18 09/03/2020 15:36:18