I have a pandas DataFrame with two columns, year and week_number.
df = pd.DataFrame({'year': [2019, 2020, 2021, 2022], 'week_number':[3,12,38,42]})
df
year week_number
0 2019 3
1 2020 12
2 2021 38
3 2022 42
I know I can apply something like the following to each row to convert the values to datetimes. However, is there a more efficient way to do this for large DataFrames and store the results in a third column?
import datetime
single_day = "2013-26"
converted_date = datetime.datetime.strptime(single_day + '-1', "%Y-%W-%w")
print(converted_date)
CodePudding user response:
I wouldn't say your way is inefficient, but if you want a fully vectorized way that needs no additional library and adds the result to your dataframe as a new column, this might be what you're looking for:
import pandas as pd

df = pd.DataFrame({'year': [2019, 2020, 2021, 2022], 'week_number': [3, 12, 38, 42]})
# The trailing '0' is the %w weekday (Sunday); use '1' for Monday, as in the question.
df['date'] = pd.to_datetime((df['year']*100 + df['week_number']).astype(str) + '0', format='%Y%W%w')
df
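As a sanity check, the vectorized expression can be compared against the row-wise strptime call from the question. This is only a sketch: it uses weekday '1' (Monday) so that both sides follow the question's convention, and the names keys, vectorized, row_wise and matches are made up for illustration.

```python
import datetime

import pandas as pd

df = pd.DataFrame({'year': [2019, 2020, 2021, 2022],
                   'week_number': [3, 12, 38, 42]})

# Build strings such as '2019031': year, zero-padded week, weekday 1 (Monday).
keys = (df['year'] * 100 + df['week_number']).astype(str) + '1'
vectorized = pd.to_datetime(keys, format='%Y%W%w')

# Row-wise reference, using the same strptime format as the question.
row_wise = pd.Series([
    datetime.datetime.strptime(f'{y}-{w}-1', '%Y-%W-%w')
    for y, w in zip(df['year'], df['week_number'])
])

# True when both approaches produce the same date for every row.
matches = bool((vectorized == row_wise).all())
print(matches)
```

If both approaches agree, matches is True, and the vectorized column can be assigned to df['date'] directly.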
CodePudding user response:
If you are on Python >= 3.8, use datetime.date.fromisocalendar. The same method is also available on datetime.datetime.
# 11 May 2022 is a Wednesday in the 19th week
>>> from datetime import date
>>> date.fromisocalendar(2022, 19, 3)
datetime.date(2022, 5, 11)
As a new column:
df['date'] = df[['year', 'week_number']].apply(lambda row: date.fromisocalendar(int(row['year']), int(row['week_number']), 1), axis=1)
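One thing to keep in mind is that fromisocalendar uses ISO 8601 week numbering, while the %W directive counts weeks from the first Monday of the year, so the two can differ by a week in some years. The following sketch compares the Monday of each week under both conventions for the sample data; the helper names monday_from_percent_w and monday_from_iso are made up for illustration.

```python
from datetime import date, datetime


def monday_from_percent_w(year, week):
    """Monday of the given week under the %W convention used by strptime."""
    return datetime.strptime(f'{year}-{week}-1', '%Y-%W-%w').date()


def monday_from_iso(year, week):
    """Monday of the given ISO 8601 week (requires Python >= 3.8)."""
    return date.fromisocalendar(year, week, 1)


# Print both results side by side for the rows in the question.
for year, week in [(2019, 3), (2020, 12), (2021, 38), (2022, 42)]:
    print(year, week, monday_from_percent_w(year, week), monday_from_iso(year, week))
```

For 2019 and week 3, for example, the %W convention gives 2019-01-21 while the ISO convention gives 2019-01-14, so it is worth checking which numbering the week_number column actually follows.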
CodePudding user response:
Use apply to loop over the rows (axis=1) with a lambda function that concatenates the two columns into a string, and then do exactly what you did above :) Perhaps this isn't the answer you were looking for, though, since you asked for the most efficient solution. However, it does the job!
from datetime import datetime
df['convert_date'] = df.apply(lambda x: datetime.strptime(f"{x.year}-{x.week_number}" + '-1', "%Y-%W-%w"), axis=1)