I am trying to add a Week # Column to df1. I want to assign week numbers based on date ranges found in df2.
df1
WorkDate
2022-05-03
2022-05-16
2022-05-24
df2
Week # Week Start Week End
1 2022-05-01 2022-05-07
2 2022-05-08 2022-05-14
3 2022-05-15 2022-05-21
4 2022-05-22 2022-05-28
5 2022-05-29 2022-06-04
Expected Results
final_df
WorkDate Week #
2022-05-03 1
2022-05-16 3
2022-05-24 4
CodePudding user response:
Try pd.merge_asof
(Note: the dataframes must be sorted):
df1["WorkDate"] = pd.to_datetime(df1["WorkDate"])
df2["Week Start"] = pd.to_datetime(df2["Week Start"])
x = pd.merge_asof(df1, df2, left_on="WorkDate", right_on="Week Start")
print(x[["WorkDate", "Week #"]])
Prints:
WorkDate Week #
0 2022-05-03 1
1 2022-05-16 3
2 2022-05-24 4
CodePudding user response:
One option is with conditional_join from pyjanitor:
# pip install pyjanitor
import pandas as pd
import janitor
df1["WorkDate"] = pd.to_datetime(df1["WorkDate"])
df2["Week Start"] = pd.to_datetime(df2["Week Start"])
df2['Week End'] = pd.to_datetime(df2['Week End'])
(df1
.conditional_join(
df2,
('WorkDate', 'Week Start', '>='),
('WorkDate', 'Week End', '<='))
)
WorkDate Week # Week Start Week End
0 2022-05-03 1 2022-05-01 2022-05-07
1 2022-05-16 3 2022-05-15 2022-05-21
2 2022-05-24 4 2022-05-22 2022-05-28