I have a DataFrame of "positions" and the "date they came to office":
position cameToOfficeDate
CEO 2020-06-01
CEO 2021-01-01
CEO 2021-02-02
Slave 2020-01-01
I want to map the position
to employee's names for different cameToOfficeDate
but based on the commencementDate
of the position
s too. For example, the CEO was Bill before 2021 and was Mark after 2021.
position commenceDate Name
CEO 2020-01-01 Bill
CEO 2021-01-01 Mark
Slave 2020-01-01 Elon
So the output should be:
position cameToOfficeDate Name
CEO 2020-06-01 Bill
CEO 2021-01-01 Mark
CEO 2021-02-02 Mark
Slave 2020-01-01 Elon
I am wondering how to create the mapping table structure. I tried using a dictionary but it doesn't allow comparing whether the cameToOfficeDate
date is smaller than or bigger than the commencementDate
of the position
. Tried joining the table together on position too and it doesn't work.
CodePudding user response:
You could use merge_asof
. Since merge_asof
expects the keys to be sorted, we convert the dates to datetime objects and sort; then merge
.
df1['cameToOfficeDate'] = pd.to_datetime(df1['cameToOfficeDate'])
df2['commenceDate'] = pd.to_datetime(df2['commenceDate'])
out = (pd.merge_asof(df1.sort_values(by='cameToOfficeDate'),
df2.sort_values(by='commenceDate'),
left_on='cameToOfficeDate', right_on='commenceDate',
by='position')
.drop(columns='commenceDate')
.sort_values(by=['position','cameToOfficeDate']))
Output:
position cameToOfficeDate Name
1 CEO 2020-06-01 Bill
2 CEO 2021-01-01 Mark
3 CEO 2021-02-02 Mark
0 Slave 2020-01-01 Elon