I have 3 DataFrames in Python Pandas like below:
df1 (ID - int, TIME - datetime)
ID | TIME
----|------
123 | 2022-07-18
333 | 2022-07-22
444 | 2022-07-19
... | ...
df2 (both int)
ID | VALUE
----|------
123 | 556
333 | 12
444 | 88
... | ...
df3 (both int)
ID | TIME
----|------
123 | 11114
333 | 2
444 | 23
... | ...
And I need to make a merge:
- if TIME in df1 is < 2022-07-19 merge df1 with df2
- if TIME in df1 is >= 2022-07-19 merge df1 with df3
So as a result I need something like below:
ID | TIME | VALUE
----|------------|-------
123 | 2022-07-18 | 556
333 | 2022-07-22 | 2
444 | 2022-07-19 | 23
... | ... | ...
How can I do that in Python Pandas? OF course merge by ID col :)
CodePudding user response:
If there are same index, same order ID
in all 3 DataFrames use numpy.where
:
df1['VALUE'] = np.where(df1['TIME'] < '2022-07-19', df2['VALUE'], df3['VALUE'])
If not use Series.map
:
df1['VALUE'] = np.where(df1['TIME'] < '2022-07-19',
df1['ID'].map(df2.set_index('ID')['VALUE']),
df1['ID'].map(df3.set_index('ID')['VALUE']))