Home > Enterprise >  How to merge 3 DataFrames under certain conditions about date value in one of them in Python Pandas?
How to merge 3 DataFrames under certain conditions about date value in one of them in Python Pandas?

Time:08-20

I have 3 DataFrames in Python Pandas like below:

df1 (ID - int, TIME - datetime)

ID  | TIME
----|------
123 | 2022-07-18
333 | 2022-07-22
444 | 2022-07-19
... | ...

df2 (both int)

ID  | VALUE
----|------
123 | 556 
333 | 12  
444 | 88  
... | ...

df3 (both int)

ID  | TIME
----|------
123 | 11114 
333 | 2
444 | 23 
... | ...

And I need to make a merge:

  • if TIME in df1 is < 2022-07-19 merge df1 with df2
  • if TIME in df1 is >= 2022-07-19 merge df1 with df3

So as a result I need something like below:

ID  | TIME       | VALUE
----|------------|-------
123 | 2022-07-18 | 556
333 | 2022-07-22 | 2
444 | 2022-07-19 | 23
... | ...        | ...

How can I do that in Python Pandas? OF course merge by ID col :)

CodePudding user response:

If there are same index, same order ID in all 3 DataFrames use numpy.where:

df1['VALUE'] = np.where(df1['TIME'] < '2022-07-19', df2['VALUE'], df3['VALUE'])

If not use Series.map:

df1['VALUE'] = np.where(df1['TIME'] < '2022-07-19', 
                        df1['ID'].map(df2.set_index('ID')['VALUE']), 
                        df1['ID'].map(df3.set_index('ID')['VALUE']))
  • Related