Home > database >  Pandas - Join two dataframes by groupby with conditions
Pandas - Join two dataframes by groupby with conditions

Time:06-28

I would like to left join table_B using table_A on column ID and groupby the number of records table A has when the date in table_A is smaller than the one in talbe_B. For example,

table_A = 
| ID | Date |
| ----| -------------| 
| 0   | 2022-03-01   |
| 1   | 2022-03-02   |
| 0   | 2022-05-02   |
| 1   | 2022-04-02   |


table_B = 

| ID | Date         |
| -- | ------------ | 
| 0   | 2022-05-04  |
| 1   | 2022-03-20  |
| 1   | 2022-05-02  |
| 2   | 2022-02-22  |

The resulting table should be:

| ID | Date       | number_of_records |
| ---| -----------| ----------------  |
| 0  | 2022-05-04 | 2                 |
| 1  | 2022-03-20 | 1                 |
| 1  | 2022-05-02 | 2                 |
| 2  | 2022-02-22 | 0                 |

import datetime
table_A = pd.DataFrame({'ID': [0,1,0,1], 
        'Date':[datetime.date(2022,3,1), datetime.date(2022,3,2), datetime.date(2022,5,2), datetime.date(2022,4,2)]})
table_B = pd.DataFrame({'ID': [0,1,1,2], 
        'Date':[datetime.date(2022,5,4), datetime.date(2022,3,20), datetime.date(2022,5,2), datetime.date(2022,2,22)]})
# The desired output
table_C = pd.DataFrame({'ID': [0,1,1,2], 
        'Date':[datetime.date(2022,5,4), datetime.date(2022,3,20), datetime.date(2022,5,2), datetime.date(2022,2,22)], 
        'number_of_records': [2,1,2,0]})

I tried to find the unique ID in table B and use groupby to count in table_A but I am not sure how to apply the condition "smaller than the one in talbe_B".

CodePudding user response:

table_A.Date = pd.to_datetime(table_A.Date)
table_B.Date = pd.to_datetime(table_B.Date)
table_A.merge(table_B, on='ID', how='right', suffixes=['_count', None])[lambda x: x.Date_count.lt(x.Date) | x.Date_count.isna()].groupby(['ID', 'Date'], as_index=False).count()

Output:

   ID       Date  Date_count
0   0 2022-05-04           2
1   1 2022-03-20           1
2   1 2022-05-02           2
3   2 2022-02-22           0
  • Related