Ticker & Date are index
Dataframe-A
Ticker | Date | Renue |
---|---|---|
AAPL | 2022-01-01 | 10M |
AAPL | 2022-04-01 | 10M |
MSFT | 2022-01-01 | 10M |
MSFT | 2022-04-01 | 10M |
Dataframe-B
Ticker | Date | Price |
---|---|---|
AAPL | 2022-01-02 | 11 |
AAPL | 2022-01-03 | 12 |
AAPL | 2022-04-01 | 15 |
AAPL | 2022-04-02 | 16 |
MSFT | 2022-01-01 | 20 |
MSFT | 2022-01-02 | 21 |
MSFT | 2022-01-03 | 22 |
MSFT | 2022-04-02 | 26 |
MSFT | 2022-04-03 | 26 |
Join the Dataframe by Ticker and Date
But AAPL without 2022-01-01 Price, use 2022-01-02
Result
Ticker | Date | Renue | Price |
---|---|---|---|
AAPL | 2022-01-01 | 10M | 11 |
AAPL | 2022-04-01 | 10M | 15 |
MSFT | 2022-01-01 | 10M | 20 |
MSFT | 2022-04-01 | 10M | 26 |
How to join the DatFrame to produce the result Dataframe?
Thanks a lot!
CodePudding user response:
You can use merge as given below,
>>> data1 = pd.DataFrame({"col":range(11, 17), # Create first
pandas DataFrame
... "x1":range(12, 1, - 2),
... "x2":["a", "b", "c", "d", "e", "f"],
... "x3":range(17, 11, - 1)})
>>> print(data1)
col x1 x2 x3
0 11 12 a 17
1 12 10 b 16
2 13 8 c 15
3 14 6 d 14
4 15 4 e 13
5 16 2 f 12
>>> data2 = pd.DataFrame({"col":range(15, 19), # Create second
pandas DataFrame
... "y1":["l", "k", "j", "h"],
... "y2":["x", "y", "y", "y"],
... "y3":range(18, 10, - 2)})
>>> print(data2) # Print second
pandas DataFrame
col y1 y2 y3
0 15 l x 18
1 16 k y 16
2 17 j y 14
3 18 h y 12
>>> data1
col x1 x2 x3
0 11 12 a 17
1 12 10 b 16
2 13 8 c 15
3 14 6 d 14
4 15 4 e 13
5 16 2 f 12
>>> data2
col y1 y2 y3
0 15 l x 18
1 16 k y 16
2 17 j y 14
3 18 h y 12
>>> data_join = pd.merge(data1, # Inner join
... data2,
... on = "col")
>>> data_join
col x1 x2 x3 y1 y2 y3
0 15 4 e 13 l x 18
1 16 2 f 12 k y 16
Taken from here.
CodePudding user response:
here is one way to do it
it is case of less than or equal join. We first merge the two DFs, on Ticker, and then filter(query) where the date among two matches or DF1 date is less than the second DF date. We then groupby Ticket, and date and take the first row for each group. Finally, reset index and drop the date from right column
df1.merge(df2, on='Ticker',
how='left',
suffixes=('','_y')
).query('Date<=Date_y').groupby(
['Ticker','Date']).first().reset_index().drop(columns='Date_y')
Ticker Date Renue Price
0 AAPL 2022-01-01 10M 11
1 AAPL 2022-04-01 10M 15
2 MSFT 2022-01-01 10M 20
3 MSFT 2022-04-01 10M 26