I have 2 different datasets.
Table-1: df1
Date sales product
2021-08-01 10000 a
2021-08-02 575 a
2021-08-03 12212 a
2021-08-04 902 a
2021-08-05 456 a
Table-2: df2
Date sales product
2021-08-03 1000 b
2021-08-04 435 b
2021-08-05 759 b
2021-08-06 9123 b
2021-08-07 642 b
In python, I want to create a new data set which is all records of df2 that match df1 with date columns are added to df1 and I want to assign the value 0 for non-match dates.
New Dataset:
Date sales_a sales_b
2021-08-01 10000 0
2021-08-02 575 0
2021-08-03 12212 1000
2021-08-04 902 435
2021-08-05 456 759
How can I do that?
CodePudding user response:
You can use pd.merge
on the date and sales columns, joining on Date on the left dataframe (df1), adding the suffixes _a and _b to the resulting column names. You can then fill in any nans with 0 to get the desired output:
df1[['Date', 'sales']].merge(df2[['Date', 'sales']], on='Date', how='left',
suffixes=['_a', '_b']).fillna(0)
Output:
Date sales_a sales_b
0 2021-08-01 10000 0.0
1 2021-08-02 575 0.0
2 2021-08-03 12212 1000.0
3 2021-08-04 902 435.0
4 2021-08-05 456 759.0