I have two dataframes that have the following structure.
Dataframe A:
id | date | price |
---|---|---|
1 | 2021-09-01 | null |
1 | 2021-09-02 | null |
2 | 2021-09-01 | null |
2 | 2021-09-02 | null |
3 | 2021-09-01 | null |
3 | 2021-09-02 | null |
Dataframe B:
id | price |
---|---|
1 | 100 |
2 | 200 |
3 | 300 |
I need to set the price in dataframe A, for each id, to the same value as the id has in dataframe B, regardless of the date in dataframe A.
So expected result is the following:
id | date | price |
---|---|---|
1 | 2021-09-01 | 100 |
1 | 2021-09-02 | 100 |
2 | 2021-09-01 | 200 |
2 | 2021-09-02 | 200 |
3 | 2021-09-01 | 300 |
3 | 2021-09-02 | 300 |
The data set is very large so need something efficient.
Happy to hear your suggestions :)
CodePudding user response:
Use set_index
and fillna:
>>> df1.set_index('id').fillna({'price': df2.set_index('id')['price']}).reset_index()
id date price
0 1 2021-09-01 100.0
1 1 2021-09-02 100.0
2 2 2021-09-01 200.0
3 2 2021-09-02 200.0
4 3 2021-09-01 300.0
5 3 2021-09-02 300.0
>>>
CodePudding user response:
Use pd.merge
:
>>> dfA.drop(columns='price').merge(dfB, on='id', how='left')
id date price
0 1 2021-09-01 100
1 1 2021-09-02 100
2 2 2021-09-01 200
3 2 2021-09-02 200
4 3 2021-09-01 300
5 3 2021-09-02 300
Please read carefully: Pandas Merging 101
CodePudding user response:
mapping may be useful as well:
dfA = pd.DataFrame(columns=["id","date", "price"],
data=[[1, "2021-09-01", "N/A"],
[1, "2021-09-02", "N/A"],
[2, "2021-09-01", "N/A"],
[2, "2021-09-02", "N/A"],
[3, "2021-09-01", "N/A"],
[3, "2021-09-02", "N/A"]])
dfB = pd.DataFrame(columns=["id", "price"],
data=[[1, 100],
[2, 200],
[3, 300]])
map_dict = dfB.set_index("id").to_dict()["price"]
dfA["price"] = dfA["id"]
dfA["price"] = dfA["price"].map(map_dict)