Home > Software design >  Pandas - Set value in column by looking up in another dataframe [duplicate]
Pandas - Set value in column by looking up in another dataframe [duplicate]

Time:09-21

I have two dataframes that have the following structure.

Dataframe A:

id date price
1 2021-09-01 null
1 2021-09-02 null
2 2021-09-01 null
2 2021-09-02 null
3 2021-09-01 null
3 2021-09-02 null

Dataframe B:

id price
1 100
2 200
3 300

I need to set the price in dataframe A, for each id, to the same value as the id has in dataframe B, regardless of the date in dataframe A.

So expected result is the following:

id date price
1 2021-09-01 100
1 2021-09-02 100
2 2021-09-01 200
2 2021-09-02 200
3 2021-09-01 300
3 2021-09-02 300

The data set is very large so need something efficient.

Happy to hear your suggestions :)

CodePudding user response:

Use set_index and fillna:

>>> df1.set_index('id').fillna({'price': df2.set_index('id')['price']}).reset_index()
   id        date  price
0   1  2021-09-01  100.0
1   1  2021-09-02  100.0
2   2  2021-09-01  200.0
3   2  2021-09-02  200.0
4   3  2021-09-01  300.0
5   3  2021-09-02  300.0
>>> 

CodePudding user response:

Use pd.merge:

>>> dfA.drop(columns='price').merge(dfB, on='id', how='left')

   id        date  price
0   1  2021-09-01    100
1   1  2021-09-02    100
2   2  2021-09-01    200
3   2  2021-09-02    200
4   3  2021-09-01    300
5   3  2021-09-02    300

Please read carefully: Pandas Merging 101

CodePudding user response:

mapping may be useful as well:

dfA = pd.DataFrame(columns=["id","date", "price"],
                   data=[[1, "2021-09-01", "N/A"],
                         [1, "2021-09-02", "N/A"],
                         [2, "2021-09-01", "N/A"],
                         [2, "2021-09-02", "N/A"],
                         [3, "2021-09-01", "N/A"],
                         [3, "2021-09-02", "N/A"]])

dfB = pd.DataFrame(columns=["id", "price"],
                   data=[[1, 100],
                         [2, 200],
                         [3, 300]])

map_dict = dfB.set_index("id").to_dict()["price"]
dfA["price"] = dfA["id"]
dfA["price"] = dfA["price"].map(map_dict)

  • Related