Home > Net >  Replace values in one dataframe with values from another dataframe
Replace values in one dataframe with values from another dataframe

Time:06-21

I have two dataframes and I am trying to replace all values of 0 from the first dataframe with certain values from the second dataframe.

The first dataframe looks like this:

Region Country Product Quantity Price
Africa South Africa ABC 500 1200
Africa South Africa DEF 200 400
Africa South Africa XYZ 0 0
Africa Nigeria DEF 150 450
Africa Nigeria XYZ 200 750
Asia Japan XYZ 100 500
Asia Japan ABC 200 500
Asia Japan DEF 0 0
Asia India XYZ 250 600
Asia India ABC 100 400
Asia India DEF 40 220

It contains quantities and total price for each type of product for each country. Some country and product combinations have values of 0.

The second dataframe looks like this:

Region Product Quantity Price
Africa ABC 375 825
Africa DEF 125 250
Africa XYZ 110 300
Asia XYZ 200 500
Asia ABC 200 500
Asia DEF 120 300

The second dataframe contains the region averages for quantity and price across each product.

The task is to replace all 0 values in dataframe 1 with the averages for that region and product from dataframe 2.

For example, the 0 values for South Africa for product XYZ in dataframe 1 should be replaced with the values for product XYZ for Africa in dataframe 2. The other values for South Africa should stay the same and so on.

Any tips on how to do this?

CodePudding user response:

First, get all average values merged into the first df, a joining is suffice.

df1_merged = df1.merge(df2, how='left', on=['Region', 'Product'], suffixes=('','_Average'))

we get df1_merged as

Region Country Product Quantity Price Quantity_Average Price_Average
0 Africa South Africa ABC 500 1200 375 825
1 Africa South Africa DEF 200 400 125 250
2 Africa South Africa XYZ 0 0 110 300
3 Africa Nigeria DEF 150 450 125 250
4 Africa Nigeria XYZ 200 750 110 300
5 Asia Japan XYZ 100 500 200 500
6 Asia Japan ABC 200 500 200 500
7 Asia Japan DEF 0 0 120 300
8 Asia India XYZ 250 600 200 500
9 Asia India ABC 100 400 200 500
10 Asia India DEF 40 220 120 300

Then we can fillin these zero with Average columns with conditional loc

df1_merged.loc[df1_merged['Quantity'] == 0, 'Quantity'] = \
df1_merged['Quantity_Average']

df1_merged.loc[df1_merged['Price'] == 0, 'Price'] = \
df1_merged['Price_Average']

now we get df1_merged as

Region Country Product Quantity Price Quantity_Average Price_Average
0 Africa South Africa ABC 500 1200 375 825
1 Africa South Africa DEF 200 400 125 250
2 Africa South Africa XYZ 110 300 110 300
3 Africa Nigeria DEF 150 450 125 250
4 Africa Nigeria XYZ 200 750 110 300
5 Asia Japan XYZ 100 500 200 500
6 Asia Japan ABC 200 500 200 500
7 Asia Japan DEF 120 300 120 300
8 Asia India XYZ 250 600 200 500
9 Asia India ABC 100 400 200 500
10 Asia India DEF 40 220 120 300

Finally, drop unneeded columns with

df1_merged = df1_merged.drop(['Quantity_Average', 'Price_Average'], axis=1)

and we get

Region Country Product Quantity Price
0 Africa South Africa ABC 500 1200
1 Africa South Africa DEF 200 400
2 Africa South Africa XYZ 110 300
3 Africa Nigeria DEF 150 450
4 Africa Nigeria XYZ 200 750
5 Asia Japan XYZ 100 500
6 Asia Japan ABC 200 500
7 Asia Japan DEF 120 300
8 Asia India XYZ 250 600
9 Asia India ABC 100 400
10 Asia India DEF 40 220

CodePudding user response:

You can use update after replacing 0 with np.nan and setting a common index between the two dataframes.

Be wary of two things:

  1. Use overwrite=False to only fill the null values
  2. update modifies inplace
common_index = ['Region','Product']
df_indexed = df.replace(0,np.nan).set_index(common_index)
df2_indexed = df2.set_index(common_index)

df_indexed.update(df2_indexed,overwrite=False)

print(df_indexed.reset_index())

    Region Product       Country  Quantity   Price
0   Africa     ABC  South Africa     500.0  1200.0
1   Africa     DEF  South Africa     200.0   400.0
2   Africa     XYZ  South Africa     110.0   300.0
3   Africa     DEF       Nigeria     150.0   450.0
4   Africa     XYZ       Nigeria     200.0   750.0
5     Asia     XYZ         Japan     100.0   500.0
6     Asia     ABC         Japan     200.0   500.0
7     Asia     DEF         Japan     120.0   300.0
8     Asia     XYZ         India     250.0   600.0
9     Asia     ABC         India     100.0   400.0
10    Asia     DEF         India      40.0   220.0
  • Related