Home > Software design >  Compare two dataframes and add values from another column where rows matched
Compare two dataframes and add values from another column where rows matched

Time:02-24

I have two dataframes DB and compareDF. I'm trying to compare and match these two dataframes at same value phone_number, which works, but after the row matching I would like to add customer_code (another column) from DB to compareDF['cmp2'] where the value is matched, which doesn't works (It does not add customer_code from matched row but instead it adds values from customer_code one after another from the start of DB)

m1 = compareDF['phone_number'].isin(DB['phone_number'])
compareDF['cmp2'] = DB['customer_code'].where(m1)

I have these

# DB

0 phone_number  customer_code
1     055            1
2     077            2
3     088            3
# compareDF

0 phone_number     cmp2
1     077            
2     088            
3     055            

I want this

# compareDF

0 phone_number     cmp2
1     077            2 
2     088            3
3     055            1

I'm getting this

# compareDF

0 phone_number     cmp2
1     077            1 
2     088            2
3     055            3

CodePudding user response:

You need to use merge and join the two data sets on the phone_number key. The compareDF should be the primary table in this join as well.

df1.merge(df, how="inner", on="phone_number").rename(columns={"customer_code": "cmp2"})

# where

df = pd.DataFrame({"phone_number": ["055", "077", "088"], "customer_code": [1, 2, 3]})
df1 = pd.DataFrame({"phone_number": ["077", "088", "055"]})

# gives

  phone_number  cmp2
0          077     2
1          088     3
2          055     1

CodePudding user response:

This can be done directly with merge,and you can modify the name of the column by df.columns = [xxx]:

res = pd.merge(db, compareDF)
res.columns = ["phone_number", "cmp2"]
print(res)
  • Related