I have two dataframes DB
and compareDF
. I'm trying to compare and match these two dataframes at same value phone_number
, which works, but after the row matching I would like to add customer_code
(another column) from DB
to compareDF['cmp2']
where the value is matched, which doesn't works (It does not add customer_code from matched row but instead it adds values from customer_code one after another from the start of DB
)
m1 = compareDF['phone_number'].isin(DB['phone_number'])
compareDF['cmp2'] = DB['customer_code'].where(m1)
I have these
# DB
0 phone_number customer_code
1 055 1
2 077 2
3 088 3
# compareDF
0 phone_number cmp2
1 077
2 088
3 055
I want this
# compareDF
0 phone_number cmp2
1 077 2
2 088 3
3 055 1
I'm getting this
# compareDF
0 phone_number cmp2
1 077 1
2 088 2
3 055 3
CodePudding user response:
You need to use merge
and join the two data sets on the phone_number
key. The compareDF
should be the primary table in this join as well.
df1.merge(df, how="inner", on="phone_number").rename(columns={"customer_code": "cmp2"})
# where
df = pd.DataFrame({"phone_number": ["055", "077", "088"], "customer_code": [1, 2, 3]})
df1 = pd.DataFrame({"phone_number": ["077", "088", "055"]})
# gives
phone_number cmp2
0 077 2
1 088 3
2 055 1
CodePudding user response:
This can be done directly with merge,and you can modify the name of the column by df.columns = [xxx]:
res = pd.merge(db, compareDF)
res.columns = ["phone_number", "cmp2"]
print(res)