I have two sets of data that represent the same information with the same key but with different names of the key. I want to map the keys with 1-1 relationship with inner join (because they are the same ID but reading from different company websites). Here is an example:
Dataset A
Dataset B (note that row for PO d is missing for ID4):
Want to map the relationship between "Key" and "ID":
So that the data set will be connected with this inner joint as:
How could I do this using python/pandas?
CodePudding user response:
Doing chain merge
dfa.merge(dfb.merge(dfc, how='inner'), how='inner')
CodePudding user response:
merge
dfA
anddfB
(on "PO Number") to get the relationship between "Key" and "ID"merge
dfA
andmapper
(on "Key") to get the output
mapper = dfA.merge(dfB)[["Key", "ID"]].drop_duplicates()
output = dfA.merge(mapper)
>>> output
Key PO Number ID
0 Key 1 a ID 4
1 Key 1 b ID 4
2 Key 1 c ID 4
3 Key 1 d ID 4
4 Key 2 e ID 1
5 Key 3 1 ID 3
6 Key 3 2 ID 3
7 Key 3 3 ID 3
8 Key 3 4 ID 3
9 Key 3 5 ID 3
Or in one line:
output = dfA.merge(dfA.merge(dfB)[["Key", "ID"]].drop_duplicates())