Given df1 and df2 dataframes, I would like to substitute the 'Name' items in df2 for their corresponding 'Area' values in df1.
In other words, the values in df2['Name] should be respectively A1 A3 A2 A1 A2
import pandas as pd
df1 = pd.DataFrame({'Name':['Tom', 'nick', 'Albert', 'Josh'],
'Area':['A1','A2','A3','A1']
})
df2 = pd.DataFrame({'Name':['Tom', 'Albert', 'Nick', 'Josh','Nick'],
'pet':['dog','cat','mouse','dog','mouse']
})
After searching for some suggestions, I added this line:
df = df2.merge(df1, on=['Area','Name'], how='left')
print(df)
But I get this error:
# Check for duplicates
KeyError: 'Area'
and this line:
df = df2['Name'].map(df1.set_index('Area')['Name'])
print(df)
outputs this error
InvalidIndexError: Reindexing only valid with uniquely valued Index objects
CodePudding user response:
That's because both dataframes should have a column named Area
.
So remove the Area
when merging.
df = df2.merge(df1, on=['Name'], how='left')
CodePudding user response:
You do not have any column named 'Area'
in your second dataframe (df2
). This is why the error is being shown.
CodePudding user response:
You can use dict and zip commands to make the 2 dataframe columns you want to map into a dictionary. Then you can map the values easily.
import pandas as pd
df1 = pd.DataFrame({
'Name':['Tom', 'nick', 'Albert', 'Josh'],
'Area':['A1','A2','A3','A1']
})
df2 = pd.DataFrame({
'Name':['Tom', 'Albert', 'Nick', 'Josh','Nick'],
'pet':['dog','cat','mouse','dog','mouse']
})
df2['Name'] = df2['Name'].map(dict(zip(df1['Name'], df1['Area'])))