Dear Stackoverflow community-
I have a dataframe df with a column 'name' that contains different names:
print(df)
name
tom
jerry
steven
Zeo
Then I have a list with names in it:
print(list)
['tom', 'zeo']
How do I create a new column, df['matched'], that holds the matching value from the list when the name matches (ignoring case), and NaN otherwise?
name matched
tom tom
jerry nan
steven nan
Zeo zeo
I tried:
for i in list:
    df['matched'] = df['name'].str.lower().str.contains(i, case=False).map({True: i, False: np.nan})
But it does not work...
CodePudding user response:
We can try using str.extract here with a regex alternation:
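import re
names = ["tom", "zeo"]  # ... plus any further names
# Anchored alternation such as ^(tom|zeo)$; re.escape guards against special characters
regex = r'^(' + r'|'.join(map(re.escape, names)) + r')$'
# Lowercase first so that 'Zeo' matches 'zeo'; expand=False returns a Series
df["matched"] = df["name"].str.lower().str.extract(regex, expand=False)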
Output:
name matched
0 tom tom
1 jerry NaN
2 steven NaN
3 Zeo zeo
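If only exact, whole-name matches are needed (which is what the ^...$ anchors enforce), a regex may not be necessary at all. The following is a minimal sketch of an alternative, not part of the answer above: it rebuilds the sample dataframe from the question, lowercases the column once, and uses isin together with where to keep only the values found in the list. The variable name lowered is just an illustrative choice.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({"name": ["tom", "jerry", "steven", "Zeo"]})
names = ["tom", "zeo"]

# Lowercase once so the comparison ignores case
lowered = df["name"].str.lower()

# Keep the lowercased value where it appears in the list; NaN elsewhere
df["matched"] = lowered.where(lowered.isin(names))

print(df)
```

This also sidesteps the problem in the original loop, where each iteration overwrote df['matched'], so only the result for the last list item survived.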