How to loop through each row within a dataframe that contains a string, and match such string to eac

Dear Stackoverflow community-

I have a dataframe df that has a column 'name' containing different names:

print(df)

name
tom
jerry
steven
Zeo

Then I have a list with names in it:

print(list)

['tom', 'zeo']

How do I create a new column in df, df['matched'], that will return the matched value from the list to the column if matched, and nan otherwise?

name   matched
tom    tom
jerry   nan
steven  nan
Zeo     zeo

I tried:

for i in list:
    df['matched']=df['name'].str.lower().str.contains(i,case=False).map({True:i,False:np.nan})

But it does not work...

CodePudding user response：

We can try using str.extract here with a regex alternation:

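import re
names = ["tom", "zeo"]  # ... plus any further names
# Anchored alternation: the whole lowercased name must equal one list entry
regex = r'^(' + r'|'.join(map(re.escape, names)) + r')$'
# Lowercase first so that 'Zeo' yields 'zeo'; expand=False returns a Series
df["matched"] = df["name"].str.lower().str.extract(regex, expand=False)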

Output:

     name matched
0     tom     tom
1   jerry     NaN
2  steven     NaN
3     Zeo     zeo
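
For anyone who wants to run this end to end, here is a minimal self-contained sketch built from the sample data in the question. The column name matched_isin and the use of re.escape are my own additions for illustration, not part of the original answer. The second approach shows that, if you only need exact (case-insensitive) matches, a plain membership test might do the job without any regex.

```python
import re

import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"name": ["tom", "jerry", "steven", "Zeo"]})
names = ["tom", "zeo"]  # lowercase lookup list from the question

# Lowercase once so that 'Zeo' can match 'zeo'
lowered = df["name"].str.lower()

# Approach 1: anchored regex alternation.
# re.escape guards against regex metacharacters in a name (added precaution).
pattern = r"^(" + "|".join(re.escape(n) for n in names) + r")$"
df["matched"] = lowered.str.extract(pattern, expand=False)

# Approach 2 (hypothetical column name): exact membership test, no regex.
# where() keeps values where the condition holds and puts NaN elsewhere.
df["matched_isin"] = lowered.where(lowered.isin(names))

print(df)
```

As for why the loop in the question did not work: each pass assigns a fresh df['matched'], so the result of the previous name is overwritten and only the matches for the last name in the list survive. Both approaches above avoid that by testing all names in a single vectorized step.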