I have a pandas DataFrame like this:
name      url
pau lola  www.paulola.com
pou gine  www.cheeseham.com
pete raj  www.pataraj.com
I want to check whether any of the space-separated strings in the name column appears in the url column. Something like this:
name      url                result
pau lola  www.paulola.com    True
pou gine  www.cheeseham.com  False
pete raj  www.pataraj.com    True
Is there any way to do this? I tried the lambda function below, but it only works when the url contains both parts of the name:
name      url                namewospaces
pau lola  www.paulola.com    paulola
pou gine  www.cheeseham.com  pougine
pete raj  www.pataraj.com    peteraj
df['result'] = df.apply(lambda x: str(x.namewospaces) in str(x.url), axis=1)
name      url                namewospaces  result
pau lola  www.paulola.com    paulola       True
pou gine  www.cheeseham.com  pougine       False
pete raj  www.pataraj.com    peteraj       False
Thank you all :)
CodePudding user response:
Split the name into substrings, and use a list comprehension with any to get True if any substring matches:
df['result'] = [any(s in url for s in lst)
for lst, url in zip(df['name'].str.split(), df['url'])]
The (slower) equivalent with apply would be:
df['result'] = df.apply(lambda x: any(s in x['url']
for s in x['name'].split()), axis=1)
Output:
       name                url  result
0  pau lola    www.paulola.com    True
1  pou gine  www.cheeseham.com   False
2  pete raj    www.pataraj.com    True
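
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the list-comprehension approach. It assumes the DataFrame is built by hand from the three sample rows in the question; the loop variable names are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "name": ["pau lola", "pou gine", "pete raj"],
    "url": ["www.paulola.com", "www.cheeseham.com", "www.pataraj.com"],
})

# Split each name on whitespace, then flag the row as True
# if at least one of the resulting parts is a substring of the url.
df["result"] = [
    any(part in url for part in parts)
    for parts, url in zip(df["name"].str.split(), df["url"])
]

print(df)
```

Note that the `in` test on strings is case-sensitive, so if the names or urls can differ in case, lowercasing both columns first (for example with `.str.lower()`) is probably needed.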