Write value to new column if value in column is in a list in pandas

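I am trying to use a list comprehension for some complex column creation in pandas.

For instance, I want to use a list as a reference to create another column in a pandas DataFrame: if the string contains one of the items in the list, the new column should hold that item, otherwise NaN.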

fruit  = ['watermelon', 'apple', 'grape']

string                    new_column
watermelons are cool      watermelon
apples are good           apple
oranges are on sale       NaN

I tried a list comprehension:

df['new_column'] = [f in fruit if any(f in s for f in fruit) for s in df['string']]

I don't think this is correct, and I would appreciate some help!

CodePudding user response:

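The best option is str.extract with a regex alternation built from the list; it returns the first fruit found in each string, or NaN when none matches: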

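import re

fruit = ['watermelon', 'apple', 'grape']

# Escape each item and join into one capture group: (watermelon|apple|grape)
pattern = f"({'|'.join(map(re.escape, fruit))})"
# expand=False returns a Series: the first match per row, or NaN if none
df['new_column'] = df['string'].str.extract(pattern, expand=False)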

Output:

                 string  new_column
0  watermelons are cool  watermelon
1       apples are good       apple
2   oranges are on sale         NaN
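
If you would rather stay with a list comprehension, as in the question: the original attempt is a syntax error because a conditional expression needs an else branch, and f is not defined outside the generator passed to any(). Here is a minimal sketch of a working version, assuming you want the first fruit in list order. The sample df is rebuilt here only so the snippet runs on its own.

```python
import pandas as pd

fruit = ['watermelon', 'apple', 'grape']
df = pd.DataFrame({'string': ['watermelons are cool',
                              'apples are good',
                              'oranges are on sale']})

# For each string, take the first fruit (in list order) that occurs in it;
# next() falls back to None when no fruit matches.
df['new_column'] = [next((f for f in fruit if f in s), None)
                    for s in df['string']]

print(df)
```

Note that this picks the first match in list order, whereas str.extract picks the match that starts earliest in the string, so the two can differ when a string contains several fruits.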

CodePudding user response:

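If a string can contain more than one fruit, this collects all of them, joined by commas, and leaves NaN where nothing matches: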

import pandas as pd
import numpy as np

fruit = ['watermelon', 'apple', 'grape']
df = pd.DataFrame()
df['string'] = ['watermelons are cool', 'apples are good', 'oranges are on sale', 'apples are not watermelons']

# Collect every fruit found in the string, in list order, joined by commas
output = df['string'].apply(lambda x: ','.join([f for f in fruit if f in x]))
# Rows with no match produce an empty string; turn those into NaN
output[output == ''] = np.nan

print(output)

Output:

0          watermelon
1               apple
2                 NaN
3    watermelon,apple
Name: string, dtype: object
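
A possible vectorized alternative to apply is str.findall with the same kind of alternation pattern. This is only a sketch: the names pattern, joined and all_matches are made up for illustration, and the sample data is the one used above.

```python
import re

import pandas as pd

fruit = ['watermelon', 'apple', 'grape']
df = pd.DataFrame({'string': ['watermelons are cool',
                              'apples are good',
                              'oranges are on sale',
                              'apples are not watermelons']})

# Alternation of the escaped list items: watermelon|apple|grape
pattern = '|'.join(map(re.escape, fruit))

# findall returns a list of matches per row; join each list with commas
joined = df['string'].str.findall(pattern).str.join(',')

# Rows without a match end up as '', which where() replaces with NaN
all_matches = joined.where(joined != '')

print(all_matches)
```

One difference to be aware of: findall reports matches in the order they appear in the string, so the last row comes out as apple,watermelon rather than watermelon,apple.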