I have a list
my_list = ['element1 line','element2 ','element3', 'element4 line',....]
and I have a pandas DataFrame df with a df['Sentences'] column and a df['flag'] column:
df
   Sentences               flag
0  abcd
1  efgh
2  element1 ijkl
3  mnop element3 element4
4  qrst
I want to iterate over every row of the Sentences column. If any element of my_list is present in a row's sentence, df['flag'] should be 1 for that row. If no element is present in the sentence, df['flag'] should be 0 for that row.
Expected output:
df
   Sentences               flag
0  abcd                    0
1  efgh                    0
2  element1 ijkl           1
3  mnop element3 element4  1
4  qrst                    0
CodePudding user response:
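# Flag a row when any element of my_list occurs inside the sentence.
# (Testing `x in my_list` would only match a sentence that equals a list element exactly.)
df['flag'] = df['Sentences'].apply(lambda x: 1 if any(item in x for item in my_list) else 0)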
CodePudding user response:
You can use a list comprehension that checks each word of the sentence against my_list:
df['flag'] = [int(any(w in my_list for w in x.split())) for x in df['Sentences']]
output:
   Sentences               flag
0  abcd                    0
1  efgh                    0
2  element1 ijkl           1
3  mnop element3 element4  1
4  qrst                    0
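For reference, here is a minimal self-contained sketch of the word-matching approach above. It assumes my_list holds single words; the sample values are hypothetical, since the list in the question is truncated. Converting the list to a set is an optional tweak that makes each lookup constant time.

```python
import pandas as pd

# Hypothetical sample data: the question's my_list is truncated, so these
# single-word entries are an assumption made for illustration.
my_list = ['element1', 'element2', 'element3', 'element4']
df = pd.DataFrame({'Sentences': ['abcd', 'efgh', 'element1 ijkl',
                                 'mnop element3 element4', 'qrst']})

# A set gives constant-time membership tests per word.
words = set(my_list)

# Flag a row with 1 if any whitespace-separated word of the sentence is in the set.
df['flag'] = [int(any(w in words for w in s.split())) for s in df['Sentences']]
print(df)
```

Because this compares whole words, a sentence containing 'element10' would not be flagged for 'element1', and multi-word entries such as 'element1 line' can never match.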
Note that you could use pure pandas, but this is much slower:
df['flag'] = (df['Sentences']
              .str.split()
              .explode().isin(my_list)
              .groupby(level=0).any().astype(int)
              )
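If the entries in my_list can contain spaces (like 'element1 line') or should match anywhere inside the sentence, a substring search may fit better. The sketch below is an alternative to the word-based approaches, not an equivalent: it joins the escaped entries into one regular expression and passes it to Series.str.contains. The sample data is hypothetical and chosen so that a multi-word entry actually occurs in a sentence.

```python
import re
import pandas as pd

# Hypothetical sample data (assumption): entries may contain spaces.
my_list = ['element1 line', 'element2', 'element3', 'element4 line']
df = pd.DataFrame({'Sentences': ['abcd', 'efgh', 'element1 line ijkl',
                                 'mnop element3 element4', 'qrst']})

# Escape each entry so characters like '.' or '+' are matched literally,
# then combine the entries into a single alternation pattern.
pattern = '|'.join(re.escape(item) for item in my_list)

# na=False treats missing sentences as "no match".
df['flag'] = df['Sentences'].str.contains(pattern, regex=True, na=False).astype(int)
print(df)
```

Note that an empty my_list would produce an empty pattern, which matches every row, so that case would need a separate guard.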