I want to add a new column to a DataFrame (df) whose value comes from one of two lookups done inside a function.
I have the following lists:
lista = ["A", "B", "C", "D"]
listb = ["use to", "manage to", "when", "did not"]
The df has a column called "Description" with text in each cell. If the text contains any word from "lista", I want that word returned in a new column called "Effect". If none is found, I want to search for the values of "listb" instead and return the match in the same "Effect" column, together with the next one or two words that follow it. For example, for the "listb" entry "use to" followed by the word "fail", I want the complete phrase "use to fail" in the column.
I have tried something like this:
def matcher(Description):
    for i in lista:
        if i in Description:
            return i
    return "Not found"

def matcher(Description):
    for j in listb:
        if j in Description:
            return j + 1
    return "Not found"

df["Effect"] = df.apply(lambda i: matcher(i["Description"]), axis=1)
df["Effect"] = df.apply(lambda j: matcher(j["Description"]), axis=1)
CodePudding user response:
You can do both at once:
def matcher(Description):
    # collect every match from lista, then every match from listb
    w = [i for i in lista if i in Description]
    w.extend([i for i in listb if i in Description])
    if not w:
        return "Not found"
    else:
        return ' '.join(w)

df["Effect"] = df.apply(lambda i: matcher(i["Description"]), axis=1)
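To see what this returns, here is a minimal sketch that runs the same logic on a few made-up descriptions (the sample strings are assumptions for illustration, and the lists are the ones from the question with their entries quoted). It joins every match from both lists instead of stopping at the first one, and it does not add the words that follow a "listb" match.

```python
lista = ["A", "B", "C", "D"]
listb = ["use to", "manage to", "when", "did not"]

def matcher(description):
    # collect every substring match from both lists, lista first
    w = [i for i in lista if i in description]
    w.extend(i for i in listb if i in description)
    return " ".join(w) if w else "Not found"

# hypothetical sample descriptions, not taken from the question
samples = ["it use to fail", "unit B did not start", "nothing here"]
results = [matcher(s) for s in samples]
print(results)
```

The second sample matches both "B" and "did not", so both end up in the result.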
CodePudding user response:
The code below should do what you want to achieve:
def matcher(sentence):
    # first priority: a whole-word match from lista
    match_list = [substr for substr in lista if substr in sentence.split(" ")]
    if match_list:  # list with items evaluates to True
        return match_list[0]
    # otherwise: a phrase from listb plus the word that follows it
    match_list = [substr for substr in listb if substr in sentence]
    if match_list:
        substr = match_list[0]
        return substr + " " + sentence.split(substr)[-1].strip().split(" ")[0]
    return "Not found"

df["Effect"] = df.Description.apply(matcher)
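This returns a single following word. If up to two following words are wanted, as the question mentions, one possible variant uses a regular expression. The sketch below is only an illustration under assumptions: n_words is a hypothetical parameter that is not in the original code, and "word" here means a run of letters, digits, or underscores as matched by \w.

```python
import re

lista = ["A", "B", "C", "D"]
listb = ["use to", "manage to", "when", "did not"]

def matcher(sentence, n_words=2):
    # n_words is a hypothetical knob: the maximum number of following words to keep
    words = sentence.split()
    for item in lista:
        if item in words:  # whole-word match, so "A" does not match "Apple"
            return item
    for phrase in listb:
        # the phrase itself, then up to n_words whitespace-separated words
        pattern = r"\b" + re.escape(phrase) + r"\b(?:\s+\w+){0," + str(n_words) + r"}"
        m = re.search(pattern, sentence)
        if m:
            # normalise any run of whitespace to single spaces
            return " ".join(m.group(0).split())
    return "Not found"

print(matcher("it use to fail often today"))
print(matcher("it use to fail often today", n_words=1))
```

With the df from the question, it could be applied the same way, for example df["Effect"] = df["Description"].apply(matcher).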