I have the following list:
pre = ["unable to", "would not", "was not", "did not", "there is not", "could not", "failed to"]
From a dataframe column, I want to find the texts that contain any of the phrases in the list and generate a new column that shows the matched phrase along with the next word. For example, if a cell contains the text WOULD NOT PRIME CORRECTLY DURING VIRECTOMY., I want the new column to show: WOULD NOT PRIME.
I have tried something like this:
def matcher(Event_Description):
    for i in pre:
        if i in Event_Description:
            return i + 1
    return "Not found"
CodePudding user response:
You can loop over every prefix in the list and check for it using .find(). If it is found, you can convert the prefix to the case of event and append the next word. Like this:
def matcher(event):
    pres = ["unable to", "would not", "was not", "did not", "there is not", "could not", "failed to"]
    for pre in pres:
        # Case-insensitive search; i is the start index of the match, or -1
        i = event.lower().find(pre)
        if i != -1:
            # Match the case of event, then append the word that follows the prefix
            return ' '.join([pre.upper() if event.isupper() else pre, event[i + len(pre) + 1:].split(' ')[0]])
    return "Not found"
If you want to include the next two words, just change this line:
    return ' '.join([pre.upper() if event.isupper() else pre, event[i + len(pre) + 1:].split(' ')[0]])
to unpack a slice, like this:
    return ' '.join([pre.upper() if event.isupper() else pre, *event[i + len(pre) + 1:].split(' ')[0:2]])
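As an alternative to .find(), the same idea can be sketched with a regular expression. This is only a sketch, not the approach from the answer above: the names PRES, build_pattern and match_with_next_word are made up for illustration. It returns the matched text in its original casing and picks the leftmost phrase in the text rather than the first phrase in list order.

```python
import re

# Same phrases as in the question.
PRES = ["unable to", "would not", "was not", "did not",
        "there is not", "could not", "failed to"]

def build_pattern(phrases, n_words=1):
    """Compile a pattern matching any phrase plus up to n_words following words."""
    alternation = "|".join(re.escape(p) for p in phrases)
    # \b avoids matching in the middle of a word; each (?:\s+\S+) is one extra word.
    return re.compile(rf"\b(?:{alternation})(?:\s+\S+){{0,{n_words}}}", re.IGNORECASE)

PATTERN = build_pattern(PRES)

def match_with_next_word(text, pattern=PATTERN):
    """Return the matched phrase and following word(s), or "Not found"."""
    if not isinstance(text, str):  # e.g. NaN in an empty cell
        return "Not found"
    m = pattern.search(text)
    return m.group(0) if m else "Not found"

print(match_with_next_word("WOULD NOT PRIME CORRECTLY DURING VIRECTOMY."))
```

Assuming the dataframe is called df and the text column is Event_Description (the name used in the question's attempt), the new column could then be filled with something like df["Match"] = df["Event_Description"].apply(match_with_next_word). For two following words, pass build_pattern(PRES, 2) as the second argument.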