This code extracts a verb and the text that follows it, then creates a .txt file named after the verb and writes that text into it.
I have to run to win the race
import re, os

# Capture everything after "have to" / "must to"; the trailing "?" is optional.
regex_patron_01 = r"\s*\¿?(?:have to|haveto|must to|mustto)\s*((?:\w\s*)+)\??"
n = re.search(regex_patron_01, input_text_to_check, re.IGNORECASE)
if n:
    word, = n.groups()
    try:
        word = word.strip()
    except AttributeError:
        print("no verb specified!!!")
    # First word is the verb; the rest is the text to write into the file.
    regex_patron_01 = r"\s*((?:\w+)?) \s*((?:\w\s*)+)\s*\??"
    n = re.search(regex_patron_01, word, re.IGNORECASE)
    if n:
        #This will have to be repeated for all the verbs that are present in the sentence.
        verb, order_to_remember = n.groups()
        verb = verb.strip()
        order_to_remember = order_to_remember.strip()
        target_file = target_file + verb + ".txt"
        with open(target_file, 'w') as f:
            f.write(order_to_remember)
This creates "run.txt" and writes "to win the race" into it.
But now I also need the regex to handle the case where there is more than one verb, for example:
I have to run, jump and hurry to win the race
In that case it should create 3 files, "run.txt", "jump.txt" and "hurry.txt", and write the line "to win the race" in each of them.
The problem I'm having is how to make it repeat the process whenever a comma (,) or an "and" follows a verb.
Another example:
I have to dance and sing more to be a pop star
This should create 2 files, "dance.txt" and "sing.txt", both containing the line "more to be a pop star".
Answer:
I simplified the search and conditions somewhat and did this:
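import re

def fun(x):
    # Group 1: the verb list after "have to"; group 2: the trailing "to ..." clause.
    match = re.search(r"(?<=have to) ([\w\s,]+) (to [\w\s]+)", x)
    if match:
        # Split on commas or the whole word "and", dropping the surrounding whitespace
        # so that file names do not start or end with spaces.
        for i in re.split(r"(?:\s*(?:,|\band\b)\s*)+", match[1]):
            with open(f'{i}.txt', 'w') as file:
                file.write(match[2])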
If there is a match, the function creates one or more .txt files named after the captured verbs. If there is no match, it does nothing.
The regex looks for two groups. The first must be preceded by "have to" and may contain word characters, whitespace and commas, which covers a list of verbs separated by commas or "and". The second must start with "to " and may contain only word characters and whitespace.
match[0] is a whole match
match[1] is the first group
match[2] is the second group
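As a quick illustration of these three indexes, here is a small sketch that runs the pattern on the example sentence from the question. It assumes the quantifiers in the pattern are `+`; the names `PATTERN` and `sentence` are only for this example.

```python
import re

# Pattern from the answer, assuming both quantifiers are "+".
PATTERN = r"(?<=have to) ([\w\s,]+) (to [\w\s]+)"

sentence = "I have to run, jump and hurry to win the race"
match = re.search(PATTERN, sentence)

# match[0]: everything the pattern consumed (note the leading space after "have to")
print(repr(match[0]))  # ' run, jump and hurry to win the race'
# match[1]: the verb list
print(repr(match[1]))  # 'run, jump and hurry'
# match[2]: the text to write into each file
print(repr(match[2]))  # 'to win the race'
```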
The 'for' loop iterates over the list obtained by splitting the first group on commas and 'and'. Each iteration creates a file named after one item of that list.