I want to split strings on separators that consist of more than one character, stored in the variable sep_list.
My aim is to get the last separated string s1 together with the last separator, i.e. the one that has s1 on its right-hand side.
sep_list = ['→E', '¬E', '↓I']
string1 = "peter →E tom ¬E luis ↓I ed"
string2 = "sigrid →E jose l. ¬E jose t."
Applied to string1, the algorithm should return the string s1:
"↓I, ed"
and applied to string2, the algorithm should return the string s1:
"¬E, jose t."
What is a way to do that with Python?
CodePudding user response:
Another way to do so using regex:
import re
sep_list = ['→E', '¬E', '↓I']
string1 = "peter →E tom ¬E luis ↓I ed"
string2 = "sigrid →E jose l. ¬E jose t."
def separate_string(data, seps):
    # Alternation of the escaped separators
    pattern = "|".join(re.escape(sep) for sep in seps)
    # Span of the last match; raises IndexError if no separator occurs
    start, end = [m.span() for m in re.finditer(pattern, data)][-1]
    return f"{data[start:end]},{data[end:]}"

print(separate_string(string1, sep_list)) # ↓I, ed
print(separate_string(string2, sep_list)) # ¬E, jose t.
- We create a regex pattern by joining the escaped separators with |.
- For each match in the string, we use m.span() to retrieve the start and end of the match. We only keep the last match. data[start:end] is the separator, while data[end:] is everything after it.
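The function above raises an IndexError when the string contains no separator at all. A minimal sketch of a variant that returns None in that case and hands back the separator and the remainder as a tuple; the name last_separator_split and the longest-first ordering of the alternatives are assumptions added for illustration, not part of the answer:

```python
import re

def last_separator_split(data, seps):
    """Return (separator, remainder) for the last separator in data, or None."""
    if not seps:
        return None  # an empty pattern would match everywhere
    # Longest separators first, so a separator that is a prefix of
    # another one cannot shadow it in the alternation.
    ordered = sorted(seps, key=len, reverse=True)
    pattern = "|".join(re.escape(sep) for sep in ordered)
    last = None
    for last in re.finditer(pattern, data):
        pass  # iterate to the end; `last` keeps the final match
    if last is None:
        return None
    return last.group(), data[last.end():].strip()

sep_list = ['→E', '¬E', '↓I']
print(last_separator_split("peter →E tom ¬E luis ↓I ed", sep_list))    # ('↓I', 'ed')
print(last_separator_split("sigrid →E jose l. ¬E jose t.", sep_list))  # ('¬E', 'jose t.')
print(last_separator_split("no separators here", sep_list))            # None
```

Returning a tuple leaves the output format to the caller, e.g. ", ".join(result) gives the "↓I, ed" form asked for in the question.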
CodePudding user response:
Assuming the separators may exist in any order (or not at all), you could do this:
sep_list = ['→E', '¬E', '↓I']
string1 = "peter →E tom ¬E luis ↓I ed"
string2 = "sigrid →E jose l. ¬E jose t."
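def process(s):
    indexes = []
    for sep in sep_list:
        # rfind gives the rightmost occurrence, in case a separator repeats
        if (index := s.rfind(sep)) >= 0:
            indexes.append((index, sep))
    if indexes:
        # Sort by position; the last entry is the rightmost separator
        indexes.sort()
        t = indexes[-1]
        return f"{t[1]},{s[t[0] + len(t[1]):]}"

print(process(string1))
print(process(string2))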
Output:
↓I, ed
¬E, jose t.
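When a separator can occur more than once, the search direction matters: str.find reports the first occurrence of a substring, str.rfind the last. A small sketch comparing the two; the helper last_hit and the input string are made up for illustration:

```python
def last_hit(s, seps, search):
    """Return (separator, remainder) for the right-most hit found by `search`, or None."""
    # One (position, separator) pair per separator that occurs in s
    hits = [(search(s, sep), sep) for sep in seps if sep in s]
    if not hits:
        return None
    pos, sep = max(hits)  # pair with the largest position
    return sep, s[pos + len(sep):]

sep_list = ['→E', '¬E', '↓I']
s = "a →E b ¬E c →E d"  # '→E' occurs twice

print(last_hit(s, sep_list, str.find))   # ('¬E', ' c →E d')
print(last_hit(s, sep_list, str.rfind))  # ('→E', ' d')
```

With str.find the second '→E' is never seen, so the result stops at '¬E'; with str.rfind the truly last separator is returned.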
CodePudding user response:
Update: This solution does not need the re module! Update #2: Shorter solution.
def run(string):
    sep_lst = ['→E', '¬E', '↓I']
    tokens = string.split()
    result = None
    for i, token in enumerate(tokens):
        if token in sep_lst:
            # Overwritten on every hit, so the last separator wins
            result = f'{tokens[i]}, {" ".join(tokens[i + 1:])}'
    return result

print(run("peter →E tom ¬E luis ↓I ed"))
print(run("sigrid →E jose l. ¬E jose t."))
Output:
↓I, ed
¬E, jose t.
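The token-based idea can also scan from the end and stop at the first hit, which avoids rebuilding the result for every earlier separator. A minimal sketch, assuming (as the answer above does) that separators are surrounded by whitespace; the name last_separator_tokens is hypothetical:

```python
def last_separator_tokens(string, seps):
    """Return 'separator, remainder' for the last separator token, or None."""
    tokens = string.split()
    # Walk the tokens from right to left; the first hit is the last separator
    for i in range(len(tokens) - 1, -1, -1):
        if tokens[i] in seps:
            return f'{tokens[i]}, {" ".join(tokens[i + 1:])}'
    return None  # no separator token found

sep_list = ['→E', '¬E', '↓I']
print(last_separator_tokens("peter →E tom ¬E luis ↓I ed", sep_list))    # ↓I, ed
print(last_separator_tokens("sigrid →E jose l. ¬E jose t.", sep_list))  # ¬E, jose t.
```

Because split() splits on any whitespace, runs of spaces in the remainder collapse to single spaces, and a separator glued to a neighbouring word (e.g. "tom¬E") is not recognised.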