Python - How to split a string until an occurrence of an integer?

Time:11-08

In Python, I am trying to split a string at the first occurrence of an integer: the first integer should be included in the result, and everything after it should be dropped.

Example strings that I will have are shown below:

SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---
SOME OTHER STRING (PARANTHESIS AGAIN) 5 --- 3
AND SOME OTHER (AGAIN) 2 1 4

And the outputs that I need for these examples are going to be:

SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2
SOME OTHER STRING (PARANTHESIS AGAIN) 5
AND SOME OTHER (AGAIN) 2

All input strings follow this format. Any help would be appreciated. Thank you in advance.

I first tried splitting on spaces (" "), but that of course did not work. Then I tried splitting on "---", but "---" may not exist in every input, so that failed too. I also looked at this question: How to split a string into a string and an integer? However, its answer suggests splitting on spaces, so it didn't help me.

CodePudding user response:

This is an ideal case for a regular expression.

import re

s = "SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---"
m = re.search(r".*?[0-9]+", s)
print(m.group(0))

Explanation:

  • .* matches any number of characters
  • ? makes the preceding .* non-greedy (without it, the match would extend to the last integer)
  • [0-9]+ matches one or more digits
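
To make the effect of the ? concrete, here is a small sketch that runs the same pattern with and without it on the first example string. The variable names lazy and greedy are only illustrative, and the pattern is assumed to be .*?[0-9]+ as described in the explanation above.

```python
import re

s = "SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---"

# Non-greedy: .*? takes as few characters as possible,
# so the match ends at the first run of digits.
lazy = re.search(r".*?[0-9]+", s).group(0)

# Greedy: .* takes as many characters as possible and backtracks
# only as far as needed, so the match ends at the last run of digits.
greedy = re.search(r".*[0-9]+", s).group(0)

print(lazy)    # SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2
print(greedy)  # SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3
```

Only the non-greedy version gives the output asked for in the question.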

It can be done without regular expressions too:

result = []
for word in s.split(" "):
    result.append(word)
    if word.isdigit():  # True when every character in the word is a digit
        break
print(" ".join(result))

CodePudding user response:

Solution without re:

lst = [
    "SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---",
    "SOME OTHER STRING (PARANTHESIS AGAIN) 5 --- 3",
    "AND SOME OTHER (AGAIN) 2 1 4",
]

for item in lst:
    idx = next(idx for idx, ch in enumerate(item) if ch.isdigit())
    print(item[: idx + 1])

Prints:

SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2
SOME OTHER STRING (PARANTHESIS AGAIN) 5
AND SOME OTHER (AGAIN) 2
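
Two edge cases are worth keeping in mind with the character-based loop above: it stops at the first digit character, so a multi-digit number such as 12 would be cut after the 1, and next() raises StopIteration when a string contains no digit at all. The question does not say whether either case can occur, so the following is only a sketch of one possible way to handle them. The helper name cut_at_first_integer is hypothetical, and returning the input unchanged when no integer is found is an assumption.

```python
def cut_at_first_integer(text):
    """Hypothetical helper: keep everything up to and including
    the first whitespace-separated token made only of digits."""
    words = text.split(" ")
    for i, word in enumerate(words):
        if word.isdigit():
            # Rejoin the words up to and including the integer token.
            return " ".join(words[: i + 1])
    # Assumption: with no integer present there is nothing to cut.
    return text


print(cut_at_first_integer("AND SOME OTHER (AGAIN) 2 1 4"))
print(cut_at_first_integer("MULTI DIGIT (EXAMPLE) 12 3 ---"))
print(cut_at_first_integer("NO INTEGER (HERE)"))
```

Because it works on whole words, this variant keeps 12 intact rather than stopping after the first digit.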

CodePudding user response:

Try the following regular expression:

import re
r = re.compile(r'(\D*\d+).*')
r.match('SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 -').groups()[0]
==> 'SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2'
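
As a rough sketch of how a compiled pattern of this shape, (\D*\d+).*, could be applied to all three example strings from the question: match() returns None when a string contains no digit, so the loop below checks for that before reading the group. The names pattern, lines and results are illustrative, and falling back to the unchanged line is an assumption, not something the question specifies.

```python
import re

# \D* = any non-digit characters, \d+ = the first run of digits,
# .* = the rest of the line, which is left out of the capture group.
pattern = re.compile(r"(\D*\d+).*")

lines = [
    "SOME STRING (IT WILL ALWAYS END WITH PARANTHESIS) 2 3 ---",
    "SOME OTHER STRING (PARANTHESIS AGAIN) 5 --- 3",
    "AND SOME OTHER (AGAIN) 2 1 4",
]

results = []
for line in lines:
    m = pattern.match(line)
    # Assumption: keep the line unchanged if it contains no digit.
    results.append(m.group(1) if m else line)

for result in results:
    print(result)
```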