Extracting integers from a string and forming actual numbers from a list of these integers

Time:03-09

I have a string bar:

bar = 'S17H10E7S5E3H2S105H90E15'

I take this string and form groups that start with the letter S:

groups = ['S' + elem for elem in bar.split('S') if elem != '']
groups
['S17H10E7', 'S5E3H2', 'S105H90E15']

Without using regular expressions (the RegEx mini-language), I'd like to get the integer values that follow the letters S, H, and E in these groups. To do so, I'm using:

code = 'S'
temp_num = []

for elem in groups:
    start = elem.find(code)
    for char in elem[start + 1:]:
        if not char.isdigit():
            break
        else:
            temp_num.append(char)
            num_tests = ','.join(temp_num)

This gives me:

print(groups)
['S17H10E7', 'S5E3H2', 'S105H90E15']

print(temp_num)
['1', '7', '5', '1', '0', '5']

print(num_tests)
1,7,5,1,0,5

How would I take these individual digits 1, 7, 5, 1, 0, and 5 and put them back together to form a list of the numbers following the code S? For example:

[17, 5, 105]

CodePudding user response:

Would something like this work?

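def get_numbers_after_letter(letter, bar):
    # True only while reading the digits that directly follow `letter`.
    # Starts as False so that leading digits cannot hit an empty `out`.
    current = False
    out = []
    for x in bar:
        if x == letter:
            # Start a new number for this occurrence of the letter.
            out.append('')
            current = True
        elif x.isnumeric() and current:
            out[-1] += x
        elif x.isalpha() and x != letter:
            # A different letter ends the current number.
            current = False
    return list(map(int, out))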

Output:

>>> get_numbers_after_letter('S', bar)
[17, 5, 105]

>>> get_numbers_after_letter('H', bar)
[10, 2, 90]

>>> get_numbers_after_letter('E', bar)
[7, 3, 15]

I think it's better to get the numbers after every letter at once, since we're making a pass over the string anyway, but if you don't want to do that, I guess this could work.
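
The single-pass idea mentioned above could look roughly like the sketch below. It is only an illustration, not part of the original answer: the name numbers_by_letter is hypothetical, and the sketch assumes that any alphabetic character counts as a code and that a letter with no digits after it should simply be skipped.

```python
from collections import defaultdict

def numbers_by_letter(text):
    """Map each letter to the integers that follow it (hypothetical helper)."""
    result = defaultdict(list)
    letter = None   # letter whose digits are currently being read
    digits = ''     # digits collected so far for that letter
    for ch in text:
        if ch.isalpha():
            # A new letter ends the previous number, if there was one.
            if letter is not None and digits:
                result[letter].append(int(digits))
            letter, digits = ch, ''
        elif ch.isdigit() and letter is not None:
            digits += ch
    # Flush the last number once the string ends.
    if letter is not None and digits:
        result[letter].append(int(digits))
    return dict(result)

bar = 'S17H10E7S5E3H2S105H90E15'
print(numbers_by_letter(bar))
# {'S': [17, 5, 105], 'H': [10, 2, 90], 'E': [7, 3, 15]}
```

Looking up a single code is then just a dictionary access, for example numbers_by_letter(bar)['S'].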

CodePudding user response:

The question states that you would favour a solution without regex ("unless absolutely necessary", from the comments).

It is not necessary, of course, but as an alternative for future readers you can match S and capture one or more digits using (\d+), a capture group whose matches are returned by re.findall.

import re

bar = 'S17H10E7S5E3H2S105H90E15'
print(re.findall(r"S(\d+)", bar))

Output

['17', '5', '105']
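
Note that re.findall returns strings here, while the question asks for integers. A minimal sketch of one way to convert them follows. The second pattern, which also captures the letter, is my own assumption about how every code might be collected in one call; it is not something the original answer proposed.

```python
import re

bar = 'S17H10E7S5E3H2S105H90E15'

# Convert the captured digit strings to integers.
s_values = [int(n) for n in re.findall(r"S(\d+)", bar)]
print(s_values)  # [17, 5, 105]

# With two capture groups, findall returns (letter, digits) tuples.
by_letter = {}
for letter, digits in re.findall(r"([A-Za-z])(\d+)", bar):
    by_letter.setdefault(letter, []).append(int(digits))
print(by_letter)
# {'S': [17, 5, 105], 'H': [10, 2, 90], 'E': [7, 3, 15]}
```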