Extracting 2-digit numbers from a string

Time:11-27

I have a file which contains strings, and from every string I need to append every 2-digit number to my list. Here's the file content: https://pastebin.com/N6gHRaVA

I need to iterate over every string and check whether the characters at index i and index i+1 are both digits. If they are, I append those two digits to the list and slice the string just past that 2-digit number,

for example the string:

string = '7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO' should work in this way:

  1. Okay, I have found the number 74: add 74 to my list and slice the string from just after 74 to the end
  2. My string is now 69NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO, I have found the number 69: add 69 to the list and slice the string again, and so on until no more 2-digit numbers are found.

The problem is that I always get this error:

        if string[i].isdigit() and string[i+1].isdigit():
                                   ~~~~~~^^^^^
IndexError: string index out of range

f = open("file.txt")
read = f.read().split()
f.close()
for string in read:
    l = list()
    i = 0
    print(string)
    while i<len(string):
        if string[i].isdigit() and string[i+1].isdigit():
            l.append(string[i] + string[i+1])
            string = string[i+2:]
            i = 0
        elif i==len(string)-1:
            break
        else:
            i += 1
print(l)

My program stops at string in line 31, which is the string: 'REDOHGMDPOXKFMHUDDOMLDYFAFYDLMODDUHMFKXOPDMGHODER5'

I have no idea how to do this slice iteration, and please, don't use regex.

CodePudding user response:

You're going off the end of the string... Change:

 while i<len(string):

to:

 while i<len(string)-1:

And you should be fine.

If you were just looking at one character at a time, you could use your original while. The trick here is that you're always looking at a char and also "one ahead" of the char. So you have to shorten your check by one iteration to prevent going past the last char to check.
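
As a minimal sketch of that change, here is the question's loop with the shortened bound, wrapped in a function so it can be tried on a single string. The function name `two_digit_numbers` is made up for this example; the logic is otherwise the asker's own slicing approach. Once the bound is `len(string) - 1`, the extra `elif i==len(string)-1: break` branch is no longer needed.

```python
def two_digit_numbers(string):
    """Collect pairs of adjacent digits, restarting the scan after each pair."""
    found = []
    i = 0
    # Stop one short of the end so that string[i + 1] always exists.
    while i < len(string) - 1:
        if string[i].isdigit() and string[i + 1].isdigit():
            found.append(string[i] + string[i + 1])
            # Drop everything up to and including the pair, then rescan.
            string = string[i + 2:]
            i = 0
        else:
            i += 1
    return found


print(two_digit_numbers('7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO'))
# The string from the question that triggered the IndexError:
print(two_digit_numbers('REDOHGMDPOXKFMHUDDOMLDYFAFYDLMODDUHMFKXOPDMGHODER5'))
```

The first call prints ['74', '69', '83', '84']; the second prints an empty list, because the trailing 5 has no digit next to it.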

CodePudding user response:

You could use recursion. Here is what it would look like to deal with one of the strings.

Part of the code:

my_string = '7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO'
result_list = []  # shared by all recursive calls

def read_string(s):
    for i, j in enumerate(s):
        # Compare the current character with the one before it
        if i > 0 and s[i-1].isdigit() and s[i].isdigit():
            result_list.append(s[i-1] + s[i])
            # Recurse on the rest of the string after the pair, then stop this scan
            read_string(s[i+1:])
            break

    return result_list

# Call the read_string function
x = read_string(my_string)
print(x)

OUTPUT:

['74', '69', '83', '84']
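
One thing to watch with the version above: `result_list` is a module-level list, so calling `read_string` on a second string keeps appending to the results of the first. A possible variant, sketched here under the hypothetical name `extract_pairs`, builds and returns a fresh list on every call instead:

```python
def extract_pairs(s):
    """Return the 2-digit pairs found in s without touching any global state."""
    for i in range(1, len(s)):
        if s[i - 1].isdigit() and s[i].isdigit():
            # Keep this pair, then recurse on whatever follows it.
            return [s[i - 1] + s[i]] + extract_pairs(s[i + 1:])
    # No adjacent digits left (also covers empty and one-character strings).
    return []


first = extract_pairs('7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO')
second = extract_pairs('REDOHGMDPOXKFMHUDDOMLDYFAFYDLMODDUHMFKXOPDMGHODER5')
print(first)
print(second)
```

Each pair found costs one level of recursion, so a very long string with many pairs could run into Python's recursion limit. For input like that, a plain loop is the safer choice.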

CodePudding user response:

Your loop condition i len(string). If the string is not empty, this equals a positive integer, which is evaluated as True. Hence, you created an endless loop that meets its end when i gets greater than the string length. Try this:

while i < len(string) -1:

EDITED:
Apparently, I didn't notice which string gave you the error. Because you check the (i+1)th element of the string, when we start checking the last character, reaching for the next one raises an IndexError. So, there should be -1 in the condition.
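
The same `len(string) - 1` bound also works if you would rather not re-slice the string at all. The sketch below (the name `scan_pairs` is invented for this example) moves the index forward instead: by 2 after a pair, by 1 otherwise. It visits the characters in the same order as the slicing approach, so it should produce the same list.

```python
def scan_pairs(s):
    """Walk s once, collecting non-overlapping pairs of adjacent digits."""
    pairs = []
    i = 0
    # i + 1 must be a valid index, so the last usable i is len(s) - 2.
    while i < len(s) - 1:
        if s[i].isdigit() and s[i + 1].isdigit():
            pairs.append(s[i:i + 2])
            i += 2  # skip past the pair instead of slicing it away
        else:
            i += 1
    return pairs


print(scan_pairs('7469NMPLWX8384RXXOORHKLYBTVVXKKSRWEITLOCWNHNOAQIXO'))
print(scan_pairs('ER5'))  # a lone trailing digit is simply ignored
```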
