How to find index positions of a substring using Python

Time:09-13

Very new to Python here, and struggling. Any help is appreciated! Confession: this is obviously a request for help with homework, but my course ends tomorrow and the instructor takes too long to return a message, so I'm afraid if I wait I won't get this finished in time.

I'm using a learning module from Cornell University called introcs. It's documented here: http://cs1110.cs.cornell.edu/docs/index.html

I am trying to write a function that returns a tuple of all indexes of a substring within a string. I feel like I'm pretty close, but just not quite getting it. Here's my code:


import introcs 

def findall(text,sub):
    result = ()
    x = 0
    pos = introcs.find_str(text,sub,x)

    for i in range(len(text)):
        if introcs.find_str(text,sub,x) != -1:
            result = result + (introcs.find_str(text,sub,x), )
            x = x + 1 + introcs.find_str(text,sub,x)

    return result

On the call findall('how now brown cow', 'ow') I want it to return (1, 5, 10, 15), but it lops off the last result and returns (1, 5, 10).

Any pointers would be really appreciated!

CodePudding user response:

You can use re to do it:

import re

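# re.escape() makes the substring match literally, even if it contains regex metacharacters
found = [m.start() for m in re.finditer(re.escape(substring), string)]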
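
For reference, here is a minimal sketch of the regex approach wrapped in a function. The name findall_regex and the overlapping flag are made up for illustration and are not part of introcs or the question. It relies only on the standard re module: re.escape() so the substring is matched literally, and a zero-width lookahead when overlapping matches are wanted, since re.finditer() on a plain pattern reports non-overlapping matches only.

```python
import re


def findall_regex(text, sub, overlapping=False):
    """Return a tuple of the start indexes of sub in text (illustrative helper)."""
    # re.escape keeps characters such as '.' or '*' in sub from being
    # interpreted as regex syntax.
    pattern = re.escape(sub)
    if overlapping:
        # A zero-width lookahead matches without consuming characters,
        # so occurrences that overlap are reported as well.
        pattern = "(?=" + pattern + ")"
    return tuple(m.start() for m in re.finditer(pattern, text))


print(findall_regex('how now brown cow', 'ow'))   # (1, 5, 10, 15)
print(findall_regex('abababa', 'aba'))            # (0, 4)
print(findall_regex('abababa', 'aba', True))      # (0, 2, 4)
```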

CodePudding user response:

You don't need to loop over all the characters in text. Just keep calling introcs.find_str() until it can't find the substring and returns -1.

Your calculation of the new value of x is wrong. It should just be 1 more than the index of the previous match.
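
To see why the last match goes missing, it can help to print the start offsets each search uses. The sketch below is a simplified model of the loop, not the original code: find_str is a stand-in that assumes introcs.find_str behaves like str.find, and trace_starts is a hypothetical helper written only for this trace.

```python
def find_str(text, sub, start):
    # Stand-in for introcs.find_str, ASSUMED to behave like str.find.
    return text.find(sub, start)


def trace_starts(text, sub, buggy):
    """Return the start offsets used by successive searches (illustrative)."""
    starts = []
    x = 0
    while True:
        starts.append(x)
        pos = find_str(text, sub, x)
        if pos == -1:
            break
        # The buggy update adds the old offset on top of the match index;
        # the corrected update resumes one position past the match.
        x = x + 1 + pos if buggy else pos + 1
    return starts


print(trace_starts('how now brown cow', 'ow', buggy=True))   # [0, 2, 8, 19]
print(trace_starts('how now brown cow', 'ow', buggy=False))  # [0, 2, 6, 11, 16]
```

With the buggy update, the fourth search starts at offset 19, which is already past the end of the 17-character string, so the match at index 15 is never seen.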

Make result a list rather than a tuple so you can use append() to add to it. If you really need to return a tuple you can use return tuple(result) at the end to convert it.

def findall(text,sub):
    result = []
    x = 0
    while True:
        pos = introcs.find_str(text,sub,x)
        if pos == -1:
            break
        result.append(pos)
        x = pos + 1

    return result
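
If the assignment does require a tuple, a self-contained variant could look like the sketch below. Since introcs may not be installed, find_str here is a stand-in that assumes introcs.find_str behaves like str.find; swap in the real call if you have the module.

```python
def find_str(text, sub, start):
    # Stand-in for introcs.find_str, ASSUMED to behave like str.find.
    return text.find(sub, start)


def findall(text, sub):
    """Return a tuple of every index at which sub occurs in text."""
    result = []
    x = 0
    while True:
        pos = find_str(text, sub, x)
        if pos == -1:
            break
        result.append(pos)
        x = pos + 1  # resume one position past the previous match
    # Build with a list, then convert once at the end.
    return tuple(result)


print(findall('how now brown cow', 'ow'))   # (1, 5, 10, 15)
print(findall('how now brown cow', 'xyz'))  # ()
```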

CodePudding user response:

Your code shows evidence of three separate attempts at keeping track of where you are in the string:

  1. you loop over it with i
  2. you put the position a sub was found at in pos
  3. you compute an x

The question here is what do you want to happen in this case:

findall('abababa', 'aba')

Do you expect [0, 4] or [0, 2, 4] as a result? Assuming find_str works just like the standard str.find() and you want the [0, 2, 4] result, start the first search at the beginning of the string and start each later search one position after the previous match. Also, instead of adding tuples together, why not build a list:

# this replaces your import, since we don't have access to it
class introcs:
    @staticmethod
    def find_str(text, sub, x):
        # assuming find_str is the same as str.find()
        return text.find(sub, x)


def findall(text,sub):
    result = []
    pos = -1

    while True:
        pos = introcs.find_str(text, sub, pos + 1)
        if pos == -1:
            break
        result.append(pos)

    return result


print(findall('abababa', 'aba'))

Output:

[0, 2, 4]

If you only want to match each character once, this works instead:

def findall(text,sub):
    result = []
    pos = -len(sub)

    while True:
        pos = introcs.find_str(text, sub, pos + len(sub))
        if pos == -1:
            break
        result.append(pos)

    return result

Output:

[0, 4]
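
The two variants differ only in how far the search advances after a match, so they can be folded into one function. This is a sketch rather than anything from introcs: it calls str.find directly, and the overlapping parameter is a name invented here. It also rejects an empty sub, because (assuming str.find semantics) advancing by len(sub) == 0 would never move past the first match and the loop would not terminate.

```python
def findall(text, sub, overlapping=True):
    """Return a list of the indexes at which sub occurs in text."""
    if not sub:
        # An empty substring matches everywhere; refuse it rather than
        # loop forever in the non-overlapping case.
        raise ValueError("sub must be a non-empty string")
    # Advance by one character to allow overlaps, or by the whole
    # match length to use each character at most once.
    step = 1 if overlapping else len(sub)
    result = []
    pos = text.find(sub)
    while pos != -1:
        result.append(pos)
        pos = text.find(sub, pos + step)
    return result


print(findall('abababa', 'aba'))                     # [0, 2, 4]
print(findall('abababa', 'aba', overlapping=False))  # [0, 4]
```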