Python multiple substring index in string

Time:01-17

Given the following list of sub-strings:

sub = ['ABC', 'VC', 'KI']

Is there a way to get the index of each of these sub-strings in the following string, if it exists?

s = 'ABDDDABCTYYYYVCIIII'


So far I have tried:

import re

for i in re.finditer('VC', s):
  # start and end are methods, so they must be called
  print(i.start(), i.end())
  

However, re.finditer accepts only one pattern at a time, not a list of them.

Thanks.

CodePudding user response:

You can join those patterns together using |:

import re
sub = ['ABC', 'VC', 'KI']
s = 'ABDDDABCTYYYYVCIIII'

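# Escape each substring so regex metacharacters are matched literally;
# the loop variable is named p so it does not shadow the string s
r = '|'.join(re.escape(p) for p in sub)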
for i in re.finditer(r, s):
    print(i.start(), i.end())
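
As a hedged extension of this answer: m.group() reports which substring matched, and because alternation tries its alternatives from left to right, putting longer substrings first keeps a short prefix such as 'AB' from hiding a longer 'ABC' at the same position. The helper name find_spans is an assumption made for illustration, not part of the original answer.

```python
import re

def find_spans(s, subs):
    """Return (substring, start, end) for every non-overlapping match in s."""
    # Longest first, so 'ABC' is preferred over its prefix 'AB' at the same position.
    ordered = sorted(subs, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(x) for x in ordered))
    return [(m.group(), m.start(), m.end()) for m in pattern.finditer(s)]

print(find_spans('ABDDDABCTYYYYVCIIII', ['ABC', 'VC', 'KI']))
# [('ABC', 5, 8), ('VC', 13, 15)]
```

Matches found this way never overlap, so a substring that starts inside an earlier match is not reported.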

CodePudding user response:

You could map the string's find method over the list of substrings.

s = 'ABDDDABCTYYYYVCIIII'
sub = ['ABC', 'VC', 'KI']

print(*map(s.find, sub))
# Output 5 13 -1

CodePudding user response:

How about using a list comprehension with str.find?

s = 'ABDDDABCTYYYYVCIIII'
sub = ['ABC', 'VC', 'KI']
results = [s.find(pattern) for pattern in sub]

print(*results) # 5 13 -1
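
Note that str.find reports only the first occurrence and returns -1 for a substring that is absent, so the positional output loses track of which index belongs to which substring. A minimal sketch that keeps only the substrings that were found (the name first_index is illustrative):

```python
s = 'ABDDDABCTYYYYVCIIII'
sub = ['ABC', 'VC', 'KI']

# Map each substring that occurs in s to the index of its first occurrence.
first_index = {p: s.find(p) for p in sub if p in s}
print(first_index)  # {'ABC': 5, 'VC': 13}
```

This scans the string twice per substring (once for the membership test, once for find), which is fine for short inputs.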

CodePudding user response:

Another approach with re. If a substring can occur more than once, this may be better, because a list of start indices is saved for each substring. A substring that is never found will not appear in the dict at all.

import re
s = 'ABDDDABCTYYYYVCIIII'
sub = ['ABC', 'VC', 'KI']

# precompile regex pattern
# escape each substring so regex metacharacters are matched literally
subpat = '|'.join(map(re.escape, sub))
pat = re.compile(rf'({subpat})')

matches = dict()
for m in pat.finditer(s):
    # append starting index of found substring to value of matched substring
    matches.setdefault(m.group(0),[]).append(m.start()) 

print(f"{matches=}")
print(f"{'KI' in matches=}")
print(f"{matches['ABC']=}")

Outputs:

matches={'ABC': [5], 'VC': [13]}
'KI' in matches=False
matches['ABC']=[5]
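
One caveat: finditer yields non-overlapping matches, so in 'AAAA' the substring 'AA' would be found only at 0 and 2. A possible variant, sketched here under the assumption that overlapping hits are wanted, wraps the alternation in a zero-width lookahead. The function name overlapping_indices is hypothetical.

```python
import re
from collections import defaultdict

def overlapping_indices(s, subs):
    """Map each found substring to all of its start indices, overlaps included."""
    # The lookahead consumes no characters, so every position in s is tried.
    alternation = '|'.join(re.escape(x) for x in subs)
    pat = re.compile(f'(?=({alternation}))')
    found = defaultdict(list)
    for m in pat.finditer(s):
        # group(1) is the substring captured inside the lookahead
        found[m.group(1)].append(m.start())
    return dict(found)

print(overlapping_indices('AAAA', ['AA']))  # {'AA': [0, 1, 2]}
```

At any one position only the first alternative that matches is recorded, so two substrings that start at the same index are not both reported.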

CodePudding user response:

A substring may occur more than once in the main string (although it doesn't in the sample data). One could use a generator based around the string's built-in find() method like this:

Note: the source string has been modified to demonstrate repetition.

sub = ['ABC', 'VC', 'KI']
s = 'ABCDDABCTYYYYVCIIII'

def find(s, sub):
    for _sub in sub:
        offset = 0
        while (idx := s[offset:].find(_sub)) >= 0:
            yield _sub, idx + offset
            offset += idx + 1

for ss, start in find(s, sub):
    print(ss, start)

Output:

ABC 0
ABC 5
VC 13
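
A possible variation on the same idea: str.find accepts an optional start position, so the search can resume without slicing (each slice copies the rest of the string). The generator name iter_occurrences is an assumption for this sketch.

```python
def iter_occurrences(s, subs):
    """Yield (substring, index) for every occurrence of each substring in s."""
    for needle in subs:
        start = s.find(needle)
        while start != -1:
            yield needle, start
            # Resume one character past the last hit, so overlapping hits are found too.
            start = s.find(needle, start + 1)

for ss, start in iter_occurrences('ABCDDABCTYYYYVCIIII', ['ABC', 'VC', 'KI']):
    print(ss, start)
```

Indices returned by find with a start argument are already relative to the whole string, so no offset arithmetic is needed.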

CodePudding user response:

Just use the string index method. The `in` check is needed because index raises ValueError when the substring is absent:

list_ = ['ABC', 'VC', 'KI']

s = 'ABDDDABCTYYYYVCIIII'


for i in list_:
    if i in s:
        print(s.index(i))