Vowel groups from a string in Python

Time:06-17

I'm trying to extract all the vowel groups from a string and get the index of each vowel group. For example, in the word 'britain' the vowel groups are 'i' and 'ai', and their indexes in the string are 2 and 4. I would like to create two lists that keep track of the vowel groups and their indexes in the string. Maybe there is a way to do this with regex or itertools.groupby?

This is my code so far:

first='phoebe'
vowels=['a','e','i','o','u']
char=""
lst=[]
for i in range(len(first)-1):
    if first[i] in vowels:
        char += first[i]
    if first[i] not in vowels:
        lst.append(char)
        char=""

CodePudding user response:

You can do this with a regex:

import re

s = 'fountain of youth'

indices = []
strings = []

for m in re.finditer(r'[aeiou]+', s):
    indices.append(m.start())
    strings.append(m.group())
    
indices, strings
# ([1, 5, 9, 13], ['ou', 'ai', 'o', 'ou'])

It wouldn't be hard to do this as a zipped iterator, but you need to be careful if the string may contain no vowels, because there would then be nothing to unpack into the two lists.
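
A zipped version might look something like the sketch below. The function name vowel_groups is hypothetical (it is not part of the answer above), and the early return is one possible way to handle a string without vowels, where zip(*pairs) would yield nothing to unpack.

```python
import re

def vowel_groups(s):
    """Return (indices, strings) for each run of consecutive vowels in s."""
    # Each pair is (start index, matched vowel run).
    pairs = [(m.start(), m.group()) for m in re.finditer(r'[aeiou]+', s)]
    if not pairs:
        # zip(*[]) produces no items, so unpacking into two names would fail.
        return [], []
    indices, strings = zip(*pairs)
    return list(indices), list(strings)

print(vowel_groups('fountain of youth'))
# ([1, 5, 9, 13], ['ou', 'ai', 'o', 'ou'])
print(vowel_groups('rhythm'))
# ([], [])
```

Without the guard, indices, strings = zip(*[]) raises a ValueError because there are no values to unpack.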

CodePudding user response:

first = 'phoebe'
vowels = ['a','e','i','o','u']
vowelGroup = ""
vowelGroups = []
indices = []
index = -1
for i in range(len(first)): #Don't do -1 here otherwise you would miss last 'e' from 'phoebe'
    if first[i] in vowels:
        vowelGroup += first[i]
        if index == -1:
            index = i
    elif index != -1:
        vowelGroups.append(vowelGroup)
        indices.append(index)
        vowelGroup = ""
        index = -1
if index != -1:
    vowelGroups.append(vowelGroup)
    indices.append(index)
print(vowelGroups, indices)

CodePudding user response:

You could do this with itertools.groupby, grouping on whether the letter is a vowel, and then extracting the indexes and strings from the groupby object:

import itertools

first='phoebe'
vowels=['a','e','i','o','u']
vw = itertools.groupby(enumerate(first), key=lambda t:t[1] in vowels)
vgrps = [list(g) for k, g in vw if k]
indices = [g[0][0] for g in vgrps]
print(indices)
strings = [''.join(t[1] for t in g) for g in vgrps]
print(strings)

Output:

[2, 5]
['oe', 'e']
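
For reuse, the same groupby idea could be wrapped in a function. This is only a sketch: the name vowel_groups_groupby and the VOWELS frozenset are illustrative assumptions, not part of the answer above, and like the original it only treats lowercase a, e, i, o, u as vowels.

```python
from itertools import groupby

VOWELS = frozenset('aeiou')  # assumption: lowercase ASCII vowels only

def vowel_groups_groupby(word):
    """Return (indices, strings) for each run of consecutive vowels in word."""
    indices, strings = [], []
    # Group (index, char) pairs by whether the character is a vowel.
    for is_vowel, run in groupby(enumerate(word), key=lambda t: t[1] in VOWELS):
        if is_vowel:
            run = list(run)
            indices.append(run[0][0])  # index of the first vowel in the run
            strings.append(''.join(c for _, c in run))
    return indices, strings

print(vowel_groups_groupby('phoebe'))   # ([2, 5], ['oe', 'e'])
print(vowel_groups_groupby('britain'))  # ([2, 4], ['i', 'ai'])
```

Because the loop simply never appends when there are no vowel runs, a string without vowels returns two empty lists without any special handling.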