How to get the index of a repeating element in a list?

I want to make a Japanese transliteration program. I won't explain the details, but some characters have a different value when paired than when they appear separately, so I wrote a loop that looks at two characters at a time (the current one and the next one):

b = "きゃきゃ"
b = list(b)
name = ""
for i in b:
    if b.index(i) + 1 <= len(b) - 1:
        if i in "き / キ" and b[b.index(i) + 1] in "ゃ ャ":
            if b[b.index(i) + 1] != " ":
                del b[b.index(i) + 1]
                del b[int(b.index(i))]
                cur = "kya"
                name += cur
print(name)

But b.index(i) always returns index 0 for "き", so I can't check it more than once. How can I change that?

I tried deleting each element after analyzing it, but that didn't help.

CodePudding user response：

If you are looking for the indices of 'き':

b = "きゃきゃ"
b = list(b)
indices = [i for i, x in enumerate(b) if x == "き"]
print(indices)  # prints [0, 2]
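
The same idea can be applied to the loop in the question: list.index always returns the position of the first match, so it may be simpler to track the position yourself and step over both characters when a pair matches. A minimal sketch, assuming hypothetical PAIRS and SINGLES tables that only cover the characters from the question (the romanize name is also made up for illustration):

```python
# Hypothetical lookup tables; only the characters from the question are covered.
PAIRS = {"きゃ": "kya", "キャ": "kya"}
SINGLES = {"き": "ki", "キ": "ki", "ゃ": "ya", "ャ": "ya"}


def romanize(text: str) -> str:
    parts = []
    i = 0  # current position, tracked explicitly instead of calling list.index
    while i < len(text):
        pair = text[i:i + 2]  # current character plus the next one, if any
        if pair in PAIRS:
            parts.append(PAIRS[pair])
            i += 2  # skip both characters of the matched pair
        else:
            # Fall back to the single-character table, or keep the character as-is.
            parts.append(SINGLES.get(text[i], text[i]))
            i += 1
    return "".join(parts)


print(romanize("きゃきゃ"))  # prints kyakya
```

Because the position is an integer rather than a lookup by value, repeated characters are handled the same way as unique ones, and nothing has to be deleted from the list while iterating over it.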

CodePudding user response：

Rather than looking ahead a character, it may be easier to store a reference to the previous character and replace the previous transliteration when you find a combo match.

Example (I'm not sure if I got all of the transliterations correct):

COMBOS = {('き', 'ゃ'): 'kya', ('き', 'ャ'): 'kya', ('キ', 'ゃ'): 'kya', ('キ', 'ャ'): 'kya'}
TRANSLITERATIONS = {'き': 'ki', 'キ': 'ki', 'ャ': 'ya', 'ゃ': 'ya'}


def transliterate(text: str) -> str:
    transliterated = []
    last = None
    for c in text:
        try:
            combo = COMBOS[(last, c)]
        except KeyError:
            transliterated.append(TRANSLITERATIONS.get(c, c))
        else:
            transliterated.pop()  # remove the last value that was added
            transliterated.append(combo)

        last = c

    return ''.join(transliterated)  # combine the transliterations into a single str

That being said, rather than re-inventing the wheel, it may make more sense to use an existing library that already handles transliterating Japanese to romaji, such as Pykakasi.

Example:

>>> import pykakasi

>>> kks = pykakasi.kakasi()

>>> kks.convert('きゃ')
[{'orig': 'きゃ', 'hira': 'きゃ', 'kana': 'キャ', 'hepburn': 'kya', 'kunrei': 'kya', 'passport': 'kya'}]