How to get the last letter in a string that is not a specific character using regex?

This related question doesn't quite have the answer I'm looking for: "How to search for the last occurrence of a regular expression in a string in python?"

Essentially, say for input "iuasgjjgg", I want to find the rightmost character that is not "j".

solution("abcdefghijh") => 10 (because h is at index 10)

solution("hhhjjjhhh") => 8 (because last h is at index 8)

The fact that it is "h" is not important; what matters is that it's not "j".

I think it has something to do with re.search and negative lookaheads but I'm having trouble putting everything together.

edit: also I know that this is probably possible using a naive approach but I'm looking for a regex solution.

CodePudding user response：

One approach would be to rstrip the letter j, and then take the final remaining character:

inp = ["abcdefghijh", "hhhjjjhhh"]
last = [x.rstrip('j')[-1] for x in inp]
print(last)  # ['h', 'h']

If instead you want the index of that final character, take the length of the stripped string minus one:

inp = ["abcdefghijh", "hhhjjjhhh"]
last = [len(x.rstrip('j')) - 1 for x in inp]
print(last)  # [10, 8]
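
Note that the two snippets behave differently on an empty string or a string made up only of 'j': indexing with [-1] raises an IndexError, while the length-based version evaluates to -1. A minimal sketch wrapping the index version in a function (the name last_non_j_index is made up for illustration):

```python
def last_non_j_index(s):
    # rstrip('j') drops every trailing 'j'; whatever remains ends at the
    # rightmost non-'j' character, so its length minus one is that index.
    # For "" or an all-'j' string nothing remains and the result is -1.
    return len(s.rstrip('j')) - 1


print(last_non_j_index("abcdefghijh"))  # 10
print(last_non_j_index("hhhjjjhhh"))    # 8
print(last_non_j_index("jjj"))          # -1
```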

CodePudding user response：

Count the trailing characters that match your blacklist, then subtract that count (and one more, to get a zero-based index) from the length:

from itertools import takewhile

def solution(s):
    return len(s) - len(tuple(takewhile(lambda x: x == 'j', reversed(s)))) - 1


solution("abcdefghijh")
# 10

solution("abcdefghijhjjjjjj")
# 10
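
Since this answer talks about a blacklist, the same idea might be extended to several excluded characters. This is only a sketch: the function name last_index_not_in and the blacklist parameter are assumptions, not part of the answer above.

```python
from itertools import takewhile


def last_index_not_in(s, blacklist):
    # Hypothetical generalisation: `blacklist` is any iterable of single
    # characters that should be skipped at the end of the string.
    excluded = set(blacklist)
    # Count how many trailing characters belong to the blacklist.
    trailing = sum(1 for _ in takewhile(lambda ch: ch in excluded, reversed(s)))
    # Returns -1 when the string is empty or fully blacklisted.
    return len(s) - trailing - 1


print(last_index_not_in("abcdefghijh", "j"))  # 10
print(last_index_not_in("abcjkjk", "jk"))     # 2
```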

CodePudding user response：

A possible approach with regex:

import re

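def solution(input_text):
    # Greedy ".*" runs to the end of the string, then backtracks until
    # "[^j]" can match, so the match ends at the rightmost non-'j' character.
    # re.DOTALL lets "." cross newlines in multi-line input.
    my_matches = re.search(r".*[^j]", input_text, re.DOTALL)
    return len(my_matches[0]) - 1

print(solution("abcdefghijh"))  # 10
print(solution("hhhjjjhhh"))  # 8

I'm not sure what should happen for an empty string or a string consisting only of 'j's, though, since there is no valid index. The index-based versions in the other answers return -1 in that case, while this one raises a TypeError because re.search returns None.

EDIT: It's easy to return -1 when no match is found, for example:

def solution(input_text):
    my_matches = re.search(r".*[^j]", input_text, re.DOTALL)
    if not my_matches:
        return -1
    return len(my_matches[0]) - 1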
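
The question mentions negative lookaheads, so here is a minimal sketch of that idea as an alternative: match a non-'j' character that is not followed by any other non-'j' character, and read the index directly from the match object. The constant name LAST_NON_J is made up for illustration, and the lookahead rescans the rest of the string at each candidate, so this may be slower than the greedy pattern on long inputs.

```python
import re

# A non-'j' character with no other non-'j' character anywhere after it.
# re.DOTALL lets "." cross newlines inside the lookahead.
LAST_NON_J = re.compile(r"[^j](?!.*[^j])", re.DOTALL)


def solution(input_text):
    match = LAST_NON_J.search(input_text)
    # match.start() is the index of the matched character; -1 if none exists.
    return match.start() if match else -1


print(solution("abcdefghijh"))  # 10
print(solution("hhhjjjhhh"))    # 8
print(solution("jjj"))          # -1
```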