How can I replace a string match with part of itself in Python?

I need to process text in Python and replace any occurrence of "[xz]" by "x", where "x" is the first letter enclosed in the brackets, and "z" can be a string of variable length. Note that I do not want the brackets in the output.

For example, "alEhos[cr@e]sjt" should become "alEhoscsjt"

I think re.sub() could be a way to go, but I am not sure how to implement it.

CodePudding user response：

This will work for the example given.

import re

example = "alEhos[cr@e]sjt"
result = re.sub(r'(.*)\[(.).*\](.*)', r'\1\2\3', example)
print(result)

The regular expression uses three capturing groups: \1 and \3 capture the text before and after the square brackets, and \2 captures the first character inside them. Because .* is greedy, this pattern only handles a string with a single bracketed group.

Output:

alEhoscsjt

If you have more than one occurrence of square brackets in your string, you can use the following:

example = "alEhos[cr@e]sjt[abc]xyz"
result = re.sub(r'\[(.).*?\]', r'\1', example)
print(result)

This version replaces every bracketed substring (brackets included) with the first character found inside the brackets. Note the non-greedy quantifier .*?, which stops at the nearest ] instead of consuming everything between the first [ and the last ].

Output:

alEhoscsjtaxyz
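
As a possible variation on the pattern above, a negated character class can stand in for the non-greedy quantifier. This is only a sketch: the helper name collapse_brackets is made up for illustration, and it assumes brackets are not nested.

```python
import re

# Hypothetical helper name, not part of the original answer.
def collapse_brackets(text):
    # \[        literal opening bracket
    # ([^\]])   capture the first character inside, which must not be "]"
    # [^\]]*    any further characters up to, but not including, "]"
    # \]        literal closing bracket
    return re.sub(r'\[([^\]])[^\]]*\]', r'\1', text)

print(collapse_brackets("alEhos[cr@e]sjt[abc]xyz"))  # alEhoscsjtaxyz
```

One difference worth knowing: with this pattern an empty pair "[]" is left untouched, because the capture group requires at least one character that is not "]".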

CodePudding user response：

Instead of using re.sub() directly, you can use re.findall() to find all substrings (matched non-greedily) that begin with [ and end with ].

Then, iterate through the matches and use str.replace() to replace each match in the string with the second character of the match (index 1), which is the first character inside the brackets:

import re

s = "alEhos[cr@e]sjt"

for m in re.findall(r"\[.*?\]", s):
    s = s.replace(m, m[1])

print(s)

Output:

alEhoscsjt
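
The same indexing idea can also be done in a single pass by handing re.sub() a function as the replacement. This is a sketch rather than part of the answer above; the function name first_inner_char and the two-group sample string are assumptions made for illustration.

```python
import re

s = "alEhos[cr@e]sjt[abc]xyz"

# Hypothetical helper: re.sub() calls it once per match.
def first_inner_char(match):
    # match.group(0) is the whole bracketed substring, e.g. "[cr@e]";
    # index 1 is the character right after the opening bracket.
    return match.group(0)[1]

result = re.sub(r"\[.*?\]", first_inner_char, s)
print(result)  # alEhoscsjtaxyz
```

This avoids scanning the string once with findall() and again for each replace() call.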

CodePudding user response：

If the string contains exactly one bracketed group, you could use the split() method:

str1 = "alEhos[cr@e]sjt"

lst1 = str1.split("[")
lst2 = lst1[1].split("]")

print(lst1[0] + lst2[0][0] + lst2[1])
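
The split() idea can be stretched to cover several bracketed groups. The following is a rough sketch only: the helper name collapse_with_split is hypothetical, and it assumes every [ has a matching ] and that brackets are not nested.

```python
# Hypothetical helper; assumes balanced, non-nested brackets.
def collapse_with_split(text):
    # Everything before the first "[" is kept unchanged.
    head, *rest = text.split("[")
    pieces = [head]
    for chunk in rest:
        # Each chunk looks like "cr@e]sjt": the bracket contents, "]", then trailing text.
        inside, _, after = chunk.partition("]")
        # Keep only the first character of the bracket contents.
        pieces.append(inside[:1] + after)
    return "".join(pieces)

print(collapse_with_split("alEhos[cr@e]sjt[abc]xyz"))  # alEhoscsjtaxyz
```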