Python: Regex remove substring starting from specific character till any alphabet

Time:02-03

I need a Python regex to remove the substring that starts at a specific character and runs until the first letter is found.

Example:

hello world\r\n Processing ....Pass
hello world\r\n Processing .Fail
hello world\r\n Processing ......Error
hello world\r\n Processing ..Fail
hello world\r\n Processing .......<Any string>

Result should be:

hello world\r\n <Any String>

Here the number of dots after Processing can vary, and I want to remove Processing together with the dots that follow it (n dots).

Basically, I want to remove everything between \r\n and the [A-Z] pattern, but not the pattern itself.

I tried this, but it also removes the pattern:

(?s)\\r\\n.*?\.[A-Z][^\w\s]

CodePudding user response:

You can search using this regex:

(?s)(?<=\\r\\n ).+?(?=[A-Z])

and replace with just an empty string.
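
As a minimal sketch of that replacement in Python, assuming the input contains the literal characters backslash-r backslash-n exactly as shown in the question (not an actual carriage return and line feed). The helper name strip_processing is hypothetical and only used for illustration:

```python
import re

# \\r\\n in the raw string matches a literal backslash followed by r,
# then a literal backslash followed by n (assumption: the input holds
# these as plain text, as in the question's examples).
PATTERN = re.compile(r"(?s)(?<=\\r\\n ).+?(?=[A-Z])")


def strip_processing(text: str) -> str:
    """Remove everything between '\\r\\n ' and the next uppercase letter."""
    # Hypothetical helper: replaces the matched span with an empty string.
    return PATTERN.sub("", text)


samples = [
    r"hello world\r\n Processing ....Pass",
    r"hello world\r\n Processing .Fail",
    r"hello world\r\n Processing ......Error",
]

for sample in samples:
    print(strip_processing(sample))
# hello world\r\n Pass
# hello world\r\n Fail
# hello world\r\n Error
```

If the text holds a real carriage return and line feed instead, the lookbehind would presumably need to be (?<=\r\n ) with single backslashes.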

RegEx Breakdown:

  • (?s): Enable DOTALL (single line) mode
  • (?<=\\r\\n ): Positive lookbehind to assert that we have literal text \r\n and a space before the current position
  • .+?: Match 1+ of any characters, as few as possible (lazy)
  • (?=[A-Z]): Positive lookahead to assert that we have an uppercase letter at the next position