Using Python re.sub, but it unexpectedly replaces at the start and end

Time:08-05

I have this string a = "a:b/c\\" and I want to replace each of :, / and \ with _ in a single call.

This is my code

b = re.sub(r'[:/\\]*', '_', a)

However, the result is '_a__b__c__', whereas I expected a_b_c_. The pattern also inserts underscores at the start, at the end and between the characters. How can I change this?

import re
a = "a:b/c\\"
b = re.sub(r'[:/\\]*', '_', a)
print(b) # _a__b__c__ on Python 3.7+
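
To see where the extra underscores come from, it can help to list what the pattern actually matches. This is a minimal sketch, assuming Python 3.7 or later (where re.sub also replaces empty matches that sit right after a previous match); the names spans and empty are only illustrative:

```python
import re

a = "a:b/c\\"          # six characters: a : b / c \
pattern = r'[:/\\]*'   # the class followed by *, so it may match nothing

# Collect (start, end) for every match; start == end marks an empty match.
spans = [m.span() for m in re.finditer(pattern, a)]
print(spans)

# Count how many of the matches are empty.
empty = [s for s in spans if s[0] == s[1]]
print(len(spans), len(empty))
```

Every match, empty or not, is replaced by one _, which would account for the underscores in front of a, b and c and at the very end.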

CodePudding user response:

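You're using a character class [], which matches any single character from within that class. The class itself is fine here, but two points need untangling in your scenario:

  1. \\ is not a two-character pattern. In the Python literal "a:b/c\\" the escape \\ produces a single backslash, and inside the raw-string regex \\ matches exactly one backslash, so [:/\\] already covers all three characters.
  2. You've quantified it with a *, which means "zero or more matches". Because the class is now effectively optional, the pattern also matches the empty string at every position where no separator is present, and re.sub replaces each of those empty matches with _ as well.

The solution here is to eliminate the misused * quantifier. The version below also switches from a character class to a group with alternatives, which is equivalent for single characters but not required: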

import re
a = "a:b/c\\"
b = re.sub(r'(:|/|\\)', '_', a)
print(b) # 'a_b_c_'
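
For comparison, here is a small sketch suggesting that the character class from the question gives the same result once the * is dropped. The names with_class and with_group are illustrative only:

```python
import re

a = "a:b/c\\"

# Character class without a quantifier: each separator is replaced once.
with_class = re.sub(r'[:/\\]', '_', a)

# Group with alternatives, as in the snippet above.
with_group = re.sub(r'(:|/|\\)', '_', a)

print(with_class)
print(with_group)
print(with_class == with_group)
```

Either form should work for single-character separators; the class is usually the shorter way to write it.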

Regex101 - this differs slightly because the tool takes the bare pattern rather than a Python r'' raw string literal; regardless, it illustrates fundamentally what's happening here.

CodePudding user response:

I changed re.sub(r'[:|/|\\]*', '_', a) to re.sub(r'[:|/|\\]+', '_', a) and the problem is solved; the + means a match must contain one or more of these characters.
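
Two details of the + variant are worth checking before relying on it: + collapses a run of adjacent separators into a single _, and a | inside a character class is a literal pipe rather than an alternation. A minimal sketch, using a made-up test string s that is not from the question:

```python
import re

s = "a:/b|c"   # hypothetical input: two adjacent separators and a pipe

# No quantifier: one underscore per separator character.
one_each = re.sub(r'[:/\\]', '_', s)

# With +: a run of separators becomes a single underscore.
collapsed = re.sub(r'[:/\\]+', '_', s)

# With | inside the class: the pipe character is matched too.
with_pipe = re.sub(r'[:|/|\\]+', '_', s)

print(one_each)
print(collapsed)
print(with_pipe)
```

If the pipe should be left alone, [:/\\]+ (or [:/\\] when runs should not be collapsed) is probably the safer pattern.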
