I am trying to remove the whitespace that appears after the ":" character in this header:
batman: 100
robin: OFXSGML
superman: 102
wonderwoman: NONE
joker: USASCII
harley: 1252
aquaman: NONE
flash: NONE
iris: NONE
This is the regex pattern I wrote to match this exact header, but I keep running into problems when trying to delete the whitespace. Any help is appreciated.
^batman:\s100 robin:\sOFXSGML superman:\s102 wonderwoman:\s NONE joker:\sUSASCII harley: 1252 aquaman:\s NONE flash:\sNONE iris:\sNONE$
CodePudding user response:
Your pattern uses literal spaces between the entries. If you want it to match all of the lines, replace each of those spaces with \s wherever the header crosses a newline.
You can then post-process the match, replacing :\s with :
Note, though, that this pattern is a very precise match: it accepts only this exact header.
If you want to be more flexible, you can use a capture group to capture everything up to and including the : and then match the spaces after it.
^([^\s:]+:)[\p{Zs}\t]+(?=\S)
The pattern matches:
^ - Start of string (or start of each line when RegexOptions.Multiline is set)
([^\s:]+:) - Capture group 1: match 1 or more non-whitespace chars other than : and then match the :
[\p{Zs}\t]+ - Match 1 or more spaces or tabs
(?=\S) - Positive lookahead: assert a non-whitespace char to the right (if there has to be one, else you can omit this part)
In the replacement, use group 1: $1
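As a minimal sketch of how the pattern above could be applied to the multi-line header: the class and method names here (HeaderCleaner, StripSpacesAfterKey) are made up for illustration, and RegexOptions.Multiline is assumed so that ^ matches at the start of every line rather than only at the start of the string.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class HeaderCleaner
{
    // Multiline: ^ matches at the start of each line.
    // Group 1 captures "key:", the character class consumes the spaces/tabs after it.
    private static readonly Regex KeyThenSpaces =
        new Regex(@"^([^\s:]+:)[\p{Zs}\t]+(?=\S)", RegexOptions.Multiline);

    public static string StripSpacesAfterKey(string header)
    {
        // "$1" puts back the captured "key:" and drops the matched spaces.
        return KeyThenSpaces.Replace(header, "$1");
    }

    public static void Main()
    {
        string header = "batman: 100\nrobin: OFXSGML\nharley:   1252";
        Console.WriteLine(StripSpacesAfterKey(header));
    }
}
```

Because the character class only contains horizontal whitespace, the match never runs across a line break, so a key with an empty value is not joined to the next line.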
CodePudding user response:
var yourString = @"batman: 100 robin: OFXSGML superman: 102 wonderwoman: NONE joker: USASCII harley: 1252 aquaman: NONE flash: NONE iris: NONE";
yourString = Regex.Replace(yourString, "(?<=:) ", "");
CodePudding user response:
Shouldn't be any more complex than:
string source = @"
batman: 100
robin: OFXSGML
superman: 102
wonderwoman: NONE
joker: USASCII
harley: 1252
aquaman: NONE
flash: NONE
iris: NONE
".Trim();
Regex rx = new Regex(@"(?<=:)\s+");
string result = rx.Replace(source, "");
(?<=:) is a zero-width positive lookbehind: it anchors the match on a :, without it being a part of the match.
\s+ matches 1 or more whitespace characters (SP, HT, CR, LF, VT).
That changes:
batman: 100
robin: OFXSGML
superman: 102
wonderwoman: NONE
joker: USASCII
harley: 1252
aquaman: NONE
flash: NONE
iris: NONE
into
batman:100
robin:OFXSGML
superman:102
wonderwoman:NONE
joker:USASCII
harley:1252
aquaman:NONE
flash:NONE
iris:NONE
Alternatively, you can include the : in the match. It just changes the replacement text:
Regex rx = new Regex(@":\s+");
string result = rx.Replace(source, ":");
If you care about the value of the key preceding the colon-plus-whitespace, use named capture groups and a match evaluator.
Here the regular expression (?<key>\w+)\s*:\s* matches:
(?<key>\w+) - a sequence of 1 or more word characters (letters, digits, or _), followed by
\s* - zero or more whitespace characters, followed by
: - a literal colon character, followed by
\s* - zero or more whitespace characters
The match evaluator looks at the capturing group named key. If it is any of batman, robin, or superman, any whitespace preceding or following the colon is removed; otherwise, the match itself is returned unchanged.
Regex rx = new Regex(@"(?<key>\w+)\s*:\s*");
string result = rx.Replace(source, (Match m) => {
    string replacement;
    string key = m.Groups["key"].Value;
    switch (key) {
        case "batman":
        case "robin":
        case "superman":
            // Rebuild the match as key + colon, dropping the surrounding whitespace.
            replacement = key + ":";
            break;
        default:
            // Any other key: leave the matched text exactly as it was.
            replacement = m.Value;
            break;
    }
    return replacement;
});
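If the list of keys is likely to change, the same evaluator idea can be sketched with the keys passed in as a set instead of hard-coded switch cases. The names SelectiveCleaner and Tighten are hypothetical, chosen only for this example; the regular expression is the one described above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class SelectiveCleaner
{
    private static readonly Regex KeyColon = new Regex(@"(?<key>\w+)\s*:\s*");

    public static string Tighten(string source, ISet<string> keys)
    {
        return KeyColon.Replace(source, m =>
        {
            string key = m.Groups["key"].Value;
            // Listed keys lose the whitespace around the colon;
            // every other match is returned unchanged.
            return keys.Contains(key) ? key + ":" : m.Value;
        });
    }

    public static void Main()
    {
        var keys = new HashSet<string> { "batman", "robin", "superman" };
        string source = "batman: 100\nrobin: OFXSGML\nwonderwoman: NONE";
        Console.WriteLine(Tighten(source, keys));
    }
}
```

With this input, the batman and robin lines lose the space after the colon, while the wonderwoman line keeps it.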