Can we replace a specific part of a string literal using any predefined function and regex

Time:06-10

I want to replace "&" with a random word "$d" in a given sentence. Can we replace only those words which start with & and are followed by a single character and a space?

Example:-

Input:-

Two literals are &a and &b and also check &abc and &bac here.

Output:-

Two literals are $da and $db and also check &abc and &bac here.

In the input above, the only words that should change are &a and &b, and only the '&' in each should be replaced, not the whole word, because these two words start with & and are followed by a single character and a space.

When I use the replaceAll() function with the regex below, it replaces the entire word:-

String str="Two literals are &a and &b and also check &abc and &bac here.";

str = str.replaceAll("\\&[a-zA-Z]{1}\\s", "\\$d");

System.out.println(str);

//output for this:-Two literals are $d and $d and also check &abc and &bac here.

//expected output:-Two literals are $da and $db and also check &abc and &bac here.

CodePudding user response:

The correct code for this would be

str.replaceAll("&([a-zA-Z]\\s)", "\\$d$1")

This is an example of backreferencing captured groups in regex, and here is a nice reference for it. Additionally, here's a relevant StackOverflow question about it.

Essentially, the part inside the parentheses, ([a-zA-Z]\\s), matches a single letter followed by a whitespace character. Because it is capturing group 1, its value can be referenced as $1 in the replacement.

So we replace &(a ) with $d(a ) (brackets here to demonstrate what is captured). Credit to u/rzwitserloot for reminding me that OP wants $ not &.
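For completeness, here is a minimal runnable sketch of the capture-group approach applied to the input from the question. The class and method names (CaptureGroupDemo, replaceSingleLetterRefs) are made up for illustration; only String.replaceAll comes from the answer above.

```java
final class CaptureGroupDemo {

    // Replaces "&" with "$d" only when it is followed by exactly one letter
    // and then a whitespace character. (Hypothetical helper name.)
    static String replaceSingleLetterRefs(String input) {
        // Group 1 captures the letter plus the whitespace so that "$1" can put
        // them back. "\\$" in the replacement stands for a literal dollar sign.
        return input.replaceAll("&([a-zA-Z]\\s)", "\\$d$1");
    }

    public static void main(String[] args) {
        String str = "Two literals are &a and &b and also check &abc and &bac here.";
        System.out.println(replaceSingleLetterRefs(str));
        // prints: Two literals are $da and $db and also check &abc and &bac here.
    }
}
```

Note that this pattern requires whitespace after the letter, so something like "&a" at the very end of the input, or "&a?", would be left alone.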

CodePudding user response:

You presumably want a concept called look-ahead: you can check that something is there without 'consuming' it. You can even check that something is NOT there. That's what you want here: match the &, but only if it is followed by a letter and, looking ahead past that letter, we do NOT see another letter:

for (String test : List.of("Two literals are &a and &bcd", "A literal is &a", "How about &a?")) {
  System.out.println(test.replaceAll("&(?=[a-zA-Z](?![a-zA-Z]))", "\\$d"));
}

Perhaps instead you want the single letter to end on any word break (i.e. &z00 should NOT turn into $dz00, even though there is no letter after the z). Then I suggest:

"&(?=[a-zA-Z]\\b)"

That's a lot simpler to read!
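To see where the two patterns differ, here is a small sketch that runs both over a few inputs. The class and method names are invented for this example, and the "&z00" test string is added to illustrate the word-break case described above.

```java
import java.util.List;

final class LookaheadDemo {

    // "&" followed by one letter that is NOT followed by another letter.
    static String withNegativeLookahead(String input) {
        return input.replaceAll("&(?=[a-zA-Z](?![a-zA-Z]))", "\\$d");
    }

    // "&" followed by one letter that sits right before a word boundary.
    static String withWordBoundary(String input) {
        return input.replaceAll("&(?=[a-zA-Z]\\b)", "\\$d");
    }

    public static void main(String[] args) {
        for (String test : List.of("Two literals are &a and &bcd", "How about &a?", "Check &z00 too")) {
            System.out.println(withNegativeLookahead(test) + " | " + withWordBoundary(test));
        }
        // The two results differ only for the last input:
        // "Check $dz00 too" (negative lookahead) vs "Check &z00 too" (word boundary).
    }
}
```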

A few notes:

  • (?=x) is 'positive lookahead'. It doesn't itself match anything but makes the match fail if x is not immediately following the match.
  • (?!x) is 'negative lookahead'. It doesn't itself match anything but makes the match fail if x is immediately following the match.
  • $ has special meaning in the replacement part so we need to escape it.
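  • \\b is regexpese for 'word boundary': it doesn't match any characters, but fails unless the current position sits between a word character (letter, digit or underscore) and a non-word character or the start/end of input. So a letter followed by a space, a dot, a dash, an ampersand, or end-of-input is on a word boundary.
  • We use lookahead instead of matching the letters directly because anything that is matched gets replaced.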
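On the note about $ in the replacement: as an alternative to escaping by hand, java.util.regex.Matcher.quoteReplacement can escape the replacement text for you. The sketch below assumes the replacement word arrives as a plain string; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;

final class DollarEscapeDemo {

    // Replaces "&" before a single letter with the given text, taken literally.
    static String replaceAmpersand(String input, String literalReplacement) {
        // quoteReplacement escapes '$' and '\' so they lose their special
        // meaning in the replacement string and are inserted as-is.
        return input.replaceAll("&(?=[a-zA-Z]\\b)", Matcher.quoteReplacement(literalReplacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceAmpersand("How about &a?", "$d"));
        // prints: How about $da?
    }
}
```

This is mostly useful when the replacement comes from a variable or user input, where a stray $ would otherwise be read as a group reference.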