With Rust Regular Expressions, how can I use named capture groups preceding a string?

I'm using the sd tool, which uses Rust regular expressions, and I'm trying to use it with a named capture group. However, if the named capture group immediately precedes a string in the replacement, that string is treated as part of the capture group's name, causing unexpected behaviour.

Here is a contrived example to illustrate the issue:

echo 'abc' | sd -p '(?P<cg>b)' '$cgB'
# outputs: ac
# desired output: abBc

echo 'abc' | sd -p '(?P<cg>b)' '$cg B'
# outputs as expected: ab Bc
# however, this inserts an unwanted space

I've tried $<cg>B, $cg(B), and $cg\0B; none of them give abBc.

I've also checked the Rust regex docs, but the x flag and the other techniques described there seem to apply only to the search pattern, not to the replacement pattern.

CodePudding user response：

We don't need the sd tool to reproduce this behavior. Here it is in pure Rust:

let re = regex::Regex::new(r"(?P<n>b)").unwrap();
let before = "abc";
assert_eq!(re.replace_all(before, "$nB"), "ac");
assert_eq!(re.replace_all(before, "${n}B"), "abBc");

The brace replacement syntax isn't described on the front page of the documentation, but in the documentation of the replace method:

The longest possible name is used. e.g., $1a looks up the capture group named 1a and not the capture group at index 1. To exert more precise control over the name, use braces, e.g., ${1}a.

In short, unless the group reference is immediately followed by a character that can't be part of a name, you should always put the group name between braces.
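To make the "longest possible name" rule concrete, here is a minimal sketch in plain Rust (standard library only) of how the name after a $ could be picked out. The function referenced_name is hypothetical and written for illustration; it is not the regex crate's actual implementation. It assumes, as the replace documentation states, that an unbraced name is made of letters, digits, and underscores.

```rust
/// Hypothetical helper: given the text that follows a `$` in a replacement
/// string, return the group name that the reference would resolve to.
/// This is an illustration of the rule, not the regex crate's own code.
fn referenced_name(after_dollar: &str) -> &str {
    // Braced form: the name is whatever sits between `{` and the first `}`.
    if let Some(rest) = after_dollar.strip_prefix('{') {
        return match rest.find('}') {
            Some(end) => &rest[..end],
            None => "", // unclosed brace: no name in this sketch
        };
    }
    // Unbraced form: take the longest run of letters, digits, and underscores.
    let end = after_dollar
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(after_dollar.len());
    &after_dollar[..end]
}
```

With this sketch, referenced_name("cgB") returns "cgB" (a group that doesn't exist in the question's pattern, hence the empty replacement), referenced_name("cg B") returns "cg" because the space ends the name, and referenced_name("{cg}B") returns "cg", leaving the B as literal text.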

CodePudding user response：

For the use case in the question (Rust regexes via sd in bash):

echo 'abc' | sd -p '(?P<cg>b)' '${cg}B'