With Rust Regular Expressions, how can I use named capture groups preceding a string?

Time:12-02

I'm using the sd tool, which uses Rust regular expressions, and I'm trying to use a named capture group with it. However, if the named capture group is immediately followed by literal text in the replacement, that text is treated as part of the capture group's name, causing unexpected behaviour.

Here is a contrived example to illustrate the issue:

echo 'abc' | sd -p '(?P<cg>b)' '$cgB'
# outputs: ac
# desired output: abBc

echo 'abc' | sd -p '(?P<cg>b)' '$cg B'
# outputs as expected: ab Bc
# however, places a space there

I've tried $<cg>B, $cg(B), and $cg\0B; none of them give abBc.

I've also checked the Rust regex docs, but the x flag and the other techniques described there seem to apply only to the search pattern, not to the replacement pattern.

CodePudding user response:

We don't need the sd tool to reproduce this behavior. Here it is in pure Rust:

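let re = regex::Regex::new(r"(?P<n>b)").unwrap();
let before = "abc";
// "$nB" looks up a group named "nB", which doesn't exist, so it expands to the empty string.
assert_eq!(re.replace_all(before, "$nB"), "ac");
// Braces end the name at "n", so the literal "B" follows the captured text.
assert_eq!(re.replace_all(before, "${n}B"), "abBc");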

The brace syntax for replacements isn't described on the front page of the regex documentation, but in the documentation of the replace method:

The longest possible name is used. e.g., $1a looks up the capture group named 1a and not the capture group at index 1. To exert more precise control over the name, use braces, e.g., ${1}a.

In short, unless the group name is immediately followed in the replacement pattern by a character that can't be part of a name, you should put the group name between braces.
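To make the "longest possible name" rule concrete, here is a minimal std-only sketch of how a `$` reference could be split into a name and the literal text after it. The function `split_reference` is a hypothetical helper written for illustration; it is not the regex crate's implementation, it assumes unbraced names consist of ASCII letters, digits and underscores, and it ignores details such as `$$` escapes.

```rust
/// Hypothetical helper (not part of the regex crate): returns the group name
/// that a `$` reference at the start of `s` would resolve to, together with
/// the literal text that follows the reference.
fn split_reference(s: &str) -> Option<(&str, &str)> {
    let rest = s.strip_prefix('$')?;
    if let Some(braced) = rest.strip_prefix('{') {
        // Braced form: the name ends at the closing brace.
        let end = braced.find('}')?;
        return Some((&braced[..end], &braced[end + 1..]));
    }
    // Unbraced form: take the longest run of letters, digits and underscores
    // (an assumption about which characters may appear in a name).
    let end = rest
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(rest.len());
    if end == 0 {
        return None; // a lone `$` is not treated as a reference here
    }
    Some((&rest[..end], &rest[end..]))
}

fn main() {
    // "$cgB" is read as one name, "cgB", with nothing left over.
    assert_eq!(split_reference("$cgB"), Some(("cgB", "")));
    // A space cannot be part of a name, so the name stops at "cg".
    assert_eq!(split_reference("$cg B"), Some(("cg", " B")));
    // Braces delimit the name explicitly, leaving "B" as literal text.
    assert_eq!(split_reference("${cg}B"), Some(("cg", "B")));
}
```

Under these assumptions, the question's `$cgB` resolves to a group called `cgB`, which does not exist in the pattern, while `${cg}B` resolves to `cg` followed by a literal `B`.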

CodePudding user response:

For the use case in the question (Rust regexes via sd, run from bash):

echo 'abc' | sd -p '(?P<cg>b)' '${cg}B'