Regex::split confusing

Time:07-01

I can’t understand why this isn’t working. I have read the docs and similar questions, but nothing helped. The split method returns empty strings instead of the values (captures_iter returns the correct values, wrapped in Captures structs).

Sorry for the badly formulated question. I wanted to split a string into substrings. As people pointed out, I was using the wrong method and expecting it to do something it doesn’t.

use regex::Regex;

fn sep(s: &str, c: usize) {
    // Build the pattern \d{c}: exactly `c` consecutive digits.
    let re = Regex::new(&format!(r#"\d{{{}}}"#, c)).unwrap();
    let chunks: Vec<_> = re.split(s).collect();
    println!("{:?}", re); // \d{2}
    println!("{:?}", chunks); // ["", "", ""]
}

fn main() {
    sep("1234", 2);
}

CodePudding user response:

According to the documentation, .split does not do what you think it does.

Returns an iterator of substrings of text delimited by a match of the regular expression. Namely, each element of the iterator corresponds to text that isn’t matched by the regular expression.

The text that the regular expression matches is used as the delimiter.

So in your case, the string "1234" gets split into:

  • ""
  • delimiter "12"
  • ""
  • delimiter "34"
  • ""

As mentioned, the delimiters get stripped out and you end up with ["", "", ""].
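
To see the same behaviour without the regex crate, here is a small sketch using the standard library's str::split, which follows the same convention of dropping the delimiters and yielding the text between them. The helper name split_on_digits is made up for illustration, and unlike the \d{2} pattern in the question it treats every single digit as a delimiter.

```rust
// Hypothetical helper: split `s` wherever an ASCII digit occurs.
// Each digit is a delimiter and is dropped from the result.
fn split_on_digits(s: &str) -> Vec<&str> {
    s.split(|ch: char| ch.is_ascii_digit()).collect()
}

fn main() {
    // The text between the delimiters is what comes back.
    println!("{:?}", split_on_digits("a1b2c")); // ["a", "b", "c"]
    // Nothing but delimiters: only the empty gaps around them remain.
    println!("{:?}", split_on_digits("12")); // ["", "", ""]
}
```

Two delimiters leave three gaps, which is the same shape as the ["", "", ""] result in the question.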

What you probably wanted to do was:

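use regex::Regex;

fn sep(s: &str, c: usize) {
    // Build the pattern \d{c}: exactly `c` consecutive digits.
    let re = Regex::new(format!(r#"\d{{{}}}"#, c).as_str()).unwrap();
    // Collect the matches themselves; group 0 of each capture is the whole match.
    // re.find_iter(s).map(|m| m.as_str()) yields the same strings.
    let chunks: Vec<_> = re
        .captures_iter(s)
        .map(|capture| capture.get(0).unwrap().as_str())
        .collect();

    println!("{:?}", chunks);
}

fn main() {
    sep("1234", 2);
    sep("1234", 3);
    sep("1234", 1);
}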

Output:

["12", "34"]
["123"]
["1", "2", "3", "4"]