How to partition a string into two groups using a regex?

Time:02-05

I'd like to partition a string into two groups by providing the regex for only one group in Rust.

The regex for the opposite group is not known. I only know the regex for the separator.

For example, with the regex \d and the following string

123abcdef456ghj789

I'd like to obtain both of these strings

abcdefghj

and

123456789

Using the regex and itertools crates, I'm able to get the first group like this

use itertools::Itertools;
use regex::Regex;

let text = "123abcdef456ghj789";
let re = Regex::new(r"\d+").unwrap();
let text1 = re.split(text).join(""); // abcdefghj

How can I get the second group?

CodePudding user response:

You can get the desired result very similarly:

re.find_iter(text).map(|m| m.as_str()).join("");

.find_iter() returns an iterator over all matches, and calling .as_str() on each match gives its full matched text. Then use .join() from itertools, as you've done before.

"It would be nice though if there was a single method that returned a tuple of the disjoined partitions."

That would be nice, and it is certainly possible, since each match carries all the information needed to slice the text in a single pass. Here's my attempt, which calls .find_at() in a loop:

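use regex::Regex;

// Returns (all matched text, all text between matches), each concatenated in order.
// Note: assumes `re` never matches the empty string; otherwise search_idx would not advance.
fn partition_regex(re: &Regex, text: &str) -> (String, String) {
    let mut a = String::new(); // matched pieces
    let mut b = String::new(); // unmatched pieces

    let mut search_idx = 0;
    while let Some(m) = re.find_at(text, search_idx) {
        a.push_str(m.as_str());
        // Text between the previous match and this one belongs to the other group.
        b.push_str(&text[search_idx..m.start()]);
        search_idx = m.end();
    }
    // Whatever follows the last match is unmatched as well.
    b.push_str(&text[search_idx..]);

    (a, b)
}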

CodePudding user response:

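You can use Iterator::partition to split items into two collections based on a predicate. Note that this snippet partitions whole lines of content rather than pieces of a single line:

let re = Regex::new(r"(^[a-z]+)").unwrap();

// `content` is the input text; lines that match go left, the rest go right.
// Collecting into String concatenates the lines without separators.
let (matches, non_matches): (String, String) =
    content.lines().partition(|x| re.is_match(x));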
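
For the string in the question, where the separator is a single-character class, the same Iterator::partition idea can be applied per character without a regex at all. This is a minimal sketch using only the standard library. The name split_digits is hypothetical, and char::is_ascii_digit is an assumption that is narrower than \d, which the regex crate treats as Unicode-aware by default.

```rust
// Hypothetical helper: splits `text` into (ASCII digits, everything else).
fn split_digits(text: &str) -> (String, String) {
    // String implements Default and Extend<char>, so partition can build it directly.
    text.chars().partition(|c| c.is_ascii_digit())
}

fn main() {
    let (digits, rest) = split_digits("123abcdef456ghj789");
    println!("{digits} / {rest}"); // 123456789 / abcdefghj
}
```

This only works when group membership can be decided one character at a time; multi-character separators still need the match-based approaches.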