Take prefix slice of &str that matches `Pattern` in Rust

Time:01-21

My ultimate goal is to parse the numeric prefix of a &str, if there is one. I want a function that, given "123abc345", returns a pair (u32, &str), here (123, "abc345").

My idea is that if I have a Pattern type, I should be able to write something like:

/// `None` if there is no prefix in `s` that matches `p`,
/// Otherwise a pair of the longest matching prefix and the rest
/// of the string `s`.
fn split_prefix<'a, P: Pattern<'a>>(s: &'a str, p: P) -> Option<(&'a str, &'a str)>;

My goal would be achieved by doing something like

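// `char::is_digit` takes a radix argument, so a closure is used as the pattern
let num: Option<u32> = if let Some((num_s, rest)) = split_prefix(s, |c: char| c.is_ascii_digit()) {
  s = rest;
  num_s.parse().ok()
} else {
  None
};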

What's the best way to get that?

CodePudding user response:

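I looked at the source for str::split_once and modified it slightly so that it returns the greedily matched prefix along with the rest of the string. It relies on the unstable `pattern` feature, so it needs a nightly compiler.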

#![feature(pattern)]
use std::str::pattern::{Pattern, Searcher};

/// See source code for `std::str::split_once`
fn split_prefix<'a, P: Pattern<'a>>(s: &'a str, p: P) -> Option<(&'a str, &'a str)> {
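    // `next_reject` yields the first span that does NOT match; its start is our sole
    // delimiting index. If nothing is rejected, the whole string is the prefix.
    let start = p.into_searcher(s).next_reject().map_or(s.len(), |(start, _)| start);
    // SAFETY: `start` comes from the searcher (or is `s.len()`), so it lies on a char boundary.
    unsafe { Some((s.get_unchecked(..start), s.get_unchecked(start..))) }

    // If constrained to strictly safe Rust code, an alternative is:
    // s.get(..start).zip(s.get(start..))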
}

This generic prefix splitter could then be wrapped in a specialized function to parse out numerical prefixes:

fn parse_numeric_prefix<'a>(s: &'a str) -> Option<(u32, &'a str)> {
    split_prefix(s, char::is_numeric)
        // `parse` can still fail (overflow, non-ASCII numerics), hence `ok()?`
        .and_then(|(num_s, rest)| Some((num_s.parse().ok()?, rest)))
}
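
A note on the choice of predicate: `char::is_numeric` accepts any Unicode numeric character, while parsing a `u32` only understands ASCII digits, so a prefix matched by `char::is_numeric` is not guaranteed to parse. The small check below uses only stable std; `all_numeric` and `parses_as_u32` are hypothetical helper names made up for this illustration.

```rust
/// Hypothetical helper: true if `s` is non-empty and every char
/// satisfies `char::is_numeric` (the Unicode notion of "numeric").
fn all_numeric(s: &str) -> bool {
    !s.is_empty() && s.chars().all(char::is_numeric)
}

/// Hypothetical helper: true if the whole string parses as a `u32`.
fn parses_as_u32(s: &str) -> bool {
    s.parse::<u32>().is_ok()
}
```

For instance, "¾" and "٣" both pass `all_numeric` but fail `parses_as_u32`, and "99999999999" consists only of ASCII digits yet overflows `u32`. Matching with `|c: char| c.is_ascii_digit()` removes the first problem; the second still has to be handled via the `Result` from `parse`.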

UPDATE: I just re-read your question and realized you want a None when there is no prefix match. Updated functions:

fn split_prefix<'a, P: Pattern<'a>>(s: &'a str, p: P) -> Option<(&'a str, &'a str)> {
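    // If nothing is rejected, the whole string matches the pattern.
    let start = p.into_searcher(s).next_reject().map_or(s.len(), |(start, _)| start);
    if start == 0 {
        None
    } else {
        // SAFETY: `start` is a char boundary reported by the searcher (or `s.len()`).
        unsafe { Some((s.get_unchecked(..start), s.get_unchecked(start..))) }
    }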
}

fn parse_numeric_prefix<'a>(s: &'a str) -> Option<(u32, &'a str)> {
    // ASCII digits only: `char::is_numeric` also matches characters such as '¾'
    // that `u32::from_str` rejects.
    split_prefix(s, |c: char| c.is_ascii_digit())
        // `parse` can still fail on overflow (e.g. "99999999999"), so avoid `unwrap`
        .and_then(|(num_s, rest)| Some((num_s.parse().ok()?, rest)))
}
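
If nightly is not an option, roughly the same behaviour can be sketched on stable Rust by giving up the generic `Pattern` bound and accepting a plain `char` predicate instead. This is an alternative, not what the code above does: `split_prefix_by` and `parse_u32_prefix` are hypothetical names, and the sketch assumes that a closure over `char` is general enough for your use case.

```rust
/// Splits `s` into the longest non-empty prefix whose chars all satisfy `pred`
/// and the remainder; `None` if the first char does not match (or `s` is empty).
/// (`split_prefix_by` is a hypothetical name, not a std function.)
fn split_prefix_by(s: &str, pred: impl Fn(char) -> bool) -> Option<(&str, &str)> {
    // Byte index of the first char that fails `pred`, or the end of the string.
    let end = s.find(|c: char| !pred(c)).unwrap_or(s.len());
    if end == 0 {
        None
    } else {
        // `find` returns a char-boundary index, so `split_at` cannot panic here.
        Some(s.split_at(end))
    }
}

/// Parses a leading run of ASCII digits as a `u32` and returns it with the rest.
fn parse_u32_prefix(s: &str) -> Option<(u32, &str)> {
    let (num_s, rest) = split_prefix_by(s, |c| c.is_ascii_digit())?;
    // `parse` still fails on overflow, so that case is reported as `None`.
    Some((num_s.parse().ok()?, rest))
}
```

With this, `parse_u32_prefix("123abc345")` gives `Some((123, "abc345"))`, while "abc" and an overflowing input such as "99999999999x" both give `None`.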