My ultimate goal is to parse the numeric prefix of a &str, if there is one. So I want a function that, given "123abc345", will give me a pair (u32, &str), which is (123, "abc345").
My idea is that if I have a Pattern type, I should be able to do something like
/// `None` if there is no prefix in `s` that matches `p`,
/// Otherwise a pair of the longest matching prefix and the rest
/// of the string `s`.
fn split_prefix<'a, P: Pattern<'a>>(s: &'a str, p: P) -> Option<(&'a str, &'a str)>;
My goal would be achieved by doing something like
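// `char::is_digit` takes a radix argument, so a closure is used as the pattern instead
let num: Option<u32> = if let Some((num_s, rest)) = split_prefix(s, |c: char| c.is_ascii_digit()) {
    s = rest;
    num_s.parse().ok()
} else {
    None
};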
What's the best way to get that?
CodePudding user response:
I looked at the source for str::split_once and modified it slightly so that it returns the greedily matched prefix together with the rest of the string.
#![feature(pattern)] // nightly only: the `Pattern` API is unstable
use std::str::pattern::{Pattern, Searcher};

/// See source code for `std::str::split_once`
fn split_prefix<'a, P: Pattern<'a>>(s: &'a str, p: P) -> Option<(&'a str, &'a str)> {
    // `start` is the start of the first unmatched (rejected) substring, so it is our sole
    // delimiting index. If nothing is rejected, the whole string matched, so split at the end.
    let start = p.into_searcher(s).next_reject().map_or(s.len(), |(start, _)| start);
    // SAFETY: a `Searcher` only reports indices that lie on UTF-8 boundaries of `s`.
    unsafe { Some((s.get_unchecked(..start), s.get_unchecked(start..))) }
    // If constrained to strictly safe Rust code, an alternative is:
    // s.get(..start).zip(s.get(start..))
}
This generic prefix splitter could then be wrapped in a specialized function to parse out numerical prefixes:
fn parse_numeric_prefix<'a>(s: &'a str) -> Option<(u32, &'a str)> {
    split_prefix(s, char::is_numeric)
        .and_then(|(num_s, rest)| num_s.parse().ok().map(|num| (num, rest)))
}
UPDATE:
I just re-read your question and realized you want a None when there is no prefix match. Updated functions:
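fn split_prefix<'a, P: Pattern<'a>>(s: &'a str, p: P) -> Option<(&'a str, &'a str)> {
    // If nothing is rejected, the whole string matched, so split at the end.
    let start = p.into_searcher(s).next_reject().map_or(s.len(), |(start, _)| start);
    if start == 0 {
        // The very first character was rejected (or `s` is empty): no prefix match.
        None
    } else {
        unsafe { Some((s.get_unchecked(..start), s.get_unchecked(start..))) }
    }
}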
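fn parse_numeric_prefix<'a>(s: &'a str) -> Option<(u32, &'a str)> {
    // Match ASCII digits only: `char::is_numeric` also accepts characters such as '①',
    // which `u32::from_str` rejects.
    split_prefix(s, |c: char| c.is_ascii_digit())
        // Parsing can still fail on overflow (e.g. "99999999999"), so map the
        // error to `None` instead of unwrapping.
        .and_then(|(num_s, rest)| num_s.parse().ok().map(|num| (num, rest)))
}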
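If you need this on stable Rust, here is a minimal sketch of the same idea that avoids the unstable Pattern trait by taking a plain char predicate. The name split_prefix_by is hypothetical (it is not a standard library function), and restricting the match to ASCII digits is an assumption about what "number" means in the question.

```rust
/// Hypothetical stable-Rust variant of `split_prefix`: takes a plain
/// `char` predicate instead of the unstable `Pattern` trait.
fn split_prefix_by(s: &str, pred: impl Fn(char) -> bool) -> Option<(&str, &str)> {
    // Byte index of the first char that fails the predicate, or the end of
    // the string if every char passes.
    let end = s.find(|c: char| !pred(c)).unwrap_or(s.len());
    if end == 0 {
        None // no matching prefix (this also covers the empty string)
    } else {
        // `find` returns a char-boundary index, so `split_at` cannot panic.
        Some(s.split_at(end))
    }
}

/// Parses a leading run of ASCII digits as a `u32`.
/// Returns `None` if there is no digit prefix or the number overflows `u32`.
fn parse_numeric_prefix(s: &str) -> Option<(u32, &str)> {
    let (num_s, rest) = split_prefix_by(s, |c| c.is_ascii_digit())?;
    Some((num_s.parse().ok()?, rest))
}
```

With this sketch, parse_numeric_prefix("123abc345") should give Some((123, "abc345")), while an input with no leading digits, or with a digit run too large for u32, gives None rather than panicking.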