How to concisely split string into 3 words?

Time:10-05

Disclaimer 1: I am a complete beginner in Rust.

Disclaimer 2: this is mostly a question about code style. Call it "idiomatic" :)

I have a String. It should contain 3 words delimited by a space, and I need those 3 words. My question is: how do I get them concisely?

Variant 1: the simplest

let line: String = // ...
let line_split = line.split(" ").collect::<Vec<&str>>();
let word1 = line_split[0];
let word2 = line_split[1];
let word3 = line_split[2];
// or
let mut line_split = line.split(" "); // must be `mut`: next() advances the iterator
let word1 = line_split.next().unwrap();
let word2 = line_split.next().unwrap();
let word3 = line_split.next().unwrap();

This is too long, repetitive, and error-prone.

Variant 2: doesn't compile

let line: String = // ...
let [word1, word2, word3] = line.split(" ").collect::<Vec<&str>>()[..];

This is the syntax that I am looking for, but the Rust compiler rejects it because the pattern is refutable: it forces me to cover all the other patterns too ([], [_], ...).

Variant 2.1: compiles

let line: String = // ...
if let [word1, word2, word3] = line.split(" ").collect::<Vec<&str>>()[..] {
    // ...
} else {
    panic!("line doesn't have exactly 3 words");
}

This introduces an extra scope. If I use such an if let several times (which I need to), the code ends up deeply nested for no good reason.

Variant 2.2: hiding extra scopes

fn split_3(line: &str, pattern: &str) -> (&str, &str, &str) {
    if let [word1, word2, word3] = line.split(pattern).collect::<Vec<&str>>()[..] {
        return (word1, word2, word3);
    } else {
        panic!("line doesn't have exactly 3 words");
    }
}

Now the problem is that to split the line into 2 words (which I also need to do), I have to declare fn split_2(...), which is basically copy-paste. And I can't pass 3 or 2 as an ordinary parameter, because the whole point is to have the tuple size known at compile time.

So, is there a better solution? If not, which of these should I use?

CodePudding user response:

fn main() {
    let line: String = String::from("ab cd ef");

    let [word1, word2, word3]: [&str; 3] =
        line.split(" ").collect::<Vec<&str>>().try_into().unwrap();

    println!("{}", word1);
    println!("{}", word2);
    println!("{}", word3);
}

Note, however, that this allocates a new Vec.
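If the toolchain is Rust 1.65 or newer, let-else gives the pattern syntax from Variant 2 without the nested scope of Variant 2.1. A minimal sketch: three_words is a hypothetical helper name used only so the example is easy to call, and the intermediate Vec is still allocated.

```rust
/// Hypothetical helper: splits `line` on single spaces and expects exactly 3 words.
fn three_words(line: &str) -> (&str, &str, &str) {
    let parts: Vec<&str> = line.split(' ').collect();
    // let-else: the `else` branch must diverge (panic, return, ...), so the
    // bindings land in the enclosing scope instead of a nested `if let` block.
    let [word1, word2, word3] = parts[..] else {
        panic!("line doesn't have exactly 3 words");
    };
    (word1, word2, word3)
}

fn main() {
    let line = String::from("ab cd ef");
    let (word1, word2, word3) = three_words(&line);
    println!("{word1} {word2} {word3}");
}
```

The let-else statement works the same way inline, so several of them can follow each other at the same indentation level.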

That said, I'd argue that the following is already the ideal solution:

let mut line_split = line.split(" ");
let word1 = line_split.next().unwrap();
let word2 = line_split.next().unwrap();
let word3 = line_split.next().unwrap();

I am unsure why you think this is error-prone, repetitive, or too long. In my opinion, this is exactly what you want.


Some alternatives:

fn split_twice<'a>(s: &'a str, delim: &str) -> Option<(&'a str, &'a str, &'a str)> {
    let (s1, s) = s.split_once(delim)?;
    let (s2, s3) = s.split_once(delim)?;
    Some((s1, s2, s3))
}

fn main() {
    let line: String = String::from("ab cd ef");

    let (word1, word2, word3) = split_twice(&line, " ").unwrap();

    println!("{}", word1);
    println!("{}", word2);
    println!("{}", word3);
}

This has very good error handling (it returns None instead of panicking when a delimiter is missing), but it is hard-coded to a fixed number of splits.
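To make the behaviour of split_twice concrete, here is a small sketch of what it returns for too few and too many words. The function is the same one as in the answer's code (only the shadowed variable is renamed to rest); the inputs are made up for illustration.

```rust
fn split_twice<'a>(s: &'a str, delim: &str) -> Option<(&'a str, &'a str, &'a str)> {
    let (s1, rest) = s.split_once(delim)?;
    let (s2, s3) = rest.split_once(delim)?;
    Some((s1, s2, s3))
}

fn main() {
    // Exactly three words: all three parts come back.
    assert_eq!(split_twice("ab cd ef", " "), Some(("ab", "cd", "ef")));
    // Fewer than two delimiters: `?` propagates None instead of panicking.
    assert_eq!(split_twice("ab cd", " "), None);
    // More than three words: everything after the second delimiter
    // stays in the last part, so extra words are not rejected.
    assert_eq!(split_twice("ab cd ef gh", " "), Some(("ab", "cd", "ef gh")));
}
```

So this variant checks for "at least 3 words", not "exactly 3 words"; if the stricter check matters, the last part has to be inspected separately.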

Another good option, because it avoids the code duplication:

fn main() {
    let line: String = String::from("ab cd ef");

    let mut line_split = line.split(" ");
    let [word1, word2, word3] = [(); 3].map(|()| line_split.next().unwrap());
}

Or the improved version (thanks @PitaJ):

fn main() {
    let line: String = String::from("ab cd ef");

    let mut line_split = line.split(" ");
    let [word1, word2, word3] = std::array::from_fn(|_| line_split.next().unwrap());
}

Or, if you don't like that the line_split variable now lives in the outer scope, move it into a local block:

fn main() {
    let line: String = String::from("ab cd ef");

    let [word1, word2, word3] = std::array::from_fn({
        let mut line_split = line.split(" ");
        move |_| line_split.next().unwrap()
    });
}

CodePudding user response:

I would suggest using the itertools crate, which makes this (and many other things) quite convenient. In this particular case, use collect_tuple:

use itertools::Itertools;

fn main() {
    let s = "foo bar baz";
    let (a, b, c) = s.split(" ").collect_tuple().unwrap();
    println!("{a} {b} {c}");
}