Rust: Idiomatic way to save &str to bytes and then convert it back

Time:10-28

I would like to save a file name (at most 32 bytes) in a byte array, and then convert the bytes back to a String. Since there is a sequence of file names, the underlying array is designed to have a fixed size (i.e., 32 bytes).

// name is the file name `&str`
let mut arr = [0u8; 32];
arr[..name.len()].copy_from_slice(name.as_bytes());

But the problem is: is it possible to get the file name back from the 32-byte array (arr) without storing the length?

In C/C++, many built-in functions are offered due to the fact that a raw string is terminated with 0:

// store
memcpy(arr, name.c_str(), name.length() + 1);
// convert it back
char *raw_name = reinterpret_cast<char*>(arr);

So, what is the idiomatic way to do it in Rust? A possible way is to explicitly store the size using an extra 5 bits, but it seems that it is not the best method.

CodePudding user response:

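Another way to convert a null-terminated [0u8; 32] to a &str would be through CStr, with from_bytes_until_nul. It was unstable when this answer was written, hence the feature gate; it has since been stabilized (Rust 1.69), so on a current toolchain the gate is no longer needed: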

#![feature(cstr_from_bytes_until_nul)]
CStr::from_bytes_until_nul(arr.as_slice()).unwrap().to_str().unwrap()

Playground

However, this requires that a 0 byte is included in the slice, so the maximum storable string length becomes 31 bytes. (Try reducing the array length to 5 in the playground to see what happens when a string fills the whole array, which is the situation a 32-byte string would be in.)
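To make the 31-byte limit concrete, here is a minimal sketch that assumes a toolchain where CStr::from_bytes_until_nul is stable (Rust 1.69 or later). The helper name name_from_c_buffer is hypothetical and only used for illustration; it returns None when the buffer has no 0 byte or is not valid UTF-8.

```rust
use std::ffi::CStr;

/// Hypothetical helper: read a NUL-terminated name out of a byte buffer.
/// Returns None if the buffer contains no 0 byte or is not valid UTF-8.
fn name_from_c_buffer(arr: &[u8]) -> Option<&str> {
    CStr::from_bytes_until_nul(arr).ok()?.to_str().ok()
}

fn main() {
    // A short name in a 32-byte buffer: the padding supplies the terminator.
    let mut arr = [0u8; 32];
    arr[..5].copy_from_slice(b"hello");
    println!("{:?}", name_from_c_buffer(&arr));

    // A name that fills its buffer completely leaves no room for the
    // terminator, so the conversion fails.
    let full = *b"hello";
    println!("{:?}", name_from_c_buffer(&full));
}
```

Returning an Option instead of calling unwrap() lets the caller decide what to do with a buffer that has no terminator.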

If you want to be able to store 32-byte strings, you could use std::str::from_utf8(arr.split(|&c| c == 0).next().unwrap()).unwrap(), which takes everything before the first 0 byte (or the whole array if there is none), but I don't think that qualifies as idiomatic anymore…
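The same idea can be wrapped in a pair of small helpers. This is only a sketch: store_name and load_name are hypothetical names, and it assumes file names never contain a 0 byte themselves (otherwise the name would be cut at that byte when read back).

```rust
use std::str::{self, Utf8Error};

/// Hypothetical helper: copy `name` into a zero-padded 32-byte buffer.
/// Returns None if the name is longer than 32 bytes.
fn store_name(name: &str) -> Option<[u8; 32]> {
    let bytes = name.as_bytes();
    if bytes.len() > 32 {
        return None;
    }
    let mut arr = [0u8; 32];
    arr[..bytes.len()].copy_from_slice(bytes);
    Some(arr)
}

/// Hypothetical helper: read the name back, stopping at the first 0 byte,
/// or using all 32 bytes if the buffer contains no 0 byte.
fn load_name(arr: &[u8; 32]) -> Result<&str, Utf8Error> {
    let end = arr.iter().position(|&b| b == 0).unwrap_or(arr.len());
    str::from_utf8(&arr[..end])
}

fn main() {
    let arr = store_name("hello.txt").expect("name longer than 32 bytes");
    let name = load_name(&arr).expect("stored bytes are not valid UTF-8");
    println!("{name}");
}
```

Unlike the CStr route, this accepts names that use all 32 bytes, because a missing terminator simply means "the name fills the buffer".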

CodePudding user response:

I don't know exactly what reinterpret_cast<char*> does in C++, but I think you can do something similar with std::str::from_utf8:

    let name = "hello";
    let mut arr = [0u8; 32];
    arr[..name.len()].copy_from_slice(name.as_bytes());
    println!("{:?}", arr);
    //[104, 101, 108, 108, 111, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    // The zero padding is valid UTF-8, so the whole array converts, padding included.
    let recovered_str = std::str::from_utf8(arr.as_slice()).unwrap();
    println!("{:?}", recovered_str);
    //"hello\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"
    println!("{}", recovered_str);
    //hello (the trailing NUL characters are still printed, but are usually invisible)

However, recovered_str and name are not actually the same, because recovered_str still contains the trailing null bytes... but you can trim them! Check this answer.
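Trimming the padding can be sketched with str::trim_end_matches from the standard library. The helper name trim_padding is hypothetical, and the sketch assumes the stored name itself never ends in a 0 byte.

```rust
use std::str::{self, Utf8Error};

/// Hypothetical helper: convert a zero-padded buffer to &str and
/// drop the trailing NUL padding.
fn trim_padding(arr: &[u8]) -> Result<&str, Utf8Error> {
    // The zero padding is valid UTF-8, so the whole buffer is validated first.
    let padded = str::from_utf8(arr)?;
    Ok(padded.trim_end_matches('\0'))
}

fn main() {
    let name = "hello";
    let mut arr = [0u8; 32];
    arr[..name.len()].copy_from_slice(name.as_bytes());

    let recovered = trim_padding(&arr).unwrap();
    // After trimming, the recovered string compares equal to the original.
    assert_eq!(recovered, name);
    println!("{:?}", recovered);
}
```

This also works for a name that fills all 32 bytes, since there is then simply nothing to trim.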
