Is converting between String & Vec<u8> a no-op in a --release binary?

Time:05-24

String and Vec<u8> look almost the same to me, except that String guarantees its content is valid UTF-8, which is often useful.

However, in an unsafe context where no check is performed, does converting between the two actually cost any machine instructions?

Consider these two functions:

  • pub unsafe fn from_utf8_unchecked(bytes: Vec<u8, Global>) -> String
  • pub fn into_bytes(self) -> Vec<u8, Global>

Both consume their input, so in theory the compiler has no need to create a new object in memory.

CodePudding user response:

With the unsafe version of the function it is a no-op. In an optimized build, the assembly for converting a String to or from a Vec<u8> without checks is the same as for the identity function on a Vec<u8>. That does not mean you should reach for the unsafe function just for performance. Use it only if profiling shows the UTF-8 check is a real cost and you can guarantee that the vector you pass in always contains valid UTF-8.
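To reproduce the comparison yourself, a minimal sketch like the following can be pasted into a tool such as Compiler Explorer, or compiled with `rustc -O --emit asm`. The function names `identity`, `string_to_vec` and `vec_to_string` are made up for this illustration; only `into_bytes` and `from_utf8_unchecked` come from the question.

```rust
// Three public functions whose optimized assembly can be compared side by side.
// The names are hypothetical; the std calls are the ones from the question.

/// Baseline: hands the Vec straight back.
pub fn identity(v: Vec<u8>) -> Vec<u8> {
    v
}

/// Safe direction: a String is always valid bytes, so no check is needed.
pub fn string_to_vec(s: String) -> Vec<u8> {
    s.into_bytes()
}

/// Unchecked direction.
///
/// # Safety
/// `v` must contain valid UTF-8.
pub unsafe fn vec_to_string(v: Vec<u8>) -> String {
    // SAFETY: the caller promises that `v` is valid UTF-8.
    unsafe { String::from_utf8_unchecked(v) }
}

fn main() {
    let v = identity(b"hello".to_vec());
    // SAFETY: "hello" is plain ASCII, which is valid UTF-8.
    let s = unsafe { vec_to_string(v) };
    assert_eq!(s, "hello");
    assert_eq!(string_to_vec(s), b"hello".to_vec());
}
```

The exact instructions depend on the compiler version and target, so treat the output as something to inspect rather than a guarantee.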

CodePudding user response:

You can check this in the standard library source:

into_bytes():

pub fn into_bytes(self) -> Vec<u8> {
    self.vec
}

from_utf8_unchecked():

pub unsafe fn from_utf8_unchecked(bytes: Vec<u8>) -> String {
    String { vec: bytes }
}

So yes: both functions just move the Vec<u8> out of or into the String wrapper, with no copy and no check.
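One way to observe this at runtime, rather than by reading the source, is to check that the heap buffer stays the same across a round trip. This is only a sketch; the helper name `round_trip_keeps_buffer` is invented for the example.

```rust
/// Returns true if String -> Vec<u8> -> String reuses the same heap buffer
/// (same pointer, length and capacity) at every step.
fn round_trip_keeps_buffer(s: String) -> bool {
    let before = (s.as_ptr(), s.len(), s.capacity());

    // Moves the inner Vec out of the String; nothing is copied.
    let v = s.into_bytes();
    let middle = (v.as_ptr(), v.len(), v.capacity());

    // SAFETY: the bytes came from a String, so they are valid UTF-8.
    let s2 = unsafe { String::from_utf8_unchecked(v) };
    let after = (s2.as_ptr(), s2.len(), s2.capacity());

    before == middle && middle == after
}

fn main() {
    assert!(round_trip_keeps_buffer(String::from("héllo")));
}
```

This confirms that no reallocation happens. Whether any instructions remain after inlining is a separate question that only the generated assembly answers.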

However, I agree with @IanS that you should not use the unsafe function unless profiling shows you need it.
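For comparison, the checked counterpart in the standard library is `String::from_utf8`, which validates the bytes and returns an error instead of risking undefined behavior. A minimal sketch of using it as the default follows; the wrapper name `bytes_to_string` is hypothetical.

```rust
/// Converts bytes to a String, giving the original bytes back on failure.
fn bytes_to_string(bytes: Vec<u8>) -> Result<String, Vec<u8>> {
    // from_utf8 scans the bytes once; on success it reuses the same
    // allocation, and on failure the error still owns the original Vec.
    String::from_utf8(bytes).map_err(|e| e.into_bytes())
}

fn main() {
    assert_eq!(bytes_to_string(b"hello".to_vec()), Ok(String::from("hello")));
    // 0xFF can never appear in valid UTF-8.
    assert_eq!(bytes_to_string(vec![0xFF, 0xFE]), Err(vec![0xFF, 0xFE]));
}
```

The cost here is one linear pass over the input, which is the part profiling would have to show is significant before switching to the unchecked version.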
