Is there a safe alternative to CString::new() which does not require an unwrap?

Time:10-03

I have some code in an FFI module which converts Rust strings to CStrings to be sent back to the caller.

I'm finding the calling convention of CString::new() difficult to use. Specifically, the documentation says:

This function will return an error if the supplied bytes contain an internal 0 byte. The NulError returned will contain the bytes as well as the position of the nul byte.

Basically if you don't want to use unwrap() and want to call this function safely, you'd have to do something like:

let cstr = match CString::new(message) {
    Ok(cstr) => { cstr }
    Err(_) => { CString::new("error converting string!").unwrap() }
};

In theory this could be better: the Err value contains the index of the null byte, so I could truncate the string at that point. But this gets a bit involved.

Why? Because the first call to CString::new(message) consumes the string, so I'd have to clone the string before making the call just so I could use it again inside the Err arm of the match, which is a code path that should never get called anyway.

I've considered scanning for nulls beforehand, but my knowledge of Unicode is limited and I'm reluctant to just remove null bytes from a Unicode string.

Basically, my question is: is there a better way? I wish that CString::new(message) would just truncate the string at the first null. It could give you back the truncated CString in the Err value.

Maybe there is another call which is easier to use. Am I missing something?

(Edit: I am only passing ASCII strings to this call, by the way.)

Edit: I went with this solution based on the accepted answer by @kmdreko:

pub unsafe fn log_to_c_callback(level: LogLevel, mut message: String)  {
    if let Some(callback) = LOGGER_CALLBACK {
        message.retain(|c| c != '\0');
        let len = message.len();

        #[allow(clippy::unwrap_used)]
        // Unwrap is safe because we removed all the null bytes above.
        let message_cstr = CString::new(message).unwrap();
        
        callback(level, message_cstr.as_ptr(), len as i32);
    }
}

Answer:

"I've considered scanning for nulls beforehand, but my knowledge of Unicode is limited and I'm reluctant to just remove null bytes from a Unicode string."

My knowledge of Unicode is less limited. You can safely search for and remove null characters. In UTF-8, no byte of a multi-byte character is ever zero (see Wikipedia on UTF-8 encoding), and even if one were, Rust chars are Unicode scalar values, not individual bytes.

let mut message = String::from("hello w\0rld");
message.retain(|c| c != '\0');
let cstr = CString::new(message).expect("should not have null characters");
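
To make the UTF-8 claim concrete, here is a small sketch that checks for zero bytes in a string containing multi-byte characters and then strips only the U+0000 characters. The helper names has_zero_byte and strip_nuls are made up for illustration and are not part of the standard library.

```rust
use std::ffi::CString;

/// Hypothetical helper: true if any byte of the UTF-8 encoding of `s` is zero.
fn has_zero_byte(s: &str) -> bool {
    s.bytes().any(|b| b == 0)
}

/// Hypothetical helper: removes every U+0000 character, then converts.
/// `None` would only be returned if a zero byte survived the filter,
/// which UTF-8 rules out because only U+0000 encodes to a zero byte.
fn strip_nuls(mut s: String) -> Option<CString> {
    s.retain(|c| c != '\0');
    CString::new(s).ok()
}
```

Multi-byte text such as "héllo ✓ 日本語" has no zero bytes, so has_zero_byte only fires on an actual '\0' character, and strip_nuls leaves the other characters untouched.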

You might take this opportunity to filter out other unsavory characters like control characters, newlines, whatever you fancy.
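
As a rough sketch of that idea, the filter below drops every control character using char::is_control, which also covers '\0', newlines, and tabs. The function name sanitize_for_c is hypothetical, and whether you want to drop newlines at all depends on what the C side expects.

```rust
use std::ffi::CString;

/// Hypothetical helper: drops all Unicode control characters
/// (this includes '\0', '\n', '\r', and '\t') before converting.
fn sanitize_for_c(mut message: String) -> Option<CString> {
    message.retain(|c| !c.is_control());
    // After the filter no nul character remains, so this is always `Some`.
    CString::new(message).ok()
}
```

Returning an Option here is just one way to avoid an .unwrap() in the helper itself; the caller can still decide how to handle the (unreachable) None case.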


If you really don't want an .unwrap()/.expect(), you can use your original plan but without the cloning. The NulError type also returns the original "string" via .into_vec():

let message = String::from("hello w\0rld");

let mut message = message.into_bytes();
let cstr = loop {
    match CString::new(message) {
        Ok(cstr) => break cstr,
        Err(err) => {
            let idx = err.nul_position();
            message = err.into_vec();
            message.remove(idx);
        }
    }
};
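
If you would rather truncate at the first null, as the question suggests, the same NulError methods allow it without cloning. This is only a sketch: truncate_at_nul is a made-up name, and unwrap_or_default is used so that no panic path exists, even though the second conversion should not fail.

```rust
use std::ffi::CString;

/// Hypothetical helper: keeps everything before the first nul byte.
fn truncate_at_nul(message: String) -> CString {
    match CString::new(message) {
        Ok(cstr) => cstr,
        Err(err) => {
            // Position of the nul byte that made the conversion fail.
            let idx = err.nul_position();
            // Recover the original bytes without having cloned them.
            let mut bytes = err.into_vec();
            bytes.truncate(idx);
            // No nul byte precedes `idx`, so this should succeed;
            // fall back to an empty CString rather than panic.
            CString::new(bytes).unwrap_or_default()
        }
    }
}
```

Because a nul byte is always its own character in UTF-8, cutting at that index never splits a multi-byte character.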