Is it ever safe to use a reference in place of a pointer with a function called across an FFI boundary?

I've gathered that references are not the same as pointers: they carry some pretty important extra guarantees around alignment, null-ness, and aliasing. As such, if you have a function intended to be used across an FFI boundary that takes a pointer, you should typically use a raw pointer in the signature to avoid potential issues.

However, are there any situations in which it would be acceptable to use a reference (or something like Option<&T>) in such a function signature? For example, if you are binding an API with a well defined spec that says something like "*T must be valid or it's undefined behavior", could you still run into issues with a reference in place of a pointer in a signature?

CodePudding user response：

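It is safe to use plain references in place of pointers as long as the pointer upholds everything the reference promises: it is non-null, aligned, points to a valid T, and is used in a way that respects the borrow rules. As the nomicon says: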

"References can safely be assumed to be non-nullable pointers directly to the type. However, breaking the borrow checking or mutability rules is not guaranteed to be safe, so prefer using raw pointers (*) if that's needed because the compiler can't make as many assumptions about them."

Also, as the Rust Reference says:

"Pointers and references have the same layout."
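As a rough way to see that layout claim in practice, here is a minimal sketch that compares sizes and alignments. The helper name reference_matches_pointer is made up for illustration, and the check only covers sized types:

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: returns true if `&T`, `&mut T` and `Option<&T>`
/// have the same size (and `&T` the same alignment) as a raw pointer to `T`.
fn reference_matches_pointer<T>() -> bool {
    size_of::<&T>() == size_of::<*const T>()
        && align_of::<&T>() == align_of::<*const T>()
        && size_of::<&mut T>() == size_of::<*mut T>()
        // Option<&T> reuses the null value for None, so it stays pointer-sized.
        && size_of::<Option<&T>>() == size_of::<*const T>()
}
```

Calling reference_matches_pointer::<u32>() should return true. Note that this only demonstrates layout; it says nothing about the validity rules a reference carries on top of it.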

Problems arise if you have a shared reference &T and the function behind the FFI boundary does not treat it as a const *, that is, if it mutates the T or anything that aliases with the &T (e.g. a field in a U that can be reached via &T). That is because:

"If you have a reference &T, then normally in Rust the compiler performs optimizations based on the knowledge that &T points to immutable data. Mutating that data, for example through an alias or by transmuting an &T into an &mut T, is considered undefined behavior."

That is, the compiler assumes from the mere existence of a &T that the T can't change (not accounting for UnsafeCell). In practical terms, the compiler will assume that the FFI function does not change T if a &T is passed as an argument. As advised by the nomicon, it's good practice to prefer *const T and *mut T over plain references in FFI signatures, because the compiler can reason about where a raw pointer came from (e.g. it will reject a direct attempt to derive a *mut T from a &T). It is therefore also strongly advised to keep your FFI definitions const-correct: a function that is declared as taking const * but actually modifies that argument could be called with a &T or a *const T without the compiler being able to catch it.
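A minimal sketch of what const-correct signatures buy you. The functions bump and peek are hypothetical stand-ins for C functions, written in Rust here only so the example runs without linking any C code:

```rust
// Hypothetical stand-in for a C function `void bump(int *value)`.
extern "C" fn bump(value: *mut i32) {
    // SAFETY: callers promise `value` is non-null, aligned and valid for writes.
    unsafe { *value += 1 }
}

// Hypothetical stand-in for a C function `int peek(const int *value)`.
extern "C" fn peek(value: *const i32) -> i32 {
    // SAFETY: callers promise `value` is non-null, aligned and valid for reads.
    unsafe { *value }
}

fn bump_twice(mut n: i32) -> i32 {
    bump(&mut n); // &mut i32 coerces to *mut i32
    bump(&mut n);
    // bump(&n); // would not compile: &i32 does not coerce to *mut i32
    peek(&n) // &i32 coerces to *const i32
}
```

With these declarations, bump_twice(40) returns 42, and the commented-out call is rejected at compile time. If bump were instead declared as taking *const i32 while still writing through the pointer, that same call would compile and the mistake would go unnoticed.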

When it comes to individual types, the interoperability is up to the type itself. The nomicon has a whole section about this. In general:

"Rust guarantees that the layout of a struct is compatible with the platform's representation in C only if the #[repr(C)] attribute is applied to it."

This only guarantees that the type's layout in Rust is the same as it would be if written in C, though. You can't generally assume that the Rust type will behave as it would in C. However, in the specific case of Option and Box, the implementation in std does provide such guarantees; see the std documentation for both types (for T: Sized, Box<T> is guaranteed to be represented as a single pointer and is also ABI-compatible with C pointers, i.e. the C type T*).
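To make the Option<&T> case concrete, here is a minimal sketch of a function whose C-side parameter would be a nullable const Point *. The Point struct and both function names are hypothetical, and the functions are defined in Rust so the example is self-contained:

```rust
/// Hypothetical struct shared with C; #[repr(C)] fixes its field layout.
#[repr(C)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

/// A null pointer from the C side arrives as None, anything else as Some.
/// The C caller must still pass either null or a valid, aligned Point.
pub extern "C" fn point_sum(p: Option<&Point>) -> i32 {
    match p {
        Some(point) => point.x + point.y,
        None => 0,
    }
}

/// The same function with a raw pointer and an explicit null check.
pub extern "C" fn point_sum_raw(p: *const Point) -> i32 {
    // SAFETY: callers promise `p` is either null or points to a valid Point.
    point_sum(unsafe { p.as_ref() })
}
```

Both versions behave the same for null and for valid pointers. The Option<&Point> form moves the validity requirement into the type, so it only fits an API whose spec really does promise "null or valid".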