I've gathered that references are not the same as pointers: they differ in some important ways, such as alignment guarantees, non-null guarantees, and aliasing guarantees. As such, if a function intended to be used across an FFI boundary takes a raw pointer, you should typically use a raw pointer in its signature to avoid potential issues.
However, are there any situations in which it would be acceptable to use a reference (or something like Option<&T>) in such a function signature? For example, if you are binding an API with a well-defined spec that says something like "*T must be valid or it's undefined behavior", could you still run into issues with a reference in place of a pointer in a signature?
CodePudding user response:
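It is safe to use plain references instead of pointers, as long as the pointer behaves as the reference would: it must be non-null, properly aligned, point to a valid T, and respect Rust's aliasing rules for as long as the reference exists. As the nomicon says: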
References can safely be assumed to be non-nullable pointers directly to the type. However, breaking the borrow checking or mutability rules is not guaranteed to be safe, so prefer using raw pointers (*) if that's needed because the compiler can't make as many assumptions about them.
Also, as the reference says:
Pointers and references have the same layout.
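To make the layout claim concrete, here is a minimal sketch, not taken from the nomicon or the reference. The function names (read_or_default, layout_matches, call_like_c) are hypothetical, and the "C side" is simulated in Rust so the snippet stays self-contained. It assumes the caller only ever passes null or a valid, aligned pointer.

```rust
use std::mem::size_of;

/// Hypothetical function exported to C. On the C side it would be declared
/// roughly as `int32_t read_or_default(const int32_t *value);`.
/// `Option<&i32>` lets a null pointer arrive as `None`.
pub extern "C" fn read_or_default(value: Option<&i32>) -> i32 {
    match value {
        Some(v) => *v,
        None => -1,
    }
}

/// Checks that a reference, an optional reference, and a raw pointer
/// all occupy the same number of bytes.
pub fn layout_matches() -> bool {
    size_of::<&i32>() == size_of::<*const i32>()
        && size_of::<Option<&i32>>() == size_of::<*const i32>()
}

/// Simulates what a C caller does: it hands over a raw pointer.
///
/// # Safety
/// `ptr` must be null, or aligned and pointing to an initialized `i32`
/// that is not mutated while this call runs.
pub unsafe fn call_like_c(ptr: *const i32) -> i32 {
    // `as_ref` maps null to `None` and anything else to `Some(&i32)`.
    let arg: Option<&i32> = unsafe { ptr.as_ref() };
    read_or_default(arg)
}
```

The signature with Option<&i32> is only as sound as the promise in the safety comment: if the foreign caller can pass a dangling or misaligned pointer, a raw pointer in the signature is the safer choice.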
Problems will arise if you have a shared reference &T and the function behind FFI does not treat it as a const T*, that is, if it mutates the T or anything that aliases with the &T (e.g. a field in a U that can be reached via the &T). That is because:
If you have a reference &T, then normally in Rust the compiler performs optimizations based on the knowledge that &T points to immutable data. Mutating that data, for example through an alias or by transmuting an &T into an &mut T, is considered undefined behavior.
That is, the compiler will always assume, by the mere existence of a &T, that the T can't change (not accounting for UnsafeCell). In practical terms, the compiler will assume that the FFI function will not change the T if a &T is passed as an argument. As advised by the nomicon, it's good practice to prefer *const T and *mut T over mere references when the other side may not uphold those rules, because the compiler makes fewer assumptions about raw pointers and still tracks where they came from (e.g. it rejects a direct cast from a &T to a *mut T). It is therefore also strongly advised that your FFI definitions are const-correct; e.g. a function that is declared to take a const T* but actually modifies that argument could be called with a &T or a *const T without the compiler being able to catch this.
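The const-correctness point can be sketched as follows. This is an illustration under assumptions, not a definitive pattern: bump is a hypothetical stand-in, written in Rust, for a C function such as `void bump(int *counter)` that writes through its argument, and bump_owned and bump_shared are invented helper names.

```rust
use std::cell::Cell;

/// Stand-in for a foreign function that mutates its argument.
/// Because it writes, the honest Rust-side type is `*mut i32`,
/// not `*const i32` and not `&i32`.
///
/// # Safety
/// `counter` must be non-null, aligned, and valid for reads and writes.
pub unsafe extern "C" fn bump(counter: *mut i32) {
    unsafe { *counter += 1 };
}

/// Fine: the pointer is derived from `&mut`, so writing through it is allowed.
pub fn bump_owned() -> i32 {
    let mut n = 41;
    unsafe { bump(&mut n) };
    n
}

/// If the value must be mutated while shared references to it exist,
/// it needs interior mutability. `Cell::as_ptr` yields a `*mut i32`
/// that may legitimately be written through.
pub fn bump_shared(n: &Cell<i32>) {
    unsafe { bump(n.as_ptr()) };
}

// What must NOT be done: take a plain `&i32`, cast it
// `as *const i32 as *mut i32`, and pass it to `bump`.
// That compiles, but writing through it is undefined behavior.
```

Declaring the parameter as *mut i32 is what forces callers into one of the two sound options above; had bump been declared with *const i32 or &i32, the compiler would accept a shared reference and the mutation would go unnoticed.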
When it comes to individual types, the interoperability is up to the type itself. The nomicon has a whole section about this. In general:
Rust guarantees that the layout of a struct is compatible with the platform's representation in C only if the #[repr(C)] attribute is applied to it.
This only gives you the guarantee that the type layout as one would write it in C will be the same as it is in Rust, though. You can't generally assume that the Rust type will behave as it would in C. However, in the specific case of Option or Box, the Rust implementation in std does provide such guarantees; see the std documentation for Option and for Box (Box<T> is guaranteed to be represented as a single pointer and is also ABI-compatible with C pointers, i.e. the C type T*).
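A short sketch of how these guarantees might be combined in practice. The Point type and the point_new, point_sum, and point_free functions are hypothetical names chosen for illustration; a C header for them would use `Point *` and `const Point *` in the corresponding positions.

```rust
/// `#[repr(C)]` fixes the field order and padding to what a C compiler
/// would produce for `struct Point { int32_t x; int32_t y; };`.
#[repr(C)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

/// Hands ownership of a heap allocation to the caller as a single pointer.
pub extern "C" fn point_new(x: i32, y: i32) -> Box<Point> {
    Box::new(Point { x, y })
}

/// Borrows the point; a null pointer from C arrives as `None`.
pub extern "C" fn point_sum(p: Option<&Point>) -> i32 {
    p.map_or(0, |p| p.x + p.y)
}

/// Takes ownership back and frees it; null is accepted and ignored.
/// The pointer must have come from `point_new` and must not be reused.
pub extern "C" fn point_free(p: Option<Box<Point>>) {
    drop(p);
}
```

Using Box<Point> here relies on the allocation having been made by Rust's allocator; memory obtained from C's malloc should not be passed in as a Box.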