Let us say we have this C code:
typedef struct A { int i; } A;
typedef struct B { A* a; } B;
typedef struct C { A* a; } C;
int main(void)
{
    A a = { .i = 42 };
    B b = { .a = &a };
    C c = { .a = &a };
}
In this scenario A is stack allocated, B and C point to the stack allocated memory where A lives.
I need to do exactly the same thing in Rust, but every time I try to create multiple mutable references, the compiler complains.
It's a little frustrating having to fight the language to accomplish something so basic.
CodePudding user response:
"It's a little frustrating having to fight the language to accomplish something so basic."
It's not as basic as you think. Rust's main premise is to have zero undefined behaviour in safe code, and it's almost impossible to have two mutable references to the same data simultaneously while upholding that guarantee. How would you make sure that, once multithreading is involved, you don't accidentally get a data race? A data race already is undefined behaviour that might be exploitable for malicious means.
Learning Rust is not easy, and it is especially hard if you come from a different language, as many programming paradigms simply don't work in Rust. But I can assure you that once you understand how to structure code differently, it will actually become a positive thing, because Rust forces programmers to distance themselves from questionable patterns, or patterns that seem fine but need a second glance to understand what is actually wrong with them. C/C++ bugs are usually very subtle and caused by some weird corner case, and after programming in Rust for a while, it is incredibly rewarding to have the assurance that those corner cases simply don't exist.
But back to your problem.
There are two language concepts here that need to be combined to achieve what you are trying to do.
First, the borrow checker forces you to have only one mutable reference to a specific piece of data at a time. That means, if you definitely want to modify it from multiple places, you will have to utilize a concept called interior mutability. Depending on your use case, there are several ways to create interior mutability:
Cell - single-threaded, for primitive types that can be replaced by being copied. This is a zero-cost abstraction.
RefCell - single-threaded, for more complex types that require a mutable reference instead of being updatable by replacement. Minimal overhead to check if it is already borrowed.
Atomic - multi-threaded, for primitive types. In most cases a zero-cost abstraction (on x86-64, aligned loads and stores of everything up to u64/i64 are already atomic out of the box, so they need no extra overhead).
Mutex - like RefCell, but for multiple threads. Larger overhead due to active internal lock management.
So depending on your use case, you need to choose the right one. In your case, if your data is really an int, I'd go with a Cell or an Atomic.
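As a rough sketch of how the four options above look in practice, each helper below changes a value through a plain shared reference. The helper names and the choice of i32 and Vec<i32> are made up for illustration, and AtomicI32 stands in for the Atomic family.

```rust
use std::cell::{Cell, RefCell};
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::Mutex;

// Hypothetical helpers: each one mutates through a shared `&` reference.

fn bump_cell(value: &Cell<i32>) {
    // Cell: copy the value out, then write a new value back in.
    value.set(value.get() + 1);
}

fn push_refcell(list: &RefCell<Vec<i32>>) {
    // RefCell: hands out a `&mut Vec<i32>` after a runtime borrow check.
    list.borrow_mut().push(1);
}

fn bump_atomic(value: &AtomicI32) {
    // Atomic: a read-modify-write that may be called from several threads.
    value.fetch_add(1, Ordering::Relaxed);
}

fn push_mutex(list: &Mutex<Vec<i32>>) {
    // Mutex: like RefCell, but waits for a lock so it works across threads.
    list.lock().unwrap().push(1);
}

fn interior_mutability_demo() -> (i32, usize, i32, usize) {
    let cell = Cell::new(41);
    let refcell = RefCell::new(Vec::new());
    let atomic = AtomicI32::new(41);
    let mutex = Mutex::new(Vec::new());

    bump_cell(&cell);
    push_refcell(&refcell);
    bump_atomic(&atomic);
    push_mutex(&mutex);

    let refcell_len = refcell.borrow().len();
    let mutex_len = mutex.lock().unwrap().len();
    (cell.get(), refcell_len, atomic.load(Ordering::Relaxed), mutex_len)
}

fn main() {
    println!("{:?}", interior_mutability_demo()); // prints (42, 1, 42, 1)
}
```

None of the bindings in the demo are declared mut, which is the point: the mutation happens behind the shared reference.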
Second, there is the problem of how to get multiple (immutable) references to your object in the first place.
Right away, I would like to tell you: Do not use raw pointers prematurely. Raw pointers and unsafe bypass the borrow checker and make Rust as a language pointless. 99.9% of problems can be solved cleanly and efficiently without raw pointers, so only use them in circumstances where absolutely no alternative exists.
That said, there are three general ways to share data:
&A - Normal reference. While the reference exists, the referenced object cannot be moved or deleted. So this is probably not what you want.
Rc<A> - Single-threaded reference counter. Very lightweight, so don't worry about overhead. Accessing the data is a zero-cost abstraction; additional cost only arises when you copy/delete the actual Rc object. Moving the Rc object should theoretically be free, as this doesn't change the reference count.
Arc<A> - Multi-threaded reference counter. Like Rc, the actual access is zero-cost, but the cost of copying/deleting the Arc object itself is minimally higher than for Rc. Moving the Arc object should theoretically be free, as this doesn't change the reference count.
So assuming that you have a single-threaded program and the problem is exactly as you laid it out, I'd do:
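To make the point about copying versus moving concrete, here is a small sketch that watches the reference count with Rc::strong_count. The function name rc_count_demo is invented for this example.

```rust
use std::rc::Rc;

// Hypothetical demo: returns the strong count after a clone, a move and a drop.
fn rc_count_demo() -> (usize, usize, usize) {
    let first = Rc::new(42);

    // Cloning creates a second handle and increments the count to 2.
    let second = Rc::clone(&first);
    let after_clone = Rc::strong_count(&first);

    // Moving a handle only transfers ownership; the count stays at 2.
    let moved = second;
    let after_move = Rc::strong_count(&first);

    // Dropping a handle decrements the count again.
    drop(moved);
    let after_drop = Rc::strong_count(&first);

    (after_clone, after_move, after_drop)
}

fn main() {
    println!("{:?}", rc_count_demo()); // prints (2, 2, 1)
}
```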
use std::{cell::Cell, rc::Rc};

struct A {
    i: Cell<i32>,
}

struct B {
    a: Rc<A>,
}

struct C {
    a: Rc<A>,
}

fn main() {
    // `a`, `b.a` and `c.a` all point to the same heap-allocated A.
    let a = Rc::new(A { i: Cell::new(42) });
    let b = B { a: Rc::clone(&a) };
    let c = C { a: Rc::clone(&a) };

    b.a.i.set(69);
    c.a.i.set(c.a.i.get() + 2);

    println!("{}", a.i.get());
}
71
But of course all the other combinations, like Rc + Atomic, Arc + Atomic, Arc + Mutex etc. are also viable. It depends on your use case.
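For instance, a multi-threaded variant of the same idea could combine Arc with an atomic integer. This is only a sketch: the thread count of 4 and the function name threaded_demo are arbitrary choices for illustration.

```rust
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::Arc;
use std::thread;

struct A {
    i: AtomicI32,
}

// Hypothetical demo: four threads each add 1 to the shared counter.
fn threaded_demo() -> i32 {
    let a = Arc::new(A { i: AtomicI32::new(42) });

    let handles: Vec<_> = (0..4)
        .map(|_| {
            // Each thread gets its own Arc handle to the same A.
            let a = Arc::clone(&a);
            thread::spawn(move || {
                a.i.fetch_add(1, Ordering::Relaxed);
            })
        })
        .collect();

    // Joining guarantees that every increment is visible afterwards.
    for handle in handles {
        handle.join().unwrap();
    }

    a.i.load(Ordering::Relaxed)
}

fn main() {
    println!("{}", threaded_demo()); // prints 46
}
```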
If your b and c objects provably live shorter than a (meaning, if they only exist for a couple of lines of code and don't get moved anywhere else) then of course use a reference instead of Rc. The biggest performance difference between Rc and a direct reference is that the object inside of an Rc lives on the heap, not on the stack, so it's the equivalent of calling new/delete once in C++.
So, for reference, if your data sharing allows the object to live on the stack, like in our example, then the code would look like this:
use std::cell::Cell;

struct A {
    i: Cell<i32>,
}

struct B<'a> {
    a: &'a A,
}

struct C<'a> {
    a: &'a A,
}

fn main() {
    // `a` lives on the stack; `b` and `c` merely borrow it.
    let a = A { i: Cell::new(42) };
    let b = B { a: &a };
    let c = C { a: &a };

    b.a.i.set(69);
    c.a.i.set(c.a.i.get() + 2);

    println!("{}", a.i.get());
}
71
Note that the reference is zero-cost and Cell is also zero-cost, so this code will perform identically to a version using raw pointers, with the difference that the borrow checker can now prove that this will not cause undefined behaviour.
To demonstrate just how zero-cost this is, look at the assembly output of the example above. The compiler managed to optimize the entire code to:
fn main() {
    println!("{}", 71);
}
Be aware that in your C example, nothing would prevent you from copying the b object somewhere else while a goes out of scope and gets destroyed. This would cause undefined behaviour, and the borrow checker prevents it in Rust, which is the reason the structs B and C carry the lifetime 'a to track the fact that they borrow an A.
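A short sketch of what that lifetime tracking buys you: the first function below compiles because b never outlives a, while the commented-out function (a made-up counterexample) is the kind of code the compiler rejects, since the returned B would still refer to a local variable that has already been destroyed.

```rust
use std::cell::Cell;

struct A {
    i: Cell<i32>,
}

struct B<'a> {
    a: &'a A,
}

// Fine: `b` is created and dropped while `a` is still alive.
fn borrow_within_scope() -> i32 {
    let a = A { i: Cell::new(42) };
    let b = B { a: &a };
    b.a.i.set(43);
    a.i.get()
}

// Hypothetical counterexample, left commented out because it does not compile:
// the returned `B` would hold a reference to `a` after `a` is gone.
//
// fn escape() -> B<'static> {
//     let a = A { i: Cell::new(42) };
//     B { a: &a }
// }

fn main() {
    println!("{}", borrow_within_scope()); // prints 43
}
```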
Lastly, I would like to talk about unsafe code. Yes, if you interface with C or write low-level drivers that require direct memory access, unsafe is absolutely important to have. That said, it's important that you understand how to deal with unsafe in order to still uphold Rust's safety guarantees. Otherwise, there really isn't any point in using Rust. Don't just use unsafe out of convenience to simply overrule the borrow checker, but instead make sure that the resulting unsafe usage is sound. Please read this article about soundness before you use the unsafe keyword.
I hope this managed to give you a glimpse of what kind of thinking is required for programming in Rust, and I hope it did not intimidate you too much. Give it a chance; while it has a fairly steep learning curve, especially for programmers with strong prior knowledge in other languages, it can be very rewarding.
CodePudding user response:
This may not apply to your specific situation as indicated in the comments, but the general way to solve this in Rust is with a RefCell. This type allows you to obtain a &mut T from a &RefCell<T>. For example:
use std::cell::RefCell;

struct A(pub RefCell<i32>);
struct B<'a>(pub &'a RefCell<i32>);

fn main() {
    let a = A(RefCell::new(0));
    let b = B(&a.0);
    let c = B(&a.0);

    *b.0.borrow_mut() = 1;
    println!("{}", c.0.borrow());

    *c.0.borrow_mut() = 2;
    println!("{}", b.0.borrow());
}
Note that RefCell has overhead in the form of a borrow count, required to enforce Rust's aliasing rules at runtime (it will panic if a mutable borrow exists at the same time as any other borrow).
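To see that runtime check without triggering a panic, one option is try_borrow_mut, which reports a conflicting borrow as an Err instead. The function name below is invented for this sketch.

```rust
use std::cell::RefCell;

// Hypothetical demo: returns (was the mutable borrow refused?, was it granted later?).
fn refcell_conflict_demo() -> (bool, bool) {
    let value = RefCell::new(0);

    // While a shared borrow is alive, a mutable borrow is refused.
    let reader = value.borrow();
    let refused = value.try_borrow_mut().is_err();

    // Once the shared borrow is dropped, the mutable borrow succeeds.
    drop(reader);
    let granted = value.try_borrow_mut().is_ok();

    (refused, granted)
}

fn main() {
    println!("{:?}", refcell_conflict_demo()); // prints (true, true)
}
```

Calling borrow_mut() instead of try_borrow_mut() at the point where the reader is still alive would panic.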
If the underlying type is Copy and you don't need references to the inner value then you can use Cell, which does not have any runtime overhead, as each operation fully retrieves or replaces the contained value:
use std::cell::Cell;

struct A(pub Cell<i32>);
struct B<'a>(pub &'a Cell<i32>);

fn main() {
    let a = A(Cell::new(0));
    let b = B(&a.0);
    let c = B(&a.0);

    b.0.set(1);
    println!("{}", c.0.get());

    c.0.set(2);
    println!("{}", b.0.get());
}
Note that Cell is #[repr(transparent)], which is particularly useful in systems programming as it permits some types of zero-cost transmuting between different types.
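One place where this layout guarantee shows up in safe code, assuming a reasonably recent toolchain, is the pair Cell::from_mut and as_slice_of_cells from the standard library, which reinterpret a &mut [i32] as a &[Cell<i32>] without copying. The function name and the sample values below are made up for illustration.

```rust
use std::cell::Cell;

// Hypothetical demo: mutate two elements of an array through shared Cell references.
fn slice_of_cells_demo() -> [i32; 3] {
    let mut values = [1, 2, 3];
    {
        // View the exclusive slice as a slice of cells; no data is copied.
        let slice: &mut [i32] = &mut values;
        let cells: &[Cell<i32>] = Cell::from_mut(slice).as_slice_of_cells();

        // Two shared references into the same array, both usable for writing.
        let first = &cells[0];
        let last = &cells[2];
        first.set(last.get() * 10); // 3 * 10 = 30
        last.set(first.get() + 1); // 30 + 1 = 31
    }
    values
}

fn main() {
    println!("{:?}", slice_of_cells_demo()); // prints [30, 2, 31]
}
```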