I am completely new to Rust (as in I just started looking at it yesterday), and am working my way through "The Rust Programming Language". I'm a little stuck on Chapters 4.2 (References and Borrowing) / 4.3 (The Slice Type) and am trying to solidify my initial understanding of references before I move on. I'm an experienced programmer whose background is mainly in C (I am intimately familiar with several languages, but C is what I'm most comfortable with).
Consider the following Rust code:
let string_obj: String = String::from("My String");
let string_ref: &String = &string_obj;
let string_slice: &str = &string_obj[1..=5];
Based on my understanding, from the first line, string_obj is an object of type String that is stored on the stack, which contains three fields: (1) a pointer to the text "My String", allocated on the heap, encoded in UTF-8; (2) a length field with value 9; (3) a capacity field with a value >= 9. That's straightforward enough.
From the second line, string_ref is an immutable reference to a String object, also stored on the stack, which contains a single field: a pointer to string_obj. This leads me to believe that (leaving aside ownership rules, semantics, and other things I have yet to learn about references) a reference is essentially a pointer to some other object. Again, pretty straightforward.
It's the third line which is causing me some headaches. From the documentation, it would appear that string_slice is an object of type &str that is stored on the stack and contains two fields: (1) a pointer to the text "y Str", within the text "My String" associated with string_obj; (2) a length field with value 5.
But, by appearances at least, the &str type is by definition an immutable reference to an object of type str. So my questions are as follows:
- What exactly is an str, and how is it represented in memory?
- How does &str - a reference type, which I thought was simply a pointer - contain TWO fields (a pointer AND a length)?
- How does Rust know in general what / how many fields to create when constructing a reference? (and consequently how does the programmer know?)
CodePudding user response:
Slices are primitive types in Rust, which means that they don't necessarily have to follow the syntax rules of other types. In this case, str and &str are special and are treated with a bit of magic.
The type str can't be used by value on its own: it is an unsized type, meaning its length isn't known at compile time, so it always has to live behind some kind of pointer. The reason for requiring us to spell the borrowed form "&str" is syntactic: the & reminds us that we're working with data borrowed from somewhere else, and it's required to be able to specify lifetimes, such as:
fn example<'a>(x: &String, y: &'a String) -> &'a str {
    &y[..]
}
It's also necessary so that we can differentiate between an immutably-borrowed string slice (&str) and a mutably-borrowed string slice (&mut str). (Though the latter is somewhat limited in its usefulness, so you don't see it that often.)
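As a rough illustration of why &mut str is limited, here is a minimal sketch. The helper name shout is made up for this example. A mutable string slice can change bytes in place, but it can't grow, shrink, or otherwise break the UTF-8 encoding, so in-place ASCII case changes are about the most common use:

```rust
/// Hypothetical helper: uppercases the ASCII letters of a borrowed slice in place.
fn shout(s: &mut str) {
    // The slice keeps its length; only individual bytes are rewritten.
    s.make_ascii_uppercase();
}

fn main() {
    let mut string_obj = String::from("My String");
    // Mutably borrow bytes 1 through 5 ("y Str") of the String's buffer.
    shout(&mut string_obj[1..=5]);
    assert_eq!(string_obj, "MY STRing");
}
```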
Note that the same thing applies to array slices. We have arrays like [u8; 16] and we have slices like &[u8], but we don't really directly interact with [u8]. Here the mutable variant (&mut [u8]) is more useful than with string slices.
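A small sketch of the array case, with illustrative function names (sum and clear are not from any library). Both functions accept a borrowed view of any length, whether it covers the whole array or only part of it:

```rust
/// Hypothetical helper: adds up the bytes of any borrowed slice.
fn sum(bytes: &[u8]) -> u32 {
    bytes.iter().map(|&b| u32::from(b)).sum()
}

/// Hypothetical helper: zeroes the elements in place through a mutable slice.
fn clear(bytes: &mut [u8]) {
    for b in bytes.iter_mut() {
        *b = 0;
    }
}

fn main() {
    let mut array: [u8; 16] = [1; 16];
    // &[u8; 16] coerces to &[u8] at the call site.
    assert_eq!(sum(&array), 16);
    // A sub-range is also a &[u8], just with a different pointer and length.
    assert_eq!(sum(&array[4..8]), 4);
    // Zero the first half through a mutable slice.
    clear(&mut array[..8]);
    assert_eq!(sum(&array), 8);
}
```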
What exactly is an str, and how is it represented in memory?
As per above, you can't hold a str by value on its own; the data is just a run of UTF-8 bytes of some length that is only known at run time. The layout of &str, though, is as you suspect: a pointer and a length.
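To check the pointer-and-length picture against the code from the question, here is a minimal sketch. The helper slice_parts is a made-up name; it relies only on str::as_ptr and str::len from the standard library, and assumes the slice really does borrow from the given owner:

```rust
/// Hypothetical helper: returns (byte offset of `slice` inside `owner`,
/// length of `slice` in bytes). Assumes `slice` borrows from `owner`.
fn slice_parts(owner: &str, slice: &str) -> (usize, usize) {
    let offset = slice.as_ptr() as usize - owner.as_ptr() as usize;
    (offset, slice.len())
}

fn main() {
    let string_obj: String = String::from("My String");
    let string_slice: &str = &string_obj[1..=5];

    // The slice's pointer lands one byte into the String's heap buffer,
    // and its length field is 5.
    assert_eq!(slice_parts(&string_obj, string_slice), (1, 5));
    assert_eq!(string_slice, "y Str");
}
```

No bytes are copied here: string_slice is just a second view into the buffer that string_obj owns.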
How does &str - a reference type, which I thought was simply a pointer - contain TWO fields (a pointer AND a length)?
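It's a special case handled by the compiler: because str has no compile-time size, a reference to it is a "fat" pointer that stores the length alongside the address.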
How does Rust know in general what / how many fields to create when constructing a reference? (and consequently how does the programmer know?)
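If it's a reference to an ordinary sized type, then it's either a single pointer or it's nothing (if the reference itself can be optimized away). References to slices (str and [T]) additionally carry a length, and references to trait objects (&dyn Trait) carry a pointer to a vtable as their second field. The programmer knows which case applies from the type being pointed to.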
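One way to see this for yourself is to ask the compiler how wide each kind of reference is. The sketch below uses a made-up helper, reference_widths, and reports sizes in machine words; the two-word results reflect how current compilers lay out these references (an address plus a length or vtable pointer) rather than something to rely on in unsafe code:

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Hypothetical helper: width of several reference types, in machine words.
fn reference_widths() -> [usize; 5] {
    let word = size_of::<usize>();
    [
        size_of::<&u8>() / word,        // sized target: address only
        size_of::<&String>() / word,    // sized target: address only
        size_of::<&str>() / word,       // string slice: address + length
        size_of::<&[u8]>() / word,      // array slice: address + length
        size_of::<&dyn Debug>() / word, // trait object: address + vtable pointer
    ]
}

fn main() {
    assert_eq!(reference_widths(), [1, 1, 2, 2, 2]);
}
```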