Home Articles Tags

Last Updated: 7.May.2023

Tags: #rust, #smart-pointers

Vectors in Rust

It's the second article of the Rust smart pointers series. This time, it's vectors. If you're writting code in Rust, there's no way you can avoid using this type. Knowing the internals of one of the most frequently used type will improve our code a lot!

All the std code in this article is from the lastest Rust repo as I'm writting this article (07-May-2023). The implementation may change over time.

Definition

1#[stable(feature = "rust1", since = "1.0.0")]
2#[cfg_attr(not(test), rustc_diagnostic_item = "Vec")]
3#[rustc_insignificant_dtor]
4pub struct Vec<T, #[unstable(feature = "allocator_api", issue = "32838")] A: Allocator = Global> {
5    buf: RawVec<T, A>,
6    len: usize,
7}

Above is the definition of Vec<T>. Ignoring the macros for debug information, it consists of a buffer and a length. The buffer has type RawVec<T>, which looks like below.

1#[allow(missing_debug_implementations)]
2pub(crate) struct RawVec<T, A: Allocator = Global> {
3    ptr: Unique<T>,
4    cap: usize,
5    alloc: A,
6}

RawVec has 3 fields: pointer to the actual data, capacity of the buffer, and a memory allocator. Unique is a speical type of raw pointer.

A vector is very simple. It has a pointer to a buffer, a length and a capacity. Just like what we saw in Data Structure courses.

Initialization of a vector

What happens when we initialize a vector? The most common way to initialize a vector is using vec!. vec! is defined like below.

1#[cfg(all(not(no_global_oom_handling), not(test)))]
2#[macro_export]
3#[stable(feature = "rust1", since = "1.0.0")]
4#[rustc_diagnostic_item = "vec_macro"]
5#[allow_internal_unstable(rustc_attrs, liballoc_internals)]
6macro_rules! vec {
7    () => (
8        $crate::__rust_force_expr!($crate::vec::Vec::new())
9    );
10    ($elem:expr; $n:expr) => (
11        $crate::__rust_force_expr!($crate::vec::from_elem($elem, $n))
12    );
13    ($($x:expr),+ $(,)?) => (
14        $crate::__rust_force_expr!(<[_]>::into_vec(
15            // This rustc_box is not required, but it produces a dramatic improvement in compile
16            // time when constructing arrays with many elements.
17            #[rustc_box]
18            $crate::boxed::Box::new([$($x),+])
19        ))
20    );
21}

There are 3 cases of a vector initialization. If a programmer wants an empty vector, it calls vec::Vec::new(). It guarantees that it doesn't allocate any memory when an empty vector is initialized. If a programmer initializes vector like vec![x; 100], it calls vec::from_elem. Click the link to see the explanation of the function.

For the last branch, which is the most common way to define a vector, wraps a buffer with Box and calls into_vec.

1// library/alloc/src/slice.rs
2pub fn into_vec<T, A: Allocator>(b: Box<[T], A>) -> Vec<T, A> {
3    unsafe {
4        let len = b.len();
5        let (b, alloc) = Box::into_raw_with_allocator(b);
6        Vec::from_raw_parts_in(b as *mut T, len, len, alloc)
7    }
8}
9
10// library/alloc/src/vec/mod.rs
11impl<T, A: Allocator> Vec<T, A> {
12    #[inline]
13    #[unstable(feature = "allocator_api", issue = "32838")]
14    pub unsafe fn from_raw_parts_in(ptr: *mut T, length: usize, capacity: usize, alloc: A) -> Self {
15        unsafe { Vec { buf: RawVec::from_raw_parts_in(ptr, capacity, alloc), len: length } }
16    }
17}

into_vec extracts information needed for the initialization of a vector, then calls Vec::from_raw_parts_in. An interesting part here is that the capacity and the length are the same. It doesn't allocate additional space for future push operations.

Read an element of a vector

How does it access an element of a vector? It goes through quite complicated steps. Below are some important functions in the steps. The functions below are called from top to bottom.

1// library/alloc/src/vec/mod.rs
2impl<T, I: SliceIndex<[T]>, A: Allocator> Index<I> for Vec<T, A> {
3    type Output = I::Output;
4
5    #[inline]
6    fn index(&self, index: I) -> &Self::Output {
7        Index::index(&**self, index)
8    }
9}
10
11// core/src/slice/index.rs
12#[stable(feature = "slice_get_slice_impls", since = "1.15.0")]
13#[rustc_const_unstable(feature = "const_slice_index", issue = "none")]
14unsafe impl<T> SliceIndex<[T]> for usize {
15    type Output = T;
16
17    #[inline]
18    fn get(self, slice: &[T]) -> Option<&T> {
19        // SAFETY: `self` is checked to be in bounds.
20        if self < slice.len() { unsafe { Some(&*self.get_unchecked(slice)) } } else { None }
21    }
22
23    #[inline]
24    unsafe fn get_unchecked(self, slice: *const [T]) -> *const T {
25        let this = self;
26        // SAFETY: the caller guarantees that `slice` is not dangling, so it
27        // cannot be longer than `isize::MAX`. They also guarantee that
28        // `self` is in bounds of `slice` so `self` cannot overflow an `isize`,
29        // so the call to `add` is safe.
30        unsafe {
31            assert_unsafe_precondition!(
32                "slice::get_unchecked requires that the index is within the slice",
33                [T](this: usize, slice: *const [T]) => this < slice.len()
34            );
35            slice.as_ptr().add(self)
36        }
37    }
38
39}

It first checks whether the index is in bounds. If so, it returns the pointer of the desired element by adding the index to the pointer of the buffer.

Push an element to a vector

Pushing an element is more complex than reading/initializing because it may change the metadata of a vector. Let's see how .push is defined.

1#[cfg(not(no_global_oom_handling))]
2#[inline]
3#[stable(feature = "rust1", since = "1.0.0")]
4pub fn push(&mut self, value: T) {
5    // This will panic or abort if we would allocate > isize::MAX bytes
6    // or if the length increment would overflow for zero-sized types.
7    if self.len == self.buf.capacity() {
8        self.buf.reserve_for_push(self.len);
9    }
10    unsafe {
11        let end = self.as_mut_ptr().add(self.len);
12        ptr::write(end, value);
13        self.len += 1;
14    }
15}

The push method itself is very simple. It checks whether the capacity is available, then writes a new value at the end of the buffer. If there's no enough capacity, it calls .reserve_for_push. It's what we're interested in.

1impl<T, A: Allocator> RawVec<T, A> {
2
3    pub(crate) const MIN_NON_ZERO_CAP: usize = if mem::size_of::<T>() == 1 {
4        8
5    } else if mem::size_of::<T>() <= 1024 {
6        4
7    } else {
8        1
9    };
10
11    /// A specialized version of `reserve()` used only by the hot and
12    /// oft-instantiated `Vec::push()`, which does its own capacity check.
13    #[cfg(not(no_global_oom_handling))]
14    #[inline(never)]
15    pub fn reserve_for_push(&mut self, len: usize) {
16        handle_reserve(self.grow_amortized(len, 1));
17    }
18
19}
20
21impl<T, A: Allocator> RawVec<T, A> {
22
23    fn set_ptr_and_cap(&mut self, ptr: NonNull<[u8]>, cap: usize) {
24        // Allocators currently return a `NonNull<[u8]>` whose length matches
25        // the size requested. If that ever changes, the capacity here should
26        // change to `ptr.len() / mem::size_of::<T>()`.
27        self.ptr = unsafe { Unique::new_unchecked(ptr.cast().as_ptr()) };
28        self.cap = cap;
29    }
30
31    // This method is usually instantiated many times. So we want it to be as
32    // small as possible, to improve compile times. But we also want as much of
33    // its contents to be statically computable as possible, to make the
34    // generated code run faster. Therefore, this method is carefully written
35    // so that all of the code that depends on `T` is within it, while as much
36    // of the code that doesn't depend on `T` as possible is in functions that
37    // are non-generic over `T`.
38    fn grow_amortized(&mut self, len: usize, additional: usize) -> Result<(), TryReserveError> {
39        // This is ensured by the calling contexts.
40        debug_assert!(additional > 0);
41
42        if T::IS_ZST {
43            // Since we return a capacity of `usize::MAX` when `elem_size` is
44            // 0, getting to here necessarily means the `RawVec` is overfull.
45            return Err(CapacityOverflow.into());
46        }
47
48        // Nothing we can really do about these checks, sadly.
49        let required_cap = len.checked_add(additional).ok_or(CapacityOverflow)?;
50
51        // This guarantees exponential growth. The doubling cannot overflow
52        // because `cap <= isize::MAX` and the type of `cap` is `usize`.
53        let cap = cmp::max(self.cap * 2, required_cap);
54        let cap = cmp::max(Self::MIN_NON_ZERO_CAP, cap);
55
56        let new_layout = Layout::array::<T>(cap);
57
58        // `finish_grow` is non-generic over `T`.
59        let ptr = finish_grow(new_layout, self.current_memory(), &mut self.alloc)?;
60        self.set_ptr_and_cap(ptr, cap);
61        Ok(())
62    }
63
64}
65
66#[inline(never)]
67fn finish_grow<A>(
68    new_layout: Result<Layout, LayoutError>,
69    current_memory: Option<(NonNull<u8>, Layout)>,
70    alloc: &mut A,
71) -> Result<NonNull<[u8]>, TryReserveError>
72where
73    A: Allocator,
74{
75    // Check for the error here to minimize the size of `RawVec::grow_*`.
76    let new_layout = new_layout.map_err(|_| CapacityOverflow)?;
77
78    alloc_guard(new_layout.size())?;
79
80    let memory = if let Some((ptr, old_layout)) = current_memory {
81        debug_assert_eq!(old_layout.align(), new_layout.align());
82        unsafe {
83            // The allocator checks for alignment equality
84            intrinsics::assume(old_layout.align() == new_layout.align());
85            alloc.grow(ptr, old_layout, new_layout)
86        }
87    } else {
88        alloc.allocate(new_layout)
89    };
90
91    memory.map_err(|_| AllocError { layout: new_layout, non_exhaustive: () }.into())
92}
93
94// Central function for reserve error handling.
95#[cfg(not(no_global_oom_handling))]
96#[inline]
97fn handle_reserve(result: Result<(), TryReserveError>) {
98    match result.map_err(|e| e.kind()) {
99        Err(CapacityOverflow) => capacity_overflow(),
100        Err(AllocError { layout, .. }) => handle_alloc_error(layout),
101        Ok(()) => { /* yay */ }
102    }
103}

Well, that's quite long. When a buffer doesn't have any more space, it calls reserve_for_push to grow the size of the vector. It first calls grow_amortized, which determines the capacity of the new vector, and allocates the actual memory. handle_reserve checks whether the allocation was successful.

grow_amortized takes 2 arguments: current length and the number of additional elements of the vector. For our push case, additional is 1. It first tries to double the size of the buffer. If the doubled buffer is still not big enough, it makes the buffer size exactly len + additional. If the new capacity is smaller than the minial required capacity, it grows the capacity again. MIN_NON_ZERO_CAP is defined at line 3 above. It's 8 for byte-sized types, 4 for small types and 1 for big ones. finish_grow actually allocates the new memory, and makes sure that the newly allocated memory is valid.

In case you wonder what the line 42 means...

For 0-sized types, it doesn't allocate any space. It sets capacity to be usize::MAX for 0-sized types. Since this function is called only when length >= capacity, line 42 is true only when the programmer tries to push more than usize::MAX elements to a vector. It returns Err(CapacityOverflow) in such case.

For Rust programmers

Above sections contain some useful information for Rust programmers. First, when we initialize a vector using vec!, it doesn't allocate additional space. The length and the capacity are the same. When we push an element, a vector is guaranteed to have a capacity of at least 4 elements (for most types). If the capacity is to grow, it's doubled.

Appendix

vec::Vec::new()

1#[inline]
2#[rustc_const_stable(feature = "const_vec_new", since = "1.39.0")]
3#[stable(feature = "rust1", since = "1.0.0")]
4#[must_use]
5pub const fn new() -> Self {
6    Vec { buf: RawVec::NEW, len: 0 }
7}

Below is the definition of RawVec::NEW.

1impl<T> RawVec<T, Global> {
2    /// HACK(Centril): This exists because stable `const fn` can only call stable `const fn`, so
3    /// they cannot call `Self::new()`.
4    ///
5    /// If you change `RawVec<T>::new` or dependencies, please take care to not introduce anything
6    /// that would truly const-call something unstable.
7    pub const NEW: Self = Self::new();
8
9    /// Creates the biggest possible `RawVec` (on the system heap)
10    /// without allocating. If `T` has positive size, then this makes a
11    /// `RawVec` with capacity `0`. If `T` is zero-sized, then it makes a
12    /// `RawVec` with capacity `usize::MAX`. Useful for implementing
13    /// delayed allocation.
14    #[must_use]
15    pub const fn new() -> Self {
16        Self::new_in(Global)
17    }
18}

As the comment says, it guarantees that Vec::new() doesn't allocate any memory.

vec::from_elem()

1impl<T: Clone> SpecFromElem for T {
2    default fn from_elem<A: Allocator>(elem: Self, n: usize, alloc: A) -> Vec<Self, A> {
3        let mut v = Vec::with_capacity_in(n, alloc);
4        v.extend_with(n, ExtendElement(elem));
5        v
6    }
7}

It calls .extend_with with the element and n. ExtendElement is a generator that endlessly clones elem. That's why T has to satisfy Clone.

1impl<T, A: Allocator> Vec<T, A> {
2    #[cfg(not(no_global_oom_handling))]
3    /// Extend the vector by `n` values, using the given generator.
4    fn extend_with<E: ExtendWith<T>>(&mut self, n: usize, mut value: E) {
5        self.reserve(n);
6
7        unsafe {
8            let mut ptr = self.as_mut_ptr().add(self.len());
9            // Use SetLenOnDrop to work around bug where compiler
10            // might not realize the store through `ptr` through self.set_len()
11            // don't alias.
12            let mut local_len = SetLenOnDrop::new(&mut self.len);
13
14            // Write all elements except the last one
15            for _ in 1..n {
16                ptr::write(ptr, value.next());
17                ptr = ptr.add(1);
18                // Increment the length in every step in case next() panics
19                local_len.increment_len(1);
20            }
21
22            if n > 0 {
23                // We can write the last element directly without cloning needlessly
24                ptr::write(ptr, value.last());
25                local_len.increment_len(1);
26            }
27
28            // len set by scope guard
29        }
30    }
31}

As mentioned above, ExtendWith is an endless generator. .next() at line 16 clones value.

Unique

The explanation below is from the comment in the compiler source code.

1#[unstable(
2    feature = "ptr_internals",
3    issue = "none",
4    reason = "use `NonNull` instead and consider `PhantomData<T>` \
5              (if you also use `#[may_dangle]`), `Send`, and/or `Sync`"
6)]
7#[doc(hidden)]
8#[repr(transparent)]
9pub struct Unique<T: ?Sized> {
10    pointer: NonNull<T>,
11    // NOTE: this marker has no consequences for variance, but is necessary
12    // for dropck to understand that we logically own a `T`.
13    //
14    // For details, see:
15    // https://github.com/rust-lang/rfcs/blob/master/text/0769-sound-generic-drop.md#phantom-data
16    _marker: PhantomData<T>,
17}

A wrapper around a raw non-null *mut T that indicates that the possessor of this wrapper owns the referent. Useful for building abstractions like Box<T>, Vec<T>, String, and HashMap<K, V>.

Unlike *mut T, Unique<T> behaves "as if" it were an instance of T. It implements Send/Sync if T is Send/Sync. It also implies the kind of strong aliasing guarantees an instance of T can expect: the referent of the pointer should not be modified without a unique path to its owning Unique.

If you're uncertain of whether it's correct to use Unique for your purposes, consider using NonNull, which has weaker semantics.

Unlike *mut T, the pointer must always be non-null, even if the pointer is never dereferenced. This is so that enums may use this forbidden value as a discriminant -- Option<Unique<T>> has the same size as Unique<T>. However the pointer may still dangle if it isn't dereferenced.

Unlike *mut T, Unique<T> is covariant over T. This should always be correct for any type which upholds Unique's aliasing requirements.