Managing Memory in Rust

In the first part of this series, we mentioned that Rust has a different memory model than Haskell. The suggestion was that Rust allows more control over memory usage, like C++. In C++, we explicitly allocate memory on the heap with new and de-allocate it with delete. In Rust, memory is also allocated and de-allocated at specific, predictable points in our program. Thus it doesn't need garbage collection, as Haskell does. But it doesn't work quite the same way as C++.

In this part, we'll discuss the notion of ownership. This is the main concept governing Rust's memory model. Heap memory always has one owner, and once that owner goes out of scope, the memory gets de-allocated. We'll see how this works; if anything, it's a bit easier than C++!

For a more detailed look at getting started with Rust, take a look at our Rust video tutorial!

Scope (with Primitives)

Before we get into ownership, there are a couple ideas we want to understand. First, let's go over the idea of scope. If you code in Java, C, or C++, this should be familiar. We declare variables within a certain scope, like a for-loop or a function definition. When that block of code ends, the variable is out of scope. We can no longer access it.

#include <iostream>

int main() {
  for (int i = 0; i < 10; ++i) {
    int j = 0;

    // Do something with j...
  }

  // This doesn't work! j is out of scope!
  std::cout << j << std::endl;
}

Rust works the same way. When we declare a variable within a block, we cannot access it after the block ends. (In a language like Python, this is actually not the case!)

fn main() {

  let j: i32 = {
    let i = 14;
    i + 5
  };

  // i is out of scope. We can't use it anymore.

  println!("{}", j);
}

Another important thing to understand about primitive types is that we can copy them. Since they have a fixed size, and live on the stack, copying should be inexpensive. Consider:

fn main() {

  let mut i: i32 = 10;
  let j = i;
  i = 15;

  // Prints 15, 10
  println!("{}, {}", i, j);
}

The j variable is a full copy. Changing the value of i doesn't change the value of j. Now for the first time, let's talk about a non-primitive type, String.

The String Type

We've dealt with strings a little by using string literals. But string literals don't give us a complete string type. They have a fixed size. So even if we declare them as mutable, we can't do certain operations like append another string. This would change how much memory they use!

let mut my_string = "Hello";
my_string.append(" World!"); // << This doesn't exist for literals!

Instead, we can use the String type. This is a non-primitive object type that will allocate memory on the heap. Here's how we can use it and append to one:

let mut my_string = String::from("Hello");
my_string.push_str(" World!");
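To make the difference concrete, here's a minimal sketch of turning a fixed-size literal into a growable String. The greet function is a hypothetical helper made up for this illustration.

```rust
// Hypothetical helper: builds a growable String from string literals.
fn greet(name: &str) -> String {
    let mut greeting = String::from("Hello"); // heap-allocated and growable
    greeting.push_str(", ");                  // append a string slice
    greeting.push_str(name);
    greeting.push('!');                       // append a single character
    greeting
}

fn main() {
    let literal = "World"; // a string literal: fixed size, cannot grow
    let message = greet(literal);

    // Prints: Hello, World! (13 bytes)
    println!("{} ({} bytes)", message, message.len());
}
```

The literal itself never changes. Each push_str call grows the String's own heap buffer instead.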

Now let's consider some of the implications of scope with object types.

Scope with Strings

At a basic level, some of the same rules apply. If we declare a string within a block, we cannot access it after that block ends.

fn main() {

  let str_length = {
    let s = String::from("Hello");
    s.len()
  }; // s goes out of scope here

  // Fails!
  println!("{}", s);
}

What's cool is that once our string does go out of scope, Rust handles cleaning up the heap memory for it! We don't need to call delete as we would in C++. A type defines its memory cleanup by implementing the drop function of the Drop trait, and Rust calls that function automatically. We'll get into more details with this in a later article.

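C++ doesn't automatically de-allocate memory that we allocate with new! In the example below, we must delete myObject at the end of the for loop block. We can't de-allocate it after the loop, because the pointer is out of scope by then, so skipping the delete would leak memory!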

int main() {
  for (int i = 0; i < 10; ++i) {
    // Allocate myObject
    MyType* myObject = new MyType(i);

    // Do something with myObject …

    // We MUST delete myObject here or it will leak memory!
    delete myObject; 
  }

  // Can't delete myObject here!
}
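For contrast, here's a small sketch that lets us watch Rust run cleanup at the end of each block. The Noisy type and the run_scopes function are hypothetical names made up for this illustration. They record events in a log rather than freeing anything themselves.

```rust
use std::cell::RefCell;

// Hypothetical type that writes to a shared log when it gets dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Drop for Noisy<'a> {
    // Rust calls this automatically when a Noisy value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

// Creates values in nested blocks and returns the order of events.
fn run_scopes() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
            log.borrow_mut().push(String::from("end of inner block"));
        } // _inner is dropped here
        log.borrow_mut().push(String::from("end of outer block"));
    } // _outer is dropped here
    log.into_inner()
}

fn main() {
    for event in run_scopes() {
        println!("{}", event);
    }
}
```

Running this prints "end of inner block", "dropped inner", "end of outer block", and "dropped outer", in that order. We never call drop ourselves.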

So it's neat that Rust handles deletion for us. But there are some interesting implications of this.

Copying Strings

What happens when we try to copy a string?

let len = {
  let s1 = String::from("Hello");
  let s2 = s1;
  s2.len()
};

This first version works fine. But we have to think about what will happen in this case:

let len = {
  let mut s1 = String::from("123");
  let mut s2 = s1;
  s1.push_str("456");
  s1.len() + s2.len()
};

For people coming from C++ or Java, there seem to be two possibilities. If copying into s2 is a shallow copy, we would expect the sum length to be 12. If it's a deep copy, the sum should be 9.

But this code won't compile at all in Rust! The reason is ownership.

Ownership

Deep copies are often much more expensive than the programmer intends. So a performance-oriented language like Rust avoids using deep copying by default. But let's think about what will happen if the example above is a simple shallow copy. When s1 and s2 go out of scope, Rust will call drop on both of them. And they will free the same memory! This kind of "double delete" is a big problem that can crash your program and cause security problems.

In Rust, here's what actually happens with the above code. Using let s2 = s1 copies only the small stack portion of the string (its pointer, length, and capacity). So s2 will point to the same heap memory. But at the same time, it will invalidate the s1 variable. Rust calls this a "move". Thus when we try to push values to s1, we'll be using a variable whose value has been moved away. This causes the compiler error.

At first, s1 "owns" the heap memory. So if s1 went out of scope, it would free the memory. But declaring s2 hands ownership of that memory over to the s2 variable. So s1 is now invalid, and only s2 will free the memory when it goes out of scope. Memory can only have one owner. This is the main idea to get familiar with.
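Here's a minimal sketch of a version of the earlier example that does compile, assuming we only touch the string through its new owner. The combined_len function name is made up for this illustration.

```rust
// Moves a string from s1 to s2, then keeps working with s2 only.
fn combined_len() -> usize {
    let s1 = String::from("123");
    let mut s2 = s1; // ownership moves to s2; s1 can no longer be used

    // s1.push_str("456"); // would not compile: s1's value was moved
    s2.push_str("456");    // fine: s2 is the owner now

    s2.len()
} // s2 goes out of scope here, and the heap memory is freed exactly once

fn main() {
    // Prints 6
    println!("{}", combined_len());
}
```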

Here's an important implication of this. In general, passing a variable to a function by value gives up ownership. In this example, after we pass s1 over to add_to_length, we can no longer use it.

fn main() {
  let s1 = String::from("Hello");
  let length = add_to_length(s1);

  // This is invalid! s1 has been moved into the function!
  println!("{}", s1);
}

// After this function, drop is called on s
// This deallocates the memory!
fn add_to_length(s: String) -> usize {
  5 + s.len()
}

This seems like it would be problematic. Won't we want to call different functions with the same variable as an argument? We could work around this by giving ownership back through the return value. This requires the function to return a tuple.

fn main() {
  let s1 = String::from("Hello");
  let (length, s2) = add_to_length(s1);

  // Works
  println!("{}", s2);
}

fn add_to_length(s: String) -> (usize, String) {
  (5 + s.len(), s)
}

But this is cumbersome. There's a better way.

Borrowing References

Like in C++, we can pass a variable by reference. We use the ampersand operator (&) for this. It allows another function to "borrow" the value, rather than "taking" ownership. When the function is done, the original variable will still be valid. In this example, the s1 variable keeps ownership of the memory the whole time, so we can still use it after the function call ends.

fn main() {
  let s1 = String::from("Hello");
  let length = add_to_length(&s1);

  // Works
  println!("{}", s1);
}

fn add_to_length(s: &String) -> usize {
  5 + s.len()
}
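Borrowing also answers the earlier question about calling different functions with the same variable. Here's a small sketch, with count_vowels and shout as hypothetical helper functions made up for this illustration.

```rust
// Both helpers borrow the string instead of taking ownership of it.
fn count_vowels(s: &String) -> usize {
    s.chars().filter(|c| "aeiouAEIOU".contains(*c)).count()
}

fn shout(s: &String) -> String {
    s.to_uppercase() // returns a new String; the original is untouched
}

fn main() {
    let s1 = String::from("Hello");

    // We can lend s1 out as many times as we like.
    let vowels = count_vowels(&s1);
    let loud = shout(&s1);

    // s1 is still valid here because it never gave up ownership.
    // Prints: Hello has 2 vowels; shouted: HELLO
    println!("{} has {} vowels; shouted: {}", s1, vowels, loud);
}
```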

This works like a const reference in C++. If you want a mutable reference, you can do this as well. The original variable must be mutable, and then you specify mut in the type signature.

fn main() {
  let mut s1 = String::from("Hello");
  let length = add_to_length(&mut s1);

  // Prints "Hello, World!"
  println!("{}", s1);
}

fn add_to_length(s: &mut String) -> usize {
  s.push_str(", World!");
  5 + s.len()
}

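There's one big catch though! You can only have a single mutable reference to a variable at a time, and you can't have any immutable references to it while that mutable reference is in use! Otherwise your code won't compile! This helps prevent a large category of bugs, such as data races!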
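As a rough sketch of this rule, the code below takes two mutable borrows one after the other, which is fine. The overlapping version is left in comments because it won't compile. The add_suffix and build_greeting names are made up for this illustration.

```rust
// Hypothetical helper that mutates a string through a mutable borrow.
fn add_suffix(s: &mut String, suffix: &str) {
    s.push_str(suffix);
}

fn build_greeting() -> String {
    let mut s = String::from("Hello");

    // Overlapping mutable borrows are rejected by the compiler:
    //
    // let r1 = &mut s;
    // let r2 = &mut s;   // error: cannot borrow `s` as mutable more than once
    // r1.push_str("!");  // r1 is still in use, so the two borrows overlap

    let r1 = &mut s;
    add_suffix(r1, ", "); // last use of r1, so its borrow ends here

    let r2 = &mut s;      // fine: no other mutable borrow is still in use
    add_suffix(r2, "World!");

    s
}

fn main() {
    // Prints: Hello, World!
    println!("{}", build_greeting());
}
```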

As a final note, if you want to do a true deep copy of an object, you should use the clone function.

fn main() {
  let s1 = String::from("Hello");
  let s2 = s1.clone();

  // Works!
  println!("{}", s1);
  println!("{}", s2);
}
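With clone, we can finally get the "deep copy" answer from the earlier puzzle. Here's a minimal sketch, with lengths_after_clone as a made-up function name for this illustration.

```rust
// Clones a string, changes the original, and returns both lengths.
fn lengths_after_clone() -> (usize, usize) {
    let mut s1 = String::from("123");
    let s2 = s1.clone(); // deep copy: s2 gets its own heap buffer

    s1.push_str("456");  // only s1 changes

    (s1.len(), s2.len())
}

fn main() {
    let (len1, len2) = lengths_after_clone();

    // Prints: 6 + 3 = 9
    println!("{} + {} = {}", len1, len2, len1 + len2);
}
```

Since each variable owns its own memory, both stay valid and each one frees only its own buffer when it goes out of scope.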

Notes On Slices

We can wrap up with a couple thoughts on slices. A slice gives us a reference to a contiguous part of an array or other sequence, and its length is fixed once we create it. Slices are immutable by default, though mutable slices exist as well. Often, we can use the string slice type &str as a view into part of a String object. String literals also have the type &str. A slice always refers to data that lives somewhere else, such as another object or, for literals, the program's binary. This means slices do not have ownership and thus do not de-allocate memory when they go out of scope.
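Here's a short sketch of both kinds of slices in action. The first_word and sum functions are hypothetical helpers made up for this illustration.

```rust
// Returns the first word of the input as a slice borrowed from it.
// No new string is allocated.
fn first_word(s: &str) -> &str {
    match s.find(' ') {
        Some(idx) => &s[..idx], // everything before the first space
        None => s,              // no space, so the whole input is one word
    }
}

// Adds up the elements of a slice without taking ownership of them.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let owned = String::from("Hello World");
    let word = first_word(&owned); // a &String can be used as a &str

    let numbers = [1, 2, 3, 4, 5];
    let middle = &numbers[1..4];   // refers to the elements 2, 3, 4

    // Prints: Hello 9
    println!("{} {}", word, sum(middle));
} // owned and numbers are cleaned up here; the slices owned nothing
```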

What's Next?

Hopefully this gives you a better understanding of how memory works in Rust! You're now ready for part 3, where we'll start digging into how we can define our own types. We'll start seeing some more ways that Rust acts like Haskell!