Collection Types and Lifetimes

In the fourth part of our series, we discussed how to use Cargo to create and manage Rust projects. Now we're going to finish up our discussion of Rust. We'll look at a couple of common container types: vectors and hash maps. Rust has some interesting ideas for how to handle common operations on them, as we'll see. We'll also touch on the topic of lifetimes, another memory management concept that comes up in trickier scenarios.

For a good summary of the main principles of Rust, take a look at our Rust Video Tutorial. It will give you a concise overview of the topics we've covered in our series and more!

Vectors

Vectors, like lists or arrays in Haskell, contain many elements of the same type. It's important to note that Rust vectors store their items like arrays: all the elements sit in one contiguous block of memory. We refer to them with the parameterized type Vec. There are a couple of ways to initialize them:

let v1: Vec<i32> = vec![1, 2, 3, 4];
let v2: Vec<u32> = Vec::new();

The first vector uses the vec! macro to initialize, so it will have four elements. The second will have no elements. As written, that second vector won't be much use to us: since it's immutable, it will never contain anything! We can change this by declaring it with mut. Then we can use simple operations like push and pop to manipulate it.

let mut v2: Vec<u32> = Vec::new();
v2.push(5);
v2.push(6);
v2.push(7);
v2.push(8);
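let x = v2.pop();

println!("{:?}", v2);
if let Some(last) = x {
  println!("{} was popped!", last);
}

Note that pop removes from the back of the vector and returns an Option, since the vector might be empty. That's why we unpack the value with if let before printing it. The printed vector will have 5, 6, and 7, and the second print statement will report that 8 was popped.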

There are a couple of different ways of accessing vectors by index. The first is traditional bracket syntax, like we would find in C++. This will panic and crash the program if the index is out of bounds!

let v1: Vec<i32> = vec![1, 2, 3, 4];

let first: i32 = v1[0];
let second: i32 = v1[1];

You can also use the get function. This returns the Option type we discussed in a previous part. This allows us to handle the error gracefully instead of crashing. In the example below, we print a failure message, rather than crashing as we would with v1[5].

let v1: Vec<i32> = vec![1, 2, 3, 4];

match v1.get(5) {
  Some(elem) => println!("Found an element: {}!", elem),
  None => println!("No element!"),
}
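
When a sensible default exists, the Option from get can also be collapsed without a full match. Here is a minimal sketch of that idea. The helper name element_or_default is hypothetical and only serves this illustration.

```rust
// Hypothetical helper: return the element at `index`, or `default`
// when the index is out of bounds.
fn element_or_default(v: &[i32], index: usize, default: i32) -> i32 {
    // `get` returns Option<&i32>; `copied` turns that into Option<i32>,
    // and `unwrap_or` supplies the fallback for the None case.
    v.get(index).copied().unwrap_or(default)
}

fn main() {
    let v1: Vec<i32> = vec![1, 2, 3, 4];

    // In bounds: we get the stored element.
    println!("{}", element_or_default(&v1, 1, -1));
    // Out of bounds: we get the default instead of a panic.
    println!("{}", element_or_default(&v1, 5, -1));
}
```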

Another neat trick we can do with vectors is loop through them. This loop will add 2 to each of the integers in our vector. It does this by using the * operator to dereference the element, like in C++. We must loop over a mutable reference to the vector to update it, which also means the vector itself has to be declared with mut.

let mut v1: Vec<i32> = vec![1, 2, 3, 4];

for i in &mut v1 {
  *i += 2;
}
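
For comparison, here is a self-contained sketch that puts an updating loop next to a read-only one. The function name add_two is hypothetical; the point is only that updating needs a mutable borrow, while reading gets by with a shared one.

```rust
// Hypothetical helper: add 2 to every element in place.
fn add_two(v: &mut Vec<i32>) {
    // iter_mut yields a mutable reference to each element.
    for i in v.iter_mut() {
        *i += 2;
    }
}

fn main() {
    let mut v1: Vec<i32> = vec![1, 2, 3, 4];
    add_two(&mut v1);

    // Reading only needs a shared reference to the vector.
    for i in &v1 {
        println!("{}", i);
    }
}
```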

Ownership with Vectors

Everything works fine with the examples above because we're only using primitive numbers, which get copied rather than moved. But if we use non-primitive types, we need to remember to apply the rules of ownership. In general, a vector should own its contents. So when we push an element into a vector, we give up ownership. The following code will not compile because s1 is invalid after pushing!

let s1 = String::from("Hello");
let mut v1: Vec<String> = Vec::new();
v1.push(s1);
println!("{}", s1); // << Doesn't work!

Likewise, we can't move a String out of the vector by indexing. Indexing can only give us a reference to it. This will also cause problems:

let s1 = String::from("Hello");
let mut v1: Vec<String> = Vec::new();
v1.push(s1);

// Bad!
let s2: String = v1[0];

We can fix these easily though, by adding ampersands to make it clear we want a reference:

let s1 = String::from("Hello");
let mut v1: Vec<String> = Vec::new();
v1.push(s1);

let s2: &String = &v1[0];
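
If we really do need an owned String rather than a reference, the standard library offers a couple of routes: clone the element, or take it out of the vector so ownership moves to us. A minimal sketch follows; the helper names clone_at and take_last are hypothetical.

```rust
// Hypothetical helper: copy out the element at `index`.
// The vector keeps its own element.
fn clone_at(v: &[String], index: usize) -> Option<String> {
    v.get(index).cloned()
}

// Hypothetical helper: move the last element out of the vector.
// Afterwards the vector no longer contains it.
fn take_last(v: &mut Vec<String>) -> Option<String> {
    v.pop()
}

fn main() {
    let mut v1: Vec<String> = Vec::new();
    v1.push(String::from("Hello"));
    v1.push(String::from("World"));

    let copy = clone_at(&v1, 0);
    let moved = take_last(&mut v1);

    println!("{:?} {:?}", copy, moved);
    println!("{:?}", v1);
}
```

Cloning costs an extra allocation, while pop changes the vector, so which one fits depends on whether the vector still needs the element.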

But if we take a reference to an item in the vector, we can't keep using that reference after we update the vector again.

let mut v1: Vec<String> = Vec::new();
v1.push(String::from("Hello"));

let s2: &String = &v1[0];
v1.push(String::from("World"));

// Bad! s2 is invalidated by pushing to v1!
println!("{}", s2);

This can be very confusing. But again, the reason lies with the way memory works. The extra push might re-allocate the entire vector, which would invalidate the memory s2 points to. So the compiler refuses to let us modify the vector while s2 is still in use.
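
There are a couple of ways around this. A rough sketch, assuming we are happy either to copy the string or to reorder the code: clone the element before pushing, or take the reference only after the last update. The helper name first_then_push is hypothetical.

```rust
// Hypothetical helper: grab an owned copy of the first element,
// then push a new element onto the vector.
fn first_then_push(v: &mut Vec<String>, extra: String) -> Option<String> {
    // `cloned` gives us an owned String, so the borrow of `v` ends here.
    let first = v.first().cloned();
    v.push(extra);
    first
}

fn main() {
    let mut v1: Vec<String> = Vec::new();
    v1.push(String::from("Hello"));

    // Option 1: hold an owned copy, which is unaffected by the push.
    let s2 = first_then_push(&mut v1, String::from("World"));
    println!("{:?}", s2);

    // Option 2: only take the reference after the last update.
    let s3: &String = &v1[0];
    println!("{}", s3);
}
```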

Hash Maps

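Now let's discuss hash maps. At a basic level, these work much like their Haskell counterparts. They have two type parameters, a key type and a value type. They live in the standard library's collections module, so we need to import them first. Then we can initialize them and insert elements:

use std::collections::HashMap;

let mut phone_numbers: HashMap<String, String> = HashMap::new();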
phone_numbers.insert(
    String::from("Johnny"),
    String::from("555-123-4567"));
phone_numbers.insert(
    String::from("Katie"),
    String::from("555-987-6543"));

As with vectors, hash maps take ownership of their elements, both keys and values. We access elements in hash maps with the get function. As with vectors, this returns an Option:

match phone_numbers.get("Johnny") {
  Some(number) => println!("{}", number),
  None => println!("No number"),
}

We can also loop through the elements of a hash map. We get a tuple, rather than individual elements:

for (name, number) in &phone_numbers {
  println!("{}: {}", name, number);
}
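
One thing to keep in mind is that a HashMap makes no promise about the order in which this loop visits its entries. If we want stable output, one option is to sort the keys first. The sketch below assumes that approach; the helper name sorted_names is hypothetical.

```rust
use std::collections::HashMap;

// Hypothetical helper: collect references to the keys and sort them.
fn sorted_names(phone_numbers: &HashMap<String, String>) -> Vec<&String> {
    let mut names: Vec<&String> = phone_numbers.keys().collect();
    names.sort();
    names
}

fn main() {
    let mut phone_numbers: HashMap<String, String> = HashMap::new();
    phone_numbers.insert(String::from("Katie"), String::from("555-987-6543"));
    phone_numbers.insert(String::from("Johnny"), String::from("555-123-4567"));

    // Visit the entries in alphabetical order of the names.
    for name in sorted_names(&phone_numbers) {
        println!("{}: {}", name, phone_numbers[name]);
    }
}
```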

Updating hash maps is interesting because of the entry function. This allows us to insert a new value for a key, but only if that key doesn't exist yet. We apply or_insert on the result of entry. In this example, we'll maintain the previous phone number for "Johnny" but add a new one for "Nicholas".

phone_numbers.entry(String::from("Johnny"))
    .or_insert(String::from("555-111-1111"));
phone_numbers.entry(String::from("Nicholas"))
    .or_insert(String::from("555-111-1111"));
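
Since or_insert hands back a mutable reference to the value in the map, the same pattern is handy for counters. As a small sketch, here is a word count built on entry. The function name count_words is hypothetical and not part of the phone number example.

```rust
use std::collections::HashMap;

// Hypothetical helper: count how often each word occurs in `text`.
fn count_words(text: &str) -> HashMap<String, u32> {
    let mut counts: HashMap<String, u32> = HashMap::new();
    for word in text.split_whitespace() {
        // or_insert returns a mutable reference to the stored value,
        // whether it was already present or has just been inserted.
        let count = counts.entry(word.to_string()).or_insert(0);
        *count += 1;
    }
    counts
}

fn main() {
    let counts = count_words("hello world hello");
    println!("{:?}", counts.get("hello"));
    println!("{:?}", counts.get("world"));
}
```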

If we want to overwrite the value for a key though, we can use insert. After this example, both keys will use the new phone number.

phone_numbers.insert(
    String::from("Johnny"),
    String::from("555-111-1111"));
phone_numbers.insert(
    String::from("Nicholas"),
    String::from("555-111-1111"));
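
It's also worth knowing that insert returns an Option holding the value it replaced, if there was one. A minimal sketch of that behavior, using a hypothetical helper named replace_number:

```rust
use std::collections::HashMap;

// Hypothetical helper: set the number for `name` and return the
// previous number, if one was stored.
fn replace_number(
    phone_numbers: &mut HashMap<String, String>,
    name: &str,
    number: &str,
) -> Option<String> {
    phone_numbers.insert(name.to_string(), number.to_string())
}

fn main() {
    let mut phone_numbers: HashMap<String, String> = HashMap::new();

    // First insert for this key: there is no previous value.
    let old = replace_number(&mut phone_numbers, "Johnny", "555-123-4567");
    println!("{:?}", old);

    // Second insert overwrites and hands back the previous value.
    let old = replace_number(&mut phone_numbers, "Johnny", "555-111-1111");
    println!("{:?}", old);
}
```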

Lifetimes

There's one more concept we should cover before putting Rust away. This is the idea of lifetimes. Ownership rules get even trickier as your programs get more complicated. Consider this simple function, returning the longer of two strings:

fn longest_string(s1: &String, s2: &String) -> &String {
  if s1.len() > s2.len() {
    s1
  } else {
    s2
  }
}

This seems innocuous enough, but it won't compile! The reason is that Rust doesn't know at compile time which string will get returned. This prevents it from working out how long the returned reference stays valid. Consider this invocation of the function:

fn main() {
  let s1 = String::from("A long Hello");
  let result;

  {
    let s2 = String::from("Hello");
    result = longest_string(&s1, &s2);
  }

  println!("{}", result);
}

With this particular set of parameters, things would work out. Since s1 is longer, result would get that reference. And when we print it at the end, s1 is still in scope. But if we flip the strings, then result would refer to s2, which is no longer in scope!

But the longest_string function doesn't know about the scopes of its inputs. And it doesn't know which value gets returned. So it complains at compile time. We can fix this by specifying the lifetimes of the inputs. Here's how we do that:

fn longest_string<'a>(s1: &'a String, s2: &'a String) -> &'a String {
  if s1.len() > s2.len() {
    s1
  } else {
    s2
  }
}

The lifetime annotation 'a is now a generic parameter of the function. Each of the types in that line should read "a reference with lifetime 'a to a String". Both inputs have the same lifetime. Rust takes this to be the shortest lifetime the two references have in common, and the signature states that the return value must have this same (shortest) lifetime.

When we add this specification, our longest_string function compiles. But the main function we have above will not, since it violates the lifetime rules we gave it! By moving the print statement into the block, we can fix it:

fn main() {
  let s1 = String::from("A long Hello");
  let result;

  {
    let s2 = String::from("Hello");
    result = longest_string(&s1, &s2);
    println!("{}", result);
  }
}

The shortest common lifetime is the time inside the block. And we don't use the result of the function outside the block. So everything works now!
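
The annotations only need to tie together the references that can actually flow into the result. As a sketch of that point, suppose a function always returns its first argument. The function keep_first below is hypothetical and not part of the example above. Only s1 shares the lifetime 'a with the return value, so printing outside the block is accepted.

```rust
// Hypothetical variant: the result can only ever come from `s1`,
// so only `s1` is tied to the lifetime of the return value.
fn keep_first<'a>(s1: &'a String, s2: &String) -> &'a String {
    println!("Ignoring {}", s2);
    s1
}

fn main() {
    let s1 = String::from("A long Hello");
    let result;

    {
        let s2 = String::from("Hello");
        result = keep_first(&s1, &s2);
    }

    // Fine: `result` borrows only from s1, which is still in scope.
    println!("{}", result);
}
```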

It's a little difficult to keep all these rules straight. Luckily, Rust finds all these problems at compile time! So we won't shoot ourselves in the foot and end up with difficult problems to debug.

Conclusion

That's all for our Rust series! Rust is an interesting language with a lot of potential uses. For more detailed coverage, watch our Rust Video Tutorial. You can also read the Rust Book, which has lots of great examples and covers all the important concepts!