diff --git a/src/doc/guide.md b/src/doc/guide.md index ecde51538e2c7..83438714fbe5b 100644 --- a/src/doc/guide.md +++ b/src/doc/guide.md @@ -4049,9 +4049,343 @@ Doing this is not particularly common, but every once in a while, it's useful. That's all you need to get the hang of closures! Closures are a little bit strange at first, but once you're used to using them, you'll miss them in any language that doesn't have them. Passing functions to other functions is -incredibly powerful. Next, let's look at one of those things: iterators. +incredibly powerful. Next, let's look at one of those things: iterators. -# iterators +# Iterators + +Let's talk about loops. + +Remember Rust's `for` loop? Here's an example: + +```{rust} +for x in range(0i, 10i) { + println!("{:d}", x); +} +``` + +Now that you know more Rust, we can talk in detail about how this works. The +`range` function returns an **iterator**. An iterator is something that we can +call the `.next()` method on repeatedly, and it gives us a sequence of things. + +Like this: + +```{rust} +let mut range = range(0i, 10i); + +loop { + match range.next() { + Some(x) => { + println!("{}", x); + } + None => { break } + } +} +``` + +We make a mutable binding to the return value of `range`, which is our iterator. +We then `loop`, with an inner `match`. This `match` is used on the result of +`range.next()`, which gives us a reference to the next value of the iterator. +`next` returns an `Option`, in this case, which will be `Some(int)` when +we have a value and `None` once we run out. If we get `Some(int)`, we print it +out, and if we get `None`, we `break` out of the loop. + +This code sample is basically the same as our `for` loop version. The `for` +loop is just a handy way to write this `loop`/`match`/`break` construct. + +`for` loops aren't the only thing that uses iterators, however. Writing your +own iterator involves implementing the `Iterator` trait. While doing that is +outside of the scope of this guide, Rust provides a number of useful iterators +to accomplish various tasks. Before we talk about those, we should talk about a +Rust anti-pattern. And that's `range`. + +Yes, we just talked about how `range` is cool. But `range` is also very +primitive. For example, if you needed to iterate over the contents of +a vector, you may be tempted to write this: + +```{rust} +let nums = vec![1i, 2i, 3i]; + +for i in range(0u, nums.len()) { + println!("{}", nums[i]); +} +``` + +This is strictly worse than using an actual iterator. The `.iter()` method on +vectors returns an iterator which iterates through a reference to each element +of the vector in turn. So write this: + +```{rust} +let nums = vec![1i, 2i, 3i]; + +for num in nums.iter() { + println!("{}", num); +} +``` + +There are two reasons for this. First, this is more semantic. We iterate +through the entire vector, rather than iterating through indexes, and then +indexing the vector. Second, this version is more efficient: the first version +will have extra bounds checking because it used indexing, `nums[i]`. But since +we yield a reference to each element of the vector in turn with the iterator, +there's no bounds checking in the second example. This is very common with +iterators: we can ignore unnecessary bounds checks, but still know that we're +safe. + +There's another detail here that's not 100% clear because of how `println!` +works. `num` is actually of type `&int`, that is, it's a reference to an `int`, +not an `int` itself. `println!` handles the dereferencing for us, so we don't +see it. This code works fine too: + +```{rust} +let nums = vec![1i, 2i, 3i]; + +for num in nums.iter() { + println!("{}", *num); +} +``` + +Now we're explicitly dereferencing `num`. Why does `iter()` give us references? +Well, if it gave us the data itself, we would have to be its owner, which would +involve making a copy of the data and giving us the copy. With references, +we're just borrowing a reference to the data, and so it's just passing +a reference, without needing to do the copy. + +So, now that we've established that `range` is often not what you want, let's +talk about what you do want instead. + +There are three broad classes of things that are relevant here: iterators, +**iterator adapters**, and **consumers**. Here's some definitions: + +* 'iterators' give you a sequence of values. +* 'iterator adapters' operate on an iterator, producing a new iterator with a + different output sequence. +* 'consumers' operate on an iterator, producing some final set of values. + +Let's talk about consumers first, since you've already seen an iterator, +`range`. + +## Consumers + +A 'consumer' operates on an iterator, returning some kind of value or values. +The most common consumer is `collect()`. This code doesn't quite compile, +but it shows the intention: + +```{rust,ignore} +let one_to_one_hundred = range(0i, 100i).collect(); +``` + +As you can see, we call `collect()` on our iterator. `collect()` takes +as many values as the iterator will give it, and returns a collection +of the results. So why won't this compile? Rust can't determine what +type of things you want to collect, and so you need to let it know. +Here's the version that does compile: + +```{rust} +let one_to_one_hundred = range(0i, 100i).collect::>(); +``` + +If you remember, the `::<>` syntax allows us to give a type hint, +and so we tell it that we want a vector of integers. + +`collect()` is the most common consumer, but there are others too. `find()` +is one: + +```{rust} +let one_to_one_hundred = range(0i, 100i); + +let greater_than_forty_two = range(0i, 100i) + .find(|x| *x >= 42); + +match greater_than_forty_two { + Some(_) => println!("We got some numbers!"), + None => println!("No numbers found :("), +} +``` + +`find` takes a closure, and works on a reference to each element of an +iterator. This closure returns `true` if the element is the element we're +looking for, and `false` otherwise. Because we might not find a matching +element, `find` returns an `Option` rather than the element itself. + +Another important consumer is `fold`. Here's what it looks like: + +```{rust} +let sum = range(1i, 100i) + .fold(0i, |sum, x| sum + x); +``` + +`fold()` is a consumer that looks like this: +`fold(base, |accumulator, element| ...)`. It takes two arguments: the first +is an element called the "base". The second is a closure that itself takes two +arguments: the first is called the "accumulator," and the second is an +"element." Upon each iteration, the closure is called, and the result is the +value of the accumulator on the next iteration. On the first iteration, the +base is the value of the accumulator. + +Okay, that's a bit confusing. Let's examine the values of all of these things +in this iterator: + +| base | accumulator | element | closure result | +|------|-------------|---------|----------------| +| 0i | 0i | 1i | 1i | +| 0i | 1i | 2i | 3i | +| 0i | 3i | 3i | 6i | + +We called `fold()` with these arguments: + +```{rust} +# range(1i, 5i) +.fold(0i, |sum, x| sum + x); +``` + +So, `0i` is our base, `sum` is our accumulator, and `x` is our element. On the +first iteration, we set `sum` to `0i`, and `x` is the first element of `nums`, +`1i`. We then add `sum` and `x`, which gives us `0i + 1i = 1i`. On the second +iteration, that value becomes our accumulator, `sum`, and the element is +the second element of the array, `2i`. `1i + 2i = 3i`, and so that becomes +the value of the accumulator for the last iteration. On that iteration, +`x` is the last element, `3i`, and `3i + 3i = 6i`, which is our final +result for our sum. `1 + 2 + 3 = 6`, and that's the result we got. + +Whew. `fold` can be a bit strange the first few times you see it, but once it +clicks, you can use it all over the place. Any time you have a list of things, +and you want a single result, `fold` is appropriate. + +Consumers are important due to one additional property of iterators we haven't +talked about yet: laziness. Let's talk some more about iterators, and you'll +see why consumers matter. + +## Iterators + +As we've said before, an iterator is something that we can call the `.next()` +method on repeatedly, and it gives us a sequence of things. Because you need +to call the method, this means that iterators are **lazy**. This code, for +example, does not actually generate the numbers `1-100`, and just creates a +value that represents the sequence: + +```{rust} +let nums = range(1i, 100i); +``` + +Since we didn't do anything with the range, it didn't generate the sequence. +Once we add the consumer: + +```{rust} +let nums = range(1i, 100i).collect::>(); +``` + +Now, `collect()` will require that `range()` give it some numbers, and so +it will do the work of generating the sequence. + +`range` is one of two basic iterators that you'll see. The other is `iter()`, +which you've used before. `iter()` can turn a vector into a simple iterator +that gives you each element in turn: + +```{rust} +let nums = [1i, 2i, 3i]; + +for num in nums.iter() { + println!("{}", num); +} +``` + +These two basic iterators should serve you well. There are some more +advanced iterators, including ones that are infinite. Like `count`: + +```{rust} +std::iter::count(1i, 5i); +``` + +This iterator counts up from one, adding five each time. It will give +you a new integer every time, forever. Well, technically, until the +maximum number that an `int` can represent. But since iterators are lazy, +that's okay! You probably don't want to use `collect()` on it, though... + +That's enough about iterators. Iterator adapters are the last concept +we need to talk about with regards to iterators. Let's get to it! + +## Iterator adapters + +"Iterator adapters" take an iterator and modify it somehow, producing +a new iterator. The simplest one is called `map`: + +```{rust,ignore} +range(1i, 100i).map(|x| x + 1i); +``` + +`map` is called upon another iterator, and produces a new iterator where each +element reference has the closure it's been given as an argument called on it. +So this would give us the numbers from `2-101`. Well, almost! If you +compile the example, you'll get a warning: + +```{notrust,ignore} +2:37 warning: unused result which must be used: iterator adaptors are lazy and + do nothing unless consumed, #[warn(unused_must_use)] on by default + range(1i, 100i).map(|x| x + 1i); + ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +``` + +Laziness strikes again! That closure will never execute. This example +doesn't print any numbers: + +```{rust,ignore} +range(1i, 100i).map(|x| println!("{}", x)); +``` + +If you are trying to execute a closure on an iterator for its side effects, +just use `for` instead. + +There are tons of interesting iterator adapters. `take(n)` will get the +first `n` items out of an iterator, and return them as a list. Let's +try it out with our infinite iterator from before, `count()`: + +```{rust} +for i in std::iter::count(1i, 5i).take(5) { + println!("{}", i); +} +``` + +This will print + +```{notrust,ignore} +1 +6 +11 +16 +21 +``` + +`filter()` is an adapter that takes a closure as an argument. This closure +returns `true` or `false`. The new iterator `filter()` produces returns +only the elements that that closure returned `true` for: + +```{rust} +for i in range(1i, 100i).filter(|x| x % 2 == 0) { + println!("{}", i); +} +``` + +This will print all of the even numbers between one and a hundred. + +You can chain all three things together: start with an iterator, adapt it +a few times, and then consume the result. Check it out: + +```{rust} +range(1i, 1000i) + .filter(|x| x % 2 == 0) + .filter(|x| x % 3 == 0) + .take(5) + .collect::>(); +``` + +This will give you a vector containing `6`, `12`, `18`, `24`, and `30`. + +This is just a small taste of what iterators, iterator adapters, and consumers +can help you with. There are a number of really useful iterators, and you can +write your own as well. Iterators provide a safe, efficient way to manipulate +all kinds of lists. They're a little unusual at first, but if you play with +them, you'll get hooked. For a full list of the different iterators and +consumers, check out the [iterator module documentation](std/iter/index.html). # Generics