Python informer

Improve your Python coding skills

Iterators vs iterables

This article is part of a series on functional programming.

If you have studied Python even for a short while, you will probably have come across the terms iterator, iterable and sequence. They are often used almost interchangeably, but they are all slightly different things.

Iterators

An iterator is very simple. It is any object that has a __next__ method. Each time you call the __next__ method, it will return a value.

For example, the module itertools contains a function count that returns an iterator. The particular iterator you get back from count actually provides a stream of incrementing values 0, 1, 2…

import itertools

it = itertools.count()
print(it.__next__())     # 0
print(it.__next__())     # 1
print(it.__next__())     # 2...

Python often uses double underscores around a method name to indicate that it is a special method that is important to Python. It makes it less likely that you will accidentally override it with one of your own methods. However, if you find the code above a bit ugly, you can use the built-in next function to make it a bit tidier. All next actually does is call __next__ on the object you pass in. Here is the code using next; it does exactly the same thing:

import itertools

it = itertools.count()
print(next(it))          # 0
print(next(it))          # 1
print(next(it))          # 2...
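
To make the definition concrete, here is a minimal sketch of a hand-written iterator. The class name Squares is hypothetical and chosen only for illustration; the point is simply that any object with a __next__ method works with the built-in next function.

```python
class Squares:
    """Hypothetical iterator that yields 0, 1, 4, 9, ... without end."""

    def __init__(self):
        self.n = 0

    def __next__(self):
        # Compute the current value, then advance the internal counter.
        value = self.n * self.n
        self.n += 1
        return value


sq = Squares()
first_three = [next(sq), next(sq), next(sq)]
print(first_three)       # [0, 1, 4]
```

Like itertools.count, this iterator never runs out of values, so it never signals that it has finished.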

Iterables

An iterable is an object that you can iterate over. A list is an example of an iterable (so are strings, tuples and range objects).

Technically, an iterable is an object that has an __iter__ method. The __iter__ method returns an iterator. The iterator can be used to get the items in the list, one by one, using the __next__ method.

You rarely need to worry about these details. The most common way to iterate over an iterable is in a for loop:

k = [10, 20, 30, 40]

for x in k:
    print(x)

Here, k is the iterable. The for loop reads the values from the iterable, one at a time, and executes the loop for each value.

for loops under the hood

Now we will take a look at what a for loop actually does. First, we need to get the iterator from the iterable. As we saw above, you use the __iter__ method to do this, but a less ugly alternative is to use the iter function (which just calls the __iter__ method):

k = [10, 20, 30, 40]
it = iter(k)

Now you can use the next function to read the values of the original list via the iterator, one by one:

print(next(it))     # 10
print(next(it))     # 20
print(next(it))     # 30
print(next(it))     # 40
print(next(it))     # raises StopIteration

Each time you call next on the iterator it, it fetches the next value from the iterable (the list k). When you reach the end of the iterable, calling next will raise a StopIteration exception. This tells Python that the iterable k has no more values left.

You might wonder why Python raises an exception, rather than providing a function you can call to check if you are at the end of the list. Well, in some cases it isn’t possible to know whether you are at the end of the sequence until you actually try to calculate the next value (we will see an example later). Since our iterator doesn’t calculate the next value until it is asked to, an exception is the best option.
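
As one possible illustration of an iterator that cannot know it has finished until it tries to fetch a value, the sketch below uses the two-argument form of the built-in iter, which keeps calling a function until it returns a sentinel value. The in-memory stream and its contents are assumptions made up for this example.

```python
import io

# A hypothetical in-memory "file" standing in for any stream of unknown length.
stream = io.StringIO("alpha\nbeta\n")

# iter(callable, sentinel) calls stream.readline until it returns "".
lines = iter(stream.readline, "")

collected = []
while True:
    try:
        # The only way to find out whether another line exists is to ask for it.
        collected.append(next(lines).rstrip("\n"))
    except StopIteration:
        break

print(collected)         # ['alpha', 'beta']
```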

Don’t worry, you will rarely write code like this; you will almost always use a for loop to do the work. In summary, here is what a for loop does when you set it running on an iterable:

  • Uses iter to get the iterator for the iterable
  • Calls next repeatedly on the iterator, executing the loop each time
  • Catches the StopIteration exception and ends the loop
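
The three steps above can be sketched by hand as a while loop. This is only an approximation for illustration, not how Python implements for internally; the list seen is an extra name added here so the result can be inspected.

```python
k = [10, 20, 30, 40]
seen = []

it = iter(k)                  # Step 1: get the iterator for the iterable
while True:
    try:
        x = next(it)          # Step 2: fetch the next value
    except StopIteration:     # Step 3: no values left, so end the loop
        break
    # Loop body: runs once per value, just like the body of a for loop.
    seen.append(x)
    print(x)
```

This prints the same four values as the for loop shown earlier.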

Looping over an iterator

If you recall, an iterator has a __next__ method, and an iterable has an __iter__ method.

However, most iterators are also iterables. That is, most iterators don’t just have a __next__ method, they have an __iter__ method too!

That means you can call iter on an iterator to find its … iterator. Since it is already an iterator, it just returns itself!

This might seem a bit odd, but it is actually very useful. It allows you to use a for loop with either an iterable or an iterator.
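
A short sketch of this behaviour, using a list iterator as the example: calling iter on the iterator returns the same object, and a for loop over a partly used iterator picks up where it left off. The variable names first and rest are chosen only for illustration.

```python
k = [10, 20, 30, 40]
it = iter(k)

same_object = iter(it) is it
print(same_object)       # True

first = next(it)         # consume the first value by hand

rest = []
for x in it:             # the loop continues from the second value
    rest.append(x)

print(first, rest)       # 10 [20, 30, 40]
```

Note that the iterator is used up once the loop finishes, whereas the list k can be looped over again as many times as you like.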

Sequences

A sequence is a type of iterable that also provides random access to elements. A sequence has some extra methods, for example __getitem__ and __len__, and mutable sequences such as lists also have __setitem__. Python uses these low level methods to provide various language features, for example:

k = [10, 20, 30, 40]   # A list is a sequence
x = k[1]               # uses __getitem__
len(k)                 # uses __len__
k[2] = 0               # uses __setitem__
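
The same methods can be written by hand. The class below is a hypothetical, deliberately minimal read-only sequence; it leaves out negative indexes, slicing and the other conveniences that a real list provides.

```python
class Evens:
    """Hypothetical read-only sequence of the first n even numbers."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        # Only indexes 0 .. n-1 are supported in this sketch.
        if not 0 <= index < self.n:
            raise IndexError(index)
        return 2 * index


evens = Evens(4)
print(len(evens))        # 4
print(evens[2])          # 4
print(list(evens))       # [0, 2, 4, 6]
```

The last line works even though Evens has no __iter__ method, because iter falls back to calling __getitem__ with 0, 1, 2 and so on until it gets an IndexError.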

For more information see the Python documentation at python.org

This diagram shows how iterables, iterators and sequences are related. They are all, ultimately, iterables, and any iterable can create an iterator using the iter function.

Generators

You don’t normally need to implement iterators at this low level; you can usually use a generator.
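
As a brief sketch of what that looks like, the hypothetical generator function below produces the values 0, 1, 2 up to a limit. Calling it returns a generator object, which is an iterator: it supports next, and iter returns the object itself.

```python
def count_up_to(limit):
    """Hypothetical generator that yields 0, 1, ..., limit - 1."""
    n = 0
    while n < limit:
        yield n          # hand back one value, then pause until asked again
        n += 1


gen = count_up_to(3)
is_own_iterator = iter(gen) is gen
print(is_own_iterator)   # True

first = next(gen)
remaining = list(gen)
print(first, remaining)  # 0 [1, 2]
```

Python raises StopIteration for you when the function body finishes, so there is no need to write __next__ by hand.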

See also

If you found this article useful you might be interested in my ebook Functional Programming in Python.