Python informer

Improve your Python coding skills

Iterators

This article is part of a series on functional programming.

Programmers often need to deal with lists of values, and most languages provide data structures for this purpose (in Python the type list is usually used).

However, in functional programming we don’t usually deal with lists directly. Instead we use an iterator that fetches values from a list or other sequence. This has two advantages:

  • An iterator can read from a list but it can’t modify the list. So if you pass an iterator into a function, you can be sure that the function will not change the list. This is important because in functional programming we prefer to use functions that don’t have any side effects (and modifying a list passed into the function is a side effect).
  • Iterators support lazy evaluation, which we will explore later in this article.

Iterables

A Python lists, tuples and strings are all types of iterable object. This means that you can iterate over the object, reading the values in order.

Iterators and iterables are slightly different things, as you will see in the next article in this series.

The most common way to iterate over an iterable is in a for loop:

k = [10, 20, 30, 40]

for x in k:
    print(x)

The for loop reads the values from the iterable, one at a time, and executes the loop for each value.

Lazy evaluation

Now we will use a slightly more complex example to illustrate what is meant by lazy evaluation. We will use the built-in map function that accepts a function and an iterable. It return an iterator that applies the function to each item in the input iterable.

def print_all(it):
    for x in it:
        print(x)

def square(x):
    return x*x

range_it = range(10)
map_it = map(square, range_it)
print_all(map_it)

Here we define function square that calculates the square of x.

We call the range function to create a sequence of values 0 … 9. range returns an iterable that will produce those values when they are requested.

The map function applies square to the values in range_it. The result of this is an iterator map_it. The iterator will produce the sequence 1, 4, 9, 16 … 81 (the squares of values in the original list).

We then use the previously defined function, print_all to print the values. It will print 1, 4, 9, 16 … 81.

Although it is simple enough to understand what the code will do, the order of execution is quite interesting:

1. When we call the range function, it simply returns an iterable. It doesn’t create the values yet.

2. When we call the map function, it simply returns an iterator. It doesn’t do any calculations. The square function doesn’t get called at this stage.

3. The print_all function accepts the map_it iterator and loops over its values.

Each time through the for loop in print_all, the following happens:

  • The for loop requests the next value from the map_it iterator.
  • The map_it iterator requests the next value from the range_it iterator.
  • The map_it iterator applies the square function to the value, and returns the result.
  • The print_all function prints the result.
  • Loop back until finished…

What the code has done is to set up a chain of linked iterators from the print_all for loop right back to the range function. This means that each number in the range isn’t created until it is needed. The complete list of values is never stored in memory, and the first value is printed almost immediately.

Lazy evaluation and large sequences

To test this, let’s make the range much bigger:

range_it = range(10**1000)

10 to the power of 1000 is a truly huge number - a one followed by 1000 zeros. Your computer can’t store a list that big (it is larger than the number of atoms in the universe), and it would never be able to fill the list with numbers (it would almost certainlytake longer than the predicted lifetime of the universe, however fast computers might get).

So if Python actually tried to create this range as a list, the program would never get past this step.

But in fact, all the range function does is create an iterable that promises to create all those numbers in the future. And that takes almost no time at all.

Try running the program with this new range. It will start printing values immediately. Just don’t expect it to finish any time soon!

See also

If you found this article useful you might be interested in my ebook Functional Programming in Python.