Using zip in a for loop

By Martin McBride, 2022-09-18
Tags: zip zip_longest for loop
Categories: python language intermediate python


We sometimes need to loop over two different sequences at the same time. For example, if we had two lists:

colors = ["red", "green", "blue"]
shapes = ["circle", "square", "triangle"]

and we wanted to print these lists side by side:

red circle
green square
blue triangle

We could do this using a loop counter, but a better way is to use the zip function, as we will see.

Using a loop counter

We can just loop three times and use the counter to index the lists:

for i in range(len(colors)):
    print(colors[i], shapes[i])

This works, and produces the required result, but is not generally considered to be Pythonic code. It is usually best to avoid loop counters wherever possible.

The solution above has several problems. The code is more complex than it needs to be. It only works with sequence types (such as lists, tuples or strings), not with lazy iterators, which support neither len nor indexing. And finally, it fails with an IndexError if shapes is shorter than colors, and silently ignores the extra items if shapes is longer.
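The last two problems are easy to reproduce. The following is a minimal sketch, not part of the original example: lazy_shapes is a hypothetical generator standing in for any lazy iterator, and short_shapes is a deliberately shortened list.

```python
colors = ["red", "green", "blue"]


def lazy_shapes():
    # Hypothetical generator, used here only to illustrate a lazy iterator
    yield from ["circle", "square", "triangle"]


# Problem 1: a generator has no length, so range(len(...)) cannot be used
try:
    len(lazy_shapes())
    lazy_error = None
except TypeError as e:
    lazy_error = type(e).__name__

# Problem 2: the counter runs past the end of the shorter list
short_shapes = ["circle", "square"]
try:
    for i in range(len(colors)):
        print(colors[i], short_shapes[i])
    index_error = None
except IndexError as e:
    index_error = type(e).__name__

print(lazy_error, index_error)
```

The loop prints the first two pairs and then stops with an IndexError on the third pass.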

Fortunately, Python provides a better tool for the job: the zip function. This allows us to access 2 or more sequences within a Pythonic for loop.

A better solution - using zip

The zip function can be used to loop over 2 (or more) sequences at the same time. It is used like this:

for c, s in zip(colors, shapes):
    print(c, s)

In this code, on each pass through the loop the c variable steps through the colors one by one, and the s variable steps through the shapes. So it prints the same output as before:

red circle
green square
blue triangle

But the code is simpler and more declarative. It says exactly what it is doing, with no extra loop counters as distractions.

How zip works

We can see how zip works with the following code:

for t in zip(colors, shapes):
    print(t)

This creates the following output:

('red', 'circle')
('green', 'square')
('blue', 'triangle')

The zip function is named after the behaviour of a clothes zipper. In code, zip means joining two sequences, element by element. So the two lists:

["red",
 "green",
 "blue"]

["circle",
 "square",
 "triangle"]

become a sequence of tuples as shown above.
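It is worth noting that, in Python 3, zip does not build all the tuples up front. It returns an iterator that creates each tuple on demand. Here is a minimal sketch using the same two lists (the names pairs, first and rest are only for illustration):

```python
colors = ["red", "green", "blue"]
shapes = ["circle", "square", "triangle"]

pairs = zip(colors, shapes)  # a lazy zip object, nothing is paired yet
first = next(pairs)          # pulls the first tuple from the iterator
rest = list(pairs)           # collects whatever tuples remain

print(first)
print(rest)
```

Because the iterator is consumed as it is read, rest contains only the two tuples that follow the first one. If we need to use the pairs more than once, we can store them with list(zip(colors, shapes)).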

Instead of using a tuple t, we can unpack the tuple into separate variables, c and s:

for t in zip(colors, shapes):
    c, s = t
    print(c, s)

Finally, we can move the unpacking step to be part of the for statement, which gives us the final code:

for c, s in zip(colors, shapes):
    print(c, s)

This diagram illustrates the process:

Zipping more than 2 sequences

We can zip 3 (or more) sequences. In this example we have introduced a sequence count:

count = [2, 10, 5]
for c, s, n in zip(colors, shapes, count):
    print(c, s, n)

We have added an extra list, count, that contains some numbers. We would now like to print a list of all three sequences, side by side.

Fortunately, the zip function can accept any number of arguments, so we can simply add count as an extra argument. This means that zip will now produce a series of tuples where each tuple contains 3 values - a color, a shape, and a count.

When we unpack these tuples, we must provide 3 variables to receive the values. The number of variables must always equal the number of arguments passed into the zip function.

The output of this code will be:

red circle 2
green square 10
blue triangle 5
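The rule about matching the number of variables to the number of arguments can be checked with a short sketch. It assumes the same three lists as above. The names error_name, rest and rows are only for illustration.

```python
colors = ["red", "green", "blue"]
shapes = ["circle", "square", "triangle"]
count = [2, 10, 5]

# Two names cannot receive three values, so the unpacking fails
try:
    for c, s in zip(colors, shapes, count):
        print(c, s)
    error_name = None
except ValueError as e:
    error_name = type(e).__name__

# A starred name collects the leftover values into a list
rows = [(c, rest) for c, *rest in zip(colors, shapes, count)]

print(error_name)
print(rows[0])
```

The first loop raises a ValueError before printing anything. The starred form is one way to relax the rule when we only care about the first value of each tuple.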

Zipping sequences with unequal lengths

What would happen if we supplied the zip function with a set of sequences that didn't have equal lengths? Let's see:

count = [2, 10, 5, 7]
for c, n in zip(colors, count):
    print(c, n)

In this case, colors has 3 values, as usual, but we have supplied a count list with 4 values. Here is the output:

red 2
green 10
blue 5

zip only outputs 3 pairs. The output of zip is controlled by the length of the shortest input sequence, which in this case is colors with a length of 3. Only 3 pairs are created, and the extra element of count is silently ignored.

Using zip_longest

If we wanted to include all the elements of count, we could use the zip_longest function:

from itertools import zip_longest

count = [2, 10, 5, 7]
for c, n in zip_longest(colors, count):
    print(c, n)

Notice that zip_longest isn't a built-in function. It is part of the itertools module, so we need to import it before we can use it.

The output of zip_longest is controlled by the length of the longest input sequence, which in this case is count with a length of 4. So 4 values are created. Since colors only contains 3 values, a value of None is used in place of the missing value.

red 2
green 10
blue 5
None 7
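If None isn't a suitable placeholder, zip_longest accepts a fillvalue argument that is used instead. A minimal sketch, where the placeholder string "unknown" and the name rows are arbitrary choices for illustration:

```python
from itertools import zip_longest

colors = ["red", "green", "blue"]
count = [2, 10, 5, 7]

# fillvalue replaces the default None for inputs that have run out
rows = list(zip_longest(colors, count, fillvalue="unknown"))
for c, n in rows:
    print(c, n)
```

The final row now pairs "unknown" with 7 rather than None with 7.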

Unzipping

While we are looking at zip, it is worth mentioning that we can also unzip a set of values. Here is how we do it:

zipped = [("red", "circle"), ("green", "square"), ("blue", "triangle")]
colors2, shapes2 = zip(*zipped)
print(colors2, shapes2)

zipped contains our data. To unzip the data, we use the zip function, but we pass *zipped into it. The output is:

('red', 'green', 'blue') ('circle', 'square', 'triangle')

How does this work? Well, the asterisk operator * unpacks a sequence and passes each value into the function as a separate argument. This means that:

zip(*zipped)

is equivalent to passing three separate arguments into zip:

zip(("red", "circle"), ("green", "square"), ("blue", "triangle"))

If we zip these values together, it creates two tuples. The first contains the first value of each input argument (i.e. the colours). The second contains the second value of each input argument (the shapes). This effectively unzips the data.
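Since unzipping produces tuples, we may want to convert the results to lists. The sketch below does that, and also checks that zipping the results again gives back the original data. The name roundtrip is only for illustration.

```python
zipped = [("red", "circle"), ("green", "square"), ("blue", "triangle")]

# Unzip, converting each resulting tuple into a list
colors2, shapes2 = (list(t) for t in zip(*zipped))

# Zipping the two lists again recovers the original pairs
roundtrip = list(zip(colors2, shapes2))

print(colors2, shapes2)
print(roundtrip == zipped)
```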

Other useful techniques

List comprehensions are another useful way to process iterables, especially if you need the result in the form of a list.
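For example, zip can be combined with a list comprehension to build a list of labels from the colors and shapes used earlier. This is a minimal sketch, and the name labels is only for illustration.

```python
colors = ["red", "green", "blue"]
shapes = ["circle", "square", "triangle"]

# Unpack each pair inside the comprehension and join the two values
labels = [f"{c} {s}" for c, s in zip(colors, shapes)]

print(labels)
```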

The itertools module contains several useful variants of zip and similar functions.
