Numpy introduction


Martin McBride, 2017-05-12

This article is part of a series on numpy.

Numpy is a Python package that allows you to efficiently store and process large arrays of numerical data. Obvious examples of this type of data are sound data and image data, but numpy can also be used anywhere you have large data sets to process.

Part of the attraction of numpy is that it uses simple and familiar Python syntax to perform complex operations on arrays, which simplifies your code. The other benefit is that numpy is highly efficient, both in terms of speed and memory usage. These two factors are not unrelated - numpy provides high level array operations, and these operations are efficient because, under the hood, the entire processing loop is written in C.

In this tutorial, we will take a quick tour of numpy arrays.

Before you start, you will need to install numpy. The official numpy.org site will point you at the latest version, with instructions for installing the package.

Import numpy

First, of course, you must import the numpy package. It is common practice to import numpy as np, so that you can use the short name np in your code. You don't have to, but most people who use numpy do, and they will recognise the np prefix.

>>> import numpy as np

Creating numpy arrays

There are a number of ways to create numpy arrays. We will just look at a couple of methods here.

You can create an array of zeros using the zeros function, supplying the required array length:

>>> a = np.zeros(5)
>>> print(a)
[ 0.  0.  0.  0.  0.]

>>> m = np.zeros((3, 4))
>>> print(m)
[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]

As you can see, we can also create a 2 dimensional array by passing in a tuple such as (3, 4) to specify the number of rows and columns. You can create a 3 dimensional array by passing in a tuple with 3 values, and so on. You can have as many dimensions as you like.
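
As a rough sketch of the 3 dimensional case, the code below passes a tuple with 3 values to zeros. The ndim, shape and size attributes are not covered elsewhere in this article. They are used here only to confirm the layout of the array:

```python
import numpy as np

# A 3 dimensional array: 2 blocks, each with 3 rows and 4 columns
c = np.zeros((2, 3, 4))

print(c.ndim)   # number of dimensions: 3
print(c.shape)  # size of each dimension: (2, 3, 4)
print(c.size)   # total number of elements: 24
```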

You can also initialise an array from the values in a list, using the array function:

>>> a = np.array([2, 4, 6, 8])
>>> print(a)
[2 4 6 8]

A multidimensional list will create a multidimensional numpy array:

>>> m = np.array([[1, 2], [3, 4], [5, 6]])
>>> print(m)
[[1 2]
 [3 4]
 [5 6]]
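
To check how the nested list maps onto rows and columns, here is a small sketch that inspects the same array. It assumes the shape attribute and the m[row, column] indexing form, neither of which is introduced in the text above:

```python
import numpy as np

# Each inner list becomes one row of the array
m = np.array([[1, 2], [3, 4], [5, 6]])

print(m.shape)   # (3, 2): 3 rows, 2 columns
print(m[2, 1])   # row 2, column 1 (both counted from 0): 6
```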

Vectorised operators

When you apply arithmetic operations to numpy arrays, they are automatically applied to each element individually. This is called vectorisation. Here is a simple example:

>>> x = np.array([1, 3, 5, 7])
>>> y = np.array([0, 1, 2, 3])
>>> z = x * y
>>> print(z)
[ 0  3 10 21]

Each element of z is calculated by multiplying together the corresponding elements of x and y:

  • x[0] is 1, y[0] is 0, so z[0] is 0
  • x[1] is 3, y[1] is 1, so z[1] is 3
  • x[2] is 5, y[2] is 2, so z[2] is 10, etc

This makes your code a lot neater, and it is usually faster too. The implicit loop is performed in numpy's native C code, which generally runs much more quickly than an equivalent Python for loop.
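
To make the implicit loop concrete, the sketch below computes the same product twice, once with the vectorised operator and once with an explicit Python loop. The name z_loop and the use of array_equal to compare the results are choices made for this illustration only:

```python
import numpy as np

x = np.array([1, 3, 5, 7])
y = np.array([0, 1, 2, 3])

# Vectorised version: one expression, the loop runs inside numpy
z = x * y

# Equivalent explicit loop, filling a result array one element at a time
z_loop = np.zeros(4, dtype=x.dtype)
for i in range(len(x)):
    z_loop[i] = x[i] * y[i]

print(z)                          # [ 0  3 10 21]
print(np.array_equal(z, z_loop))  # True
```

Both versions give the same values, but the loop version needs more code and does its work element by element in Python.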

Universal functions

Numpy has its own versions of common maths functions like sin, cos, exp etc, that are applied to each element individually. For example:

>>> a = np.array([1, 4, 9, 16])
>>> b = np.sqrt(a)
>>> print(b)
[1. 2. 3. 4.]

This code applies the square root function to all the elements in a, and creates a new numpy array with the results. As with vectorised operators, the implicit loop is performed very efficiently.
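
Here is a short sketch using two of the other functions mentioned above, sin and exp. The input values are arbitrary, chosen so that the results are easy to recognise. Note that sqrt applied to an integer array returns a floating point array, which is why the output above is printed with decimal points:

```python
import numpy as np

# sin applied to every element: roughly 0, 1, 0
angles = np.array([0.0, np.pi / 2, np.pi])
s = np.sin(angles)

# exp applied to every element: 1 and e
e = np.exp(np.array([0.0, 1.0]))

# sqrt of an integer array produces a float array
b = np.sqrt(np.array([1, 4, 9, 16]))

print(np.round(s, 6))
print(e)
print(b.dtype)   # float64
```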

Slices

You can slice numpy arrays, just like lists. You can also slice multidimensional arrays. For example, this code inserts a 2 row by 4 column array into the middle two rows of a 4 by 4 array:

>>> a = np.zeros((4, 4))
>>> b = np.array([[1., 2., 3., 4.], [5., 6., 7., 8.]])
>>> a[1:3] = b
>>> print(a)
[[ 0.  0.  0.  0.]
 [ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 0.  0.  0.  0.]]

You can also insert a 4 row by 2 column array into the middle two columns of a 4 by 4 array:

>>> a = np.zeros((4, 4))
>>> b = np.array([[1., 2.], [3., 4.], [5., 6.], [7., 8.]])
>>> a[:,1:3] = b
>>> print(a)
[[ 0.  1.  2.  0.]
 [ 0.  3.  4.  0.]
 [ 0.  5.  6.  0.]
 [ 0.  7.  8.  0.]]

You can slice in more than one dimension, and copy a slice of one array into a slice of another. This code copies the middle two rows of b (4 elements in total) into the bottom right corner of a:

>>> a = np.zeros((4, 4))
>>> b = np.array([[1., 2.], [3., 4.], [5., 6.], [7., 8.]])
>>> a[2:4, 2:4] = b[1:3]
>>> print(a)
[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  3.  4.]
 [ 0.  0.  5.  6.]]
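
The examples above all assign into a slice. The same slice notation can be used to read values out. The sketch below uses a 4 by 4 array of distinct values (chosen only for illustration) so that it is easy to see which elements each slice selects:

```python
import numpy as np

a = np.array([[ 0,  1,  2,  3],
              [ 4,  5,  6,  7],
              [ 8,  9, 10, 11],
              [12, 13, 14, 15]])

rows = a[1:3]           # middle two rows, all columns
cols = a[:, 1:3]        # all rows, middle two columns
corner = a[2:4, 2:4]    # bottom right 2 by 2 block

print(rows.shape)   # (2, 4)
print(cols.shape)   # (4, 2)
print(corner)
```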

Data types

Numpy uses homogeneous arrays (all the elements must be the same type). This is different to a Python list, where different elements of the same list can have different types.

By default, when you create an array with the zeros function, it will contain floating point values. You can choose a different type by using the dtype parameter. The similar ones function, which fills the array with 1 rather than 0, works in the same way. For example, this creates an array of 16 bit integer values:

>>> a = np.ones(4, dtype=np.int16)
>>> print(a)
[1 1 1 1]

If you create an array using the array function, the data type will depend on the types in the source list. If the source list is all integers, the numpy array will contain ints. If the list is floats, the array will contain floats. If the list is a mixture, the array will contain all floats, with the integer values converted to float. Once again, you can use the dtype parameter to override this.
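
These rules can be checked with the dtype attribute of the resulting array. The sketch below is illustrative, and the variable names are arbitrary. The exact integer type chosen by default depends on your platform and numpy version, so it is not shown as a fixed value here:

```python
import numpy as np

ints = np.array([1, 2, 3])          # all integers
floats = np.array([1.0, 2.5])       # all floats
mixed = np.array([1, 2.5, 3])       # integers converted to float
forced = np.array([1, 2, 3], dtype=np.float32)  # type overridden

print(ints.dtype)    # an integer type, typically int64
print(floats.dtype)  # float64
print(mixed.dtype)   # float64
print(forced.dtype)  # float32
```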

Arrays filled with a value range

You can use the arange function to fill an array with a range of values:

>>> a = np.arange(5)
>>> print(a)
[0 1 2 3 4]

This function can be used with optional start and step arguments, just like the standard range function. But you can also use float values:

>>> a = np.arange(1.0, 3.0, .3)
>>> print(a)
[ 1.   1.3  1.6  1.9  2.2  2.5  2.8]
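
For comparison with the standard range function, here is a small sketch using integer start, stop and step values. As with range, the stop value itself is not included in the result:

```python
import numpy as np

# start=2, stop=10 (not included), step=3
a = np.arange(2, 10, 3)
print(a)                        # [2 5 8]
print(list(range(2, 10, 3)))    # [2, 5, 8]

# A negative step counts down
d = np.arange(5, 0, -1)
print(d)                        # [5 4 3 2 1]
```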

An alternative function, linspace, allows you to specify the exact start and end values, and the exact number of elements, and it will calculate the increment between the values:

>>> a = np.linspace(1.0, 3.0, 5)
>>> print(a)
[ 1.   1.5  2.   2.5  3. ]
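
If you want to see the increment that linspace calculated, one option is the retstep parameter, which is not otherwise covered in this article. With retstep=True the function returns the array together with the step size, as this sketch shows:

```python
import numpy as np

# Returns the array and the spacing between neighbouring values
a, step = np.linspace(1.0, 3.0, 5, retstep=True)

print(len(a))        # 5
print(step)          # 0.5
print(a[0], a[-1])   # 1.0 3.0
```

Unlike arange, the end value is included in the result by default.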

This has just been a quick introduction to numpy arrays. You can learn more by following the more detailed articles in the rest of this tutorial, or by visiting the numpy.org site.


Copyright (c) Axlesoft Ltd 2020