Data types

Martin McBride, 2019-09-14
Tags data types efficiency
Categories numpy

This article is part of a series on numpy. If you find this article useful you might like our Numpy Recipes e-book.

numpy supports five main data types - ints, unsigned ints, floats, complex numbers, and booleans.


Integers

Integers in Python can represent positive or negative numbers of any size. That is because Python integers are objects, and the implementation automatically grabs more memory if necessary to store very large values.

Integers in numpy are very different. An integer occupies a fixed number of bytes. For example, the type np.int32 occupies exactly 4 bytes of memory (a byte contains 8 bits, so 4 bytes is 32 bits, hence int32). These are called primitive types because they aren't objects; they are just data bytes stored directly in memory.
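As a small illustration of the difference, the sketch below compares the size of Python integer objects with the fixed size of np.int32. The variable names are only for illustration, and the exact sizes reported by sys.getsizeof depend on the Python build.

```python
import sys
import numpy as np

# A Python int is an object whose size grows with the value it holds.
small = 1
huge = 2 ** 200
print(sys.getsizeof(small), sys.getsizeof(huge))  # exact sizes vary by Python build

# A numpy int32 always occupies exactly 4 bytes per element.
int32_size = np.dtype(np.int32).itemsize
print(int32_size)  # 4

a = np.array([1, 2, 3], dtype=np.int32)
print(a.itemsize, a.nbytes)  # 4 bytes per element, 12 bytes in total
```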

The reasons for using primitive types are explained in detail in the article on numpy efficiency. In summary:

  • An array of primitive types takes a lot less memory than a list of Python integer objects.
  • Accessing primitive values is faster.
  • Primitive types don't require garbage collection.
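The memory point can be checked roughly as follows. This is only an estimate: it adds the size of the list object to the size of each integer object it refers to, and the exact figures depend on the Python build.

```python
import sys
import numpy as np

n = 1000
values = list(range(n))
arr = np.arange(n, dtype=np.int32)

# Rough size of the list: the list object itself plus every int object it refers to.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# Size of the array's data buffer: n elements of 4 bytes each.
array_bytes = arr.nbytes

print(list_bytes, array_bytes)
```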

In fact, numpy provides several different integer sizes:

| Type     | Bytes | Range |
|----------|-------|-------|
| np.int8  | 1     | -128 to 127 |
| np.int16 | 2     | -32768 to 32767 |
| np.int32 | 4     | -2147483648 to 2147483647 |
| np.int64 | 8     | -9223372036854775808 to 9223372036854775807 |
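The sizes and ranges in the table can be read back from numpy itself using np.iinfo. A minimal sketch (the ranges dictionary is just a convenient way to hold the results):

```python
import numpy as np

ranges = {}
for t in (np.int8, np.int16, np.int32, np.int64):
    info = np.iinfo(t)  # limits of the integer type
    ranges[np.dtype(t).name] = (np.dtype(t).itemsize, info.min, info.max)

for name, (size, lowest, highest) in ranges.items():
    print(name, size, lowest, highest)
```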

There are a couple of reasons for providing several sizes. The first is fairly obvious: if you are using data that has a limited range, there is no point using more memory than you need. For example, sound data is often stored using 16 bits per sample (i.e. the sound is represented by an array of 16 bit values). Storing this data as 64 bit integers would make no sense, as you would be using 4 times as much memory for no reason.
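To put numbers on the sound example, here is a sketch that holds one second of audio as 16 bit and as 64 bit integers. The sample rate of 44100 samples per second is an assumption made for illustration.

```python
import numpy as np

sample_rate = 44100  # assumed rate, samples per second

samples_16 = np.zeros(sample_rate, dtype=np.int16)  # one second of silence
samples_64 = samples_16.astype(np.int64)            # same values, wider type

print(samples_16.nbytes)  # 88200
print(samples_64.nbytes)  # 352800

ratio = samples_64.nbytes // samples_16.nbytes
print(ratio)  # 4
```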

The second reason is slightly less obvious. Some applications use a mix of Python and C code for efficiency. With numpy, it is possible to pass a pointer to the array data into a C function, so that the C code can access the data in memory without the need to make a copy of it. This can improve efficiency when dealing with very large arrays. For this to work, the data needs to be stored in the format the C code is expecting. So if the C code is expecting an array of 16 bit integers, it is useful to be able to specify that in numpy. We won't be covering that in these tutorials, as it is quite specialised.

Unsigned integers

Unsigned integers are similar to normal integers, but they can only hold non-negative values (zero or positive). Here are the available types:

| Type      | Bytes | Range |
|-----------|-------|-------|
| np.uint8  | 1     | 0 to 255 |
| np.uint16 | 2     | 0 to 65535 |
| np.uint32 | 4     | 0 to 4294967295 |
| np.uint64 | 8     | 0 to 18446744073709551615 |

Unsigned integers are useful for data that can never be negative, for example population data. The population of a town can never be less than zero.

The advantage of unsigned data is that it can represent larger positive numbers than signed data. An int8 goes up to 127, but a uint8 goes up to 255.
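A short sketch of both points, using np.iinfo to compare the limits and a made-up set of town populations stored as unsigned values:

```python
import numpy as np

int8_max = np.iinfo(np.int8).max    # largest signed 8 bit value
uint8_max = np.iinfo(np.uint8).max  # largest unsigned 8 bit value
print(int8_max, uint8_max)  # 127 255

# Hypothetical town populations; none of them can be negative.
populations = np.array([0, 1500, 72000, 4000000], dtype=np.uint32)
print(populations.dtype, populations.max())
```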


Floats

numpy floating point numbers also have different sizes (usually called precisions). The two main types are:

| Type       | Bytes | Range | Precision |
|------------|-------|-------|-----------|
| np.float32 | 4     | ±1.18×10⁻³⁸ to ±3.4×10³⁸ | 7 to 8 decimal digits |
| np.float64 | 8     | ±2.23×10⁻³⁰⁸ to ±1.80×10³⁰⁸ | 15 to 16 decimal digits |

float64 numbers store floating point numbers in the same way as a Python float value. They are sometimes called double precision.

float32 numbers take half as much storage as float64, but they have considerably smaller range and precision. They are sometimes called single precision.
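The limits of each type can be inspected with np.finfo. The sketch below also shows the practical effect of the lower precision: adding a very small number to 1.0 is lost in float32 but kept in float64. The variable names are only for illustration.

```python
import numpy as np

for t in (np.float32, np.float64):
    info = np.finfo(t)
    # tiny: smallest positive normal value, max: largest value,
    # eps: gap between 1.0 and the next representable number
    print(np.dtype(t).name, info.tiny, info.max, info.eps)

# 1e-8 is smaller than the float32 gap around 1.0, so the sum rounds back to 1.0.
one32 = np.float32(1.0)
lost_in_float32 = (one32 + np.float32(1e-8)) == one32

# float64 has enough precision to tell the two values apart.
one64 = np.float64(1.0)
kept_in_float64 = (one64 + np.float64(1e-8)) != one64

print(lost_in_float32, kept_in_float64)  # True True
```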

Complex numbers

A complex number consists of two floating point numbers, one representing the real part and one representing the imaginary part. If you have not met complex numbers before, Wikipedia has an article on them.

| Type          | Bytes | Precision |
|---------------|-------|-----------|
| np.complex64  | 8     | Two 32-bit floats |
| np.complex128 | 16    | Two 64-bit floats |

complex128 is equivalent to the Python complex type.
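A brief sketch showing how the two parts are stored. The array values are arbitrary examples.

```python
import numpy as np

z = np.array([1 + 2j, 3 - 4j], dtype=np.complex64)
print(z.itemsize)    # 8 bytes per element: two 32 bit floats
print(z.real.dtype)  # float32
print(z.imag)        # the imaginary parts as a float32 array

w = np.array([1 + 2j], dtype=np.complex128)
print(w.itemsize)    # 16 bytes per element: two 64 bit floats

# Converting an element to a native Python value gives a Python complex.
python_value = w[0].item()
print(type(python_value))
```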


Booleans

numpy supports boolean values with the type np.bool_ (also available as np.bool in numpy 2.0 and later). A bool is one byte in size, with 0 representing false, and any non-zero value representing true.
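For example, converting an integer array to booleans maps 0 to False and every other value to True. This sketch uses the name np.bool_, which should work across numpy versions.

```python
import numpy as np

values = np.array([0, 1, 5, -2])
flags = values.astype(np.bool_)  # 0 becomes False, anything else True

print(flags)           # [False  True  True  True]
print(flags.itemsize)  # 1
```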

Setting the data type

All of the functions available for creating numpy arrays have an optional parameter dtype that allows you to specify the data type (such as np.uint8 or np.float64). For example:

a = np.zeros((2, 3), dtype=np.int32)

This creates an array of zeros with 2 rows and 3 columns and data type int32:

[[0 0 0]
 [0 0 0]]
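The same parameter works with other creation functions. A small sketch with a few of them (the choice of functions and values here is just for illustration):

```python
import numpy as np

a = np.zeros((2, 3), dtype=np.int32)
b = np.ones(4, dtype=np.uint8)
c = np.arange(5, dtype=np.float32)
d = np.array([1, 2, 3], dtype=np.complex128)

for arr in (a, b, c, d):
    print(arr.dtype, arr)

# If dtype is omitted, zeros uses float64.
e = np.zeros((2, 3))
print(e.dtype)  # float64
```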

System dependent types

numpy also provides a number of types that don't specify a particular size. These include np.byte, np.short, and np.long, amongst others. There are also unsigned versions np.ubyte, np.ushort etc.

These types have system dependent sizes. For example, one of these types might be equivalent to np.int32 on one system and np.int64 on another. It depends on the type of processor, the type of operating system, and perhaps the version of the operating system.

In general, don't use these types. They are provided for situations where numpy is passing data in memory to a library written in C. For historical reasons, C has always had system dependent types like int and short whose exact size can vary between systems. If you were interfacing to such a library you would need to use compatible types. Unless you are using a library that specifically tells you to use these types, don't use them. Stick to the fixed-size types shown above instead.
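If you are curious what these types map to on your own machine, the sketch below prints the fixed-size equivalent of a few of them. It uses np.byte, np.short, np.intc, np.int_ and np.longlong because, as far as we know, np.long is not available in every numpy release. The output will differ between systems.

```python
import numpy as np

sizes = {}
for name in ("byte", "short", "intc", "int_", "longlong"):
    dt = np.dtype(getattr(np, name))
    sizes[name] = dt.itemsize
    # dt.name is the fixed-size type this maps to on the current system
    print(name, dt.name, dt.itemsize)
```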

Data ordering

Some functions (such as zeros used above) allow you to select an order for the data. The choices are C-style or Fortran-style ordering (sometimes a couple of other variants too). Again, these options are intended for use if you are passing data in memory to a library written in C (or even Fortran). Unless you have good reason to change it, just use the default option.
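For the curious, here is a sketch of what the order option changes. Both arrays hold the same values, but the strides (the number of bytes to step to reach the next row or column) show that the data is laid out differently in memory.

```python
import numpy as np

c_arr = np.zeros((2, 3), dtype=np.int32)             # default, C-style (row by row)
f_arr = np.zeros((2, 3), dtype=np.int32, order='F')  # Fortran-style (column by column)

print(c_arr.flags['C_CONTIGUOUS'], f_arr.flags['F_CONTIGUOUS'])  # True True

print(c_arr.strides)  # (12, 4): next row is 12 bytes on, next column 4 bytes on
print(f_arr.strides)  # (4, 8): next row is 4 bytes on, next column 8 bytes on
```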


Copyright (c) Axlesoft Ltd 2020