# Data types

Martin McBride, 2019-09-14
Tags data types efficiency
Categories numpy

numpy supports five main data types - ints, unsigned ints, floats, complex numbers, and booleans.

## Integers

Integers in Python can represent positive or negative numbers of any size. That is because Python integers are objects, and the implementation automatically grabs more memory if necessary to store very large values.

Integers in numpy are very different. An integer occupies a fixed number of bytes. For example, the type `np.int32` occupies exactly 4 bytes of memory (a byte contains 8 bits, so 4 bytes is 32 bits, hence `int32`). These are called primitive types because they aren't objects; they are just data bytes stored directly in memory.
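
The difference can be seen directly. This is a minimal sketch (the variable names are illustrative only) comparing an unbounded Python integer with the fixed storage size of a numpy integer type:

```python
import numpy as np

# A Python integer grows as needed, so very large values are fine.
big = 2 ** 100
print(big)  # 1267650600228229401496703205376

# A numpy int32 always occupies exactly 4 bytes (32 bits).
int32_bytes = np.dtype(np.int32).itemsize
print(int32_bytes)      # 4
print(int32_bytes * 8)  # 32
```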

The reasons for using primitive types are explained in detail in the article on numpy efficiency. In summary:

• An array of primitive types takes a lot less memory than a list of Python integer objects.
• Accessing primitive values is faster.
• Primitive types don't require garbage collection.
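
The memory saving can be estimated with a rough sketch like the one below. It assumes that `sys.getsizeof` gives a fair approximation of the size of each Python object; the exact list figure varies between Python versions and builds, so no value is shown for it.

```python
import sys
import numpy as np

n = 1000

# numpy array: n values stored back to back, 4 bytes each.
arr = np.arange(n, dtype=np.int32)
array_bytes = arr.nbytes

# Python list: the list holds n references, and each integer
# is a separate object with its own overhead.
lst = list(range(n))
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)

print(array_bytes)  # 4000
print(list_bytes)   # considerably larger; exact value depends on the Python build
```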

In fact, numpy provides several different integer sizes:

| Type | Bytes | Range |
|---|---|---|
| np.int8 | 1 | -128 to 127 |
| np.int16 | 2 | -32768 to 32767 |
| np.int32 | 4 | -2147483648 to 2147483647 |
| np.int64 | 8 | -9223372036854775808 to 9223372036854775807 |
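
These sizes and limits do not need to be memorised. As a short sketch, `np.iinfo` reports the range of any integer type (the `ranges` dictionary is just an illustrative way of collecting the results):

```python
import numpy as np

ranges = {}
for t in (np.int8, np.int16, np.int32, np.int64):
    info = np.iinfo(t)
    ranges[t.__name__] = (info.min, info.max)
    # Type name, size in bytes, smallest value, largest value.
    print(t.__name__, np.dtype(t).itemsize, info.min, info.max)
```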

There are a couple of reasons for this. The first is fairly obvious: if you are using data that has a limited range, there is no point using more memory than you need. For example, sound data is often stored using 16 bits per sample (i.e. the sound is represented by an array of 16-bit values). Storing this data as 64-bit integers would make no sense, because you would be using 4 times as much memory for no reason.
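
As a rough illustration of the sound example, the sketch below compares one second of samples stored at the two sizes. The sample rate of 44100 samples per second is an assumption made for the example, not something specified above.

```python
import numpy as np

sample_rate = 44100  # samples per second (assumed for this example)

# One second of silence stored as 16-bit and as 64-bit integers.
samples16 = np.zeros(sample_rate, dtype=np.int16)
samples64 = samples16.astype(np.int64)

print(samples16.nbytes)                      # 88200
print(samples64.nbytes)                      # 352800
print(samples64.nbytes // samples16.nbytes)  # 4
```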

The second reason is slightly less obvious. Some applications use a mix of Python and C code for efficiency. With numpy, it is possible to pass a pointer to the array data into a C function, so that the C code can access the data in memory without the need to make a copy of it. This can improve efficiency when dealing with very large arrays. For this to work, the data needs to be stored in the format the C code is expecting. So if the C code is expecting an array of 16-bit integers, it is useful to be able to specify that in numpy. We won't be covering that in these tutorials, as it is quite specialised.

## Unsigned integers

Unsigned integers are similar to normal integers, but they can only hold non-negative values (zero or greater). Here are the available types:

| Type | Bytes | Range |
|---|---|---|
| np.uint8 | 1 | 0 to 255 |
| np.uint16 | 2 | 0 to 65535 |
| np.uint32 | 4 | 0 to 4294967295 |
| np.uint64 | 8 | 0 to 18446744073709551615 |

Unsigned integers are useful for data that can never be negative, for example population data. The population of a town can never be less than zero.

The advantage of unsigned data is that it can represent larger positive numbers than signed data. An `int8` goes up to 127, but a `uint8` goes up to 255.
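
The sketch below illustrates the extra range. It also shows what happens when a value that fits in a `uint8` is forced into an `int8` using `astype`: no error is raised, and the value wraps around.

```python
import numpy as np

print(np.iinfo(np.int8).max)   # 127
print(np.iinfo(np.uint8).max)  # 255

# 200 fits in a uint8 ...
u = np.array([200], dtype=np.uint8)
print(u)  # [200]

# ... but not in an int8. Forcing the conversion with astype
# does not raise an error; the value wraps around instead.
s = u.astype(np.int8)
print(s)  # [-56]
```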

## Floats

numpy floating point numbers also have different sizes (usually called precisions). There are two main types:

| Type | Bytes | Range | Precision |
|---|---|---|---|
| np.float32 | 4 | ±1.18×10⁻³⁸ to ±3.4×10³⁸ | 7 to 8 decimal digits |
| np.float64 | 8 | ±2.23×10⁻³⁰⁸ to ±1.80×10³⁰⁸ | 15 to 16 decimal digits |

`float64` numbers store floating point numbers in the same way as a Python `float` value. They are sometimes called double precision.

`float32` numbers take half as much storage as `float64`, but they have considerably smaller range and precision. They are sometimes called single precision.
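
The difference in precision is easy to demonstrate. In this sketch, adding a very small number to 1 is lost entirely in `float32` but preserved in `float64`; `np.finfo` is used to look up the largest value of each type.

```python
import numpy as np

# Python's float and np.float64 use the same 64-bit format.
print(np.dtype(float) == np.dtype(np.float64))  # True

# Largest representable values.
print(np.finfo(np.float32).max)  # about 3.4e38
print(np.finfo(np.float64).max)  # about 1.8e308

# Adding 1e-8 to 1 is lost in float32 but kept in float64.
small32 = np.float32(1.0) + np.float32(1e-8)
small64 = np.float64(1.0) + np.float64(1e-8)
print(small32 == np.float32(1.0))  # True
print(small64 == np.float64(1.0))  # False
```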

## Complex numbers

A complex number consists of two floating point numbers, one representing the real part and one representing the imaginary part. If you have not met complex numbers before, the Wikipedia article on complex numbers is a good place to start.

| Type | Bytes | Precision |
|---|---|---|
| np.complex64 | 8 | Two 32-bit floats |
| np.complex128 | 16 | Two 64-bit floats |

`complex128` is equivalent to the Python `complex` type.
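
As a brief sketch, the real and imaginary parts of a complex array can be inspected separately, which confirms that each element is stored as a pair of floats:

```python
import numpy as np

z = np.array([1 + 2j, 3 - 4j], dtype=np.complex64)

print(z.itemsize)    # 8 (two 4-byte floats per element)
print(z.real)        # [1. 3.]
print(z.imag)        # [ 2. -4.]
print(z.real.dtype)  # float32

# Python's complex type maps to complex128.
print(np.dtype(complex) == np.dtype(np.complex128))  # True
```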

## Booleans

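numpy supports boolean values with the type `np.bool_` (also available as `np.bool` in NumPy 2.0 and later; in NumPy 1.24 to 1.26 the name `np.bool` does not exist). A `bool_` is one byte in size, with 0 representing false, and any non-zero value representing true.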
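
The sketch below converts an integer array to booleans. It uses the name `np.bool_`, which should be available across numpy versions:

```python
import numpy as np

values = np.array([0, 1, 5, -2])

# Zero becomes False, every non-zero value becomes True.
flags = values.astype(np.bool_)

print(flags)           # [False  True  True  True]
print(flags.itemsize)  # 1
```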

## Setting the data type

All of the functions available for creating numpy arrays have an optional parameter `dtype` that allows you to specify the data type (such as `np.uint8` or `np.float64`). For example:

```python
import numpy as np
a = np.zeros((2, 3), dtype=np.int32)
```

This creates an array of zeros with 2 rows and 3 columns and data type `int32`. Printing `a` gives:

```
[[0 0 0]
 [0 0 0]]
```
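
The same parameter works with other creation functions. As a short sketch (the array names are arbitrary), the resulting type can be checked using the `dtype` attribute of each array:

```python
import numpy as np

b = np.ones(4, dtype=np.float32)
c = np.array([1, 2, 3], dtype=np.uint8)
d = np.arange(5, dtype=np.int16)

print(b.dtype)  # float32
print(c.dtype)  # uint8
print(d.dtype)  # int16
```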

## System dependent types

numpy also provides a number of types that don't specify a particular size. These include `np.byte`, `np.short`, `np.intc`, `np.int_`, and `np.longlong`, amongst others. There are also unsigned versions `np.ubyte`, `np.ushort` etc. (Older versions of numpy also accepted `np.int`, but that alias was removed in NumPy 1.24.)

These types have system dependent sizes. For example `np.int_` might be equivalent to `np.int32` or `np.int64` depending on the system it is running on. It depends on the type of processor, the type of operating system, and perhaps the version of the operating system.

In general, don't use these types. They are provided for situations where numpy is passing data in memory to a library written in C. For historical reasons, C has always had system dependent types like `int` and `short` whose exact size can vary between systems. If you were interfacing to such a library you would need to use compatible types. Unless you are using any libraries that specifically tell you to use these types, don't use them. Stick to the fixed-size types shown above instead.
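
If you are curious what sizes these types have on your own machine, a sketch like the following prints them. The names `np.intc`, `np.int_` and `np.longlong` are used here because they should exist across numpy versions. No expected output is shown, because the results vary between systems.

```python
import numpy as np

sizes = []
for t in (np.byte, np.short, np.intc, np.int_, np.longlong):
    size = np.dtype(t).itemsize
    sizes.append(size)
    # Prints the fixed-size type each name maps to, and its size in bytes.
    print(t.__name__, size)
```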

## Data ordering

Some functions (such as `zeros` used above) allow you to select an `order` for the data. The choices are C-style or Fortran-style ordering (sometimes a couple of other variants too). Again, these options are intended for use if you are passing data in memory to a library written in C (or even Fortran). Unless you have good reason to change it, just use the default option.
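
For the curious, this sketch shows that the two orderings hold the same values but lay them out differently in memory. The `flags` and `strides` attributes are used only to inspect the layout:

```python
import numpy as np

c_arr = np.zeros((2, 3), dtype=np.int32)             # default, order='C'
f_arr = np.zeros((2, 3), dtype=np.int32, order='F')  # Fortran-style

print(c_arr.flags['C_CONTIGUOUS'])  # True
print(f_arr.flags['F_CONTIGUOUS'])  # True

# Strides give the number of bytes to step along each axis.
print(c_arr.strides)  # (12, 4)
print(f_arr.strides)  # (4, 8)
```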
