Objects and identity

By Martin McBride, 2018-03-25


You will probably have heard Python tutorials mention objects, but what exactly is an object?

In Python, an object is anything that you can assign to a variable. Well-known types of objects are lists, tuples and strings:

a = [1, 2, 3]
b = ['a', 'b', 'c', 'd']
c = [1, 2, 3]
d = (10, 20, 30)
e = 'xyz'

This code creates 5 different objects, and assigns each one to a variable. Every object has:

a type (the first three objects above are lists, the next is a tuple and the final one is a string)
an identity (there are 5 distinct objects in the code above)
usually some data (the content of the list, etc)
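
As a minimal sketch of these three properties, the code below compares two of the lists from the example. It relies on the type function described in the next section, plus the == operator (which compares data) and the is operator (which compares identity).

```python
a = [1, 2, 3]
c = [1, 2, 3]
d = (10, 20, 30)

same_type = type(a) is type(c)   # both are lists
same_data = a == c               # both hold the values 1, 2, 3
same_object = a is c             # two separate list objects
print(same_type, same_data, same_object)   # True True False

# The tuple has a different type from the lists
print(type(d) is type(a))        # False
```

So a and c have the same type and the same data, but they are still two distinct objects.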

Type

Every object has a type - for example it might be a list, tuple, string, dictionary. Each different type of object has its own characteristics - it has certain methods that you can call, and certain data that it stores.

Once an object has been created, its type cannot be changed.

You can find the type of an object using the type function:

t = type(a)

If you print the value of t you will see:

<class 'list'>

This indicates that the object is of type list. If you want to check the type of an object you can use code like this:

if type(a) == list:
    print("a is a list")
elif type(a) == tuple:
    print("a is a tuple")
elif type(a) == str:
    print("a is a string")

You can use any other Python class in place of list, tuple or str.
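
A common alternative, shown here only as a sketch, is the built-in isinstance function. Comparing type(a) == list matches lists exactly, whereas isinstance also accepts objects whose class inherits from list. The class name MyList is made up for this illustration.

```python
class MyList(list):      # hypothetical subclass, for illustration only
    pass

m = MyList([1, 2, 3])

exact_match = type(m) == list                # the type is MyList, not list
instance_match = isinstance(m, list)         # MyList is a subclass of list
tuple_or_str = isinstance(m, (tuple, str))   # a tuple of types means "any of these"
print(exact_match, instance_match, tuple_or_str)   # False True False
```

Which check is appropriate depends on whether subclasses should count as the type you are testing for.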

In Python, even numbers are stored as objects, and have a type. For example, 1 has type int (type returns <class 'int'>):

t = type(1)
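
The same check works for other kinds of number. This short sketch also uses isinstance with object, the base class of every Python type, to confirm that a number really is an object.

```python
int_type = type(1)
float_type = type(2.5)
print(int_type)      # <class 'int'>
print(float_type)    # <class 'float'>

# object is the base class of all types, so every value is an instance of it
number_is_object = isinstance(1, object)
print(number_is_object)   # True
```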

Identity

Every object has a unique identity. You can find the identity of an object using the id function:

i = id(a)

The identity is just a number. The actual value has no real significance; all you need to know is:

No two objects that exist at the same time share an id value
Once an object has been created, its id never changes

Notice that the id has nothing to do with the variable which is associated with the object. Try this code:

x = (1, 2, 3)
y = x
print(id(x), id(y))

Variables x and y both reference the same object (the tuple (1, 2, 3)). In Python-speak we say that the names x and y are both bound to the same tuple object. So id(x) and id(y) both return the same value - the identity of the tuple (1, 2, 3).
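
The is operator offers a more direct way to ask the same question: x is y is true exactly when id(x) == id(y). The sketch below also shows that rebinding one name leaves the other name, and the original object, untouched.

```python
x = (1, 2, 3)
y = x                     # y is bound to the same object as x
same_before = x is y      # equivalent to id(x) == id(y)

y = [1, 2, 3]             # rebind y to a newly created list
same_after = x is y       # the names now refer to different objects
print(same_before, same_after)   # True False
print(x)                         # (1, 2, 3)
```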

Data

Every object has its own particular set of data that it stores. A list object stores a list of values. A string contains a text string. A File object might contain information about a particular file on your hard drive.

With most objects, you can change the data using:

methods of the object's class
or built-in functions and statements
or operators

Not all types of object support all these methods. Here is an example of some things you can do with a list:

a = [1, 2, 3]
a.append(4)       # Using the append method of the list class
del a[1]          # Using the del statement (del is a statement, not a function)
a[0] = -1         # Using the [] operator

The important thing to remember is that changing the data in an object does not change its identity (or its type, of course). You can take a list and add or remove values; it is still the same list object, just with different data.
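
This can be checked by recording the id before making the changes. As a contrast, the last few lines of this sketch build a new list with the + operator, which creates a different object instead of modifying the existing one.

```python
a = [1, 2, 3]
id_before = id(a)

a.append(4)
del a[1]
a[0] = -1
print(a)                     # [-1, 3, 4]

same_identity = id(a) == id_before
print(same_identity)         # True: same object, different data

# By contrast, + builds a brand new list object
b = a
a = a + [5]
rebound = a is b
print(rebound)               # False: a now names a different list
print(b)                     # [-1, 3, 4]: the original list is unchanged
```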

Everything is an object

In Python, everything is an object. Some of these things seem weird or surprising at first, but it all makes sense.

As noted above, numbers are objects. If Python is your first language, this might not seem surprising - if strings are objects, why not numbers? But in languages such as C, Java or Pascal, numbers are handled as raw primitive values rather than objects, for efficiency reasons. With modern computers, saving a few bytes here and there is no longer that important most of the time, so Python and some other languages treat numbers as objects.

A function is also an object. This is the basis of functional programming, a particular style of programming that manipulates functions in a similar way to data.
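
As a small sketch of what this means in practice, a function can be bound to a second name, inspected with type, and passed to another function just like a list or a string. The function name double is made up for this example.

```python
def double(n):             # hypothetical example function
    return n * 2

f = double                 # bind a second name to the same function object
print(type(double))        # <class 'function'>
print(f is double)         # True

# Pass the function to the built-in map, like any other value
result = list(map(double, [1, 2, 3]))
print(result)              # [2, 4, 6]
```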

None, which is used to indicate no value, is an object. Its type is NoneType. The NoneType has no data, which means that you don't really need more than one None object - every None is the same as any other. This is called a singleton - every time you type None you will get the same object:

u = None
v = None
print(id(u), id(v))    #The same id, there is only one None object.

True and False are objects of type bool. They are the only two instances of bool, so every True is the same object as any other True, and likewise for False.

The type values we met earlier (str, list, tuple, etc.) are also objects, of type type. Since type is itself a type, type is an object too (also of type type). Each built-in type is represented by a single type object, so the name list always refers to the same object.
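
These relationships can be checked interactively. The following sketch simply asks Python for the type of each of the objects discussed above; the is None test on the second line is the usual idiom for checking whether a value is None.

```python
u = None
is_none = u is None            # identity test against the single None object
print(type(None))              # <class 'NoneType'>
print(type(True))              # <class 'bool'>

type_of_list = type(list)      # the type of a type object
type_of_type = type(type)      # type is its own type
print(type_of_list, type_of_type)   # <class 'type'> <class 'type'>

same_true = bool(1) is True    # bool() returns the existing True object
print(same_true)               # True
```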

Literals as objects

A literal can also be used as an object. For example, here is some code to convert a letter a-f into a number between 0 and 5:

c = 'b'
s = 'abcdef'
n = s.index(c)     #gives value 1

This can be written as:

c = 'b'
n = 'abcdef'.index(c)
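
The same idea applies to other kinds of literal. A few more illustrative cases are sketched below; note that an integer literal needs parentheses before the dot, otherwise Python tries to read the dot as a decimal point.

```python
parts = 'a,b,c'.split(',')        # method call on a string literal
joined = '-'.join(['x', 'y'])     # the separator is a string literal
count = [1, 2, 2, 3].count(2)     # method call on a list literal
bits = (255).bit_length()         # parentheses keep the dot from reading as a decimal point
print(parts, joined, count, bits) # ['a', 'b', 'c'] x-y 2 8
```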

Garbage collection

We have seen how to create objects. But every object takes up some space in the computer memory, and memory is finite (although often very large these days). How do we get rid of objects when we no longer need them?

Don't worry, Python takes care of this, automagically. Consider this simplified code:

x = [1, 2, 3]
print(x)
x = 'abc'

The first line creates a list and assigns it to variable x. We print it.

Then we create a string, and assign that to variable x.

What happens to our original list? Well, it is still there, using up a little bit of memory, but variable x no longer points at it. In fact, nothing points at it any more, and it is no longer accessible to Python. Think about it - what Python code would you write if you wanted to read the list after reassigning x? You can't do it, because your code no longer knows where the list is.

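Python keeps track of this, and other slightly more complex scenarios, recording any objects which are no longer accessible, and at some convenient point it deletes them and frees the memory for re-use. In CPython, the standard implementation, an object is normally freed as soon as the last reference to it disappears, with a separate collector handling objects that refer to each other in a cycle.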
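
It is possible to watch this happen using the standard weakref module, which lets you refer to an object without keeping it alive. This is only a sketch: the class Node is made up for the example (plain lists cannot be weakly referenced), and the exact moment of deletion depends on the Python implementation. The result shown is what CPython produces.

```python
import gc
import weakref

class Node:                      # hypothetical class, for illustration only
    pass

x = Node()
watcher = weakref.ref(x)         # a weak reference does not keep the object alive
alive_before = watcher() is x    # the object can still be reached through x

x = 'abc'                        # rebind x; nothing refers to the Node any more
gc.collect()                     # ask the collector to run, in case it has not already
alive_after = watcher() is not None   # a dead weak reference returns None
print(alive_before, alive_after)      # True False (on CPython)
```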

Duck typing

Although every object has a type, Python isn't always too concerned with exactly what type an object is. It is often more important to know if an object is capable of being used in the specific situation. For example consider this simple function:

def print_list(k):
    for x in k:
        print(x)

This prints the elements of a list, one per line. But perhaps we should add a test at the start of the function, to check that k is actually a list?

Actually, that wouldn't be a good idea at all. You see, k doesn't need to be a list. The function would work perfectly well if k was a tuple, or a set. In fact, k could be an open file object (the loop would print the lines of the file), or a generator, or some user defined class which supports iteration.

The important thing is not what type the object is, but whether it can be iterated in a for loop. This is called "duck typing" from the expression:

If it looks like a duck, and swims like a duck, and quacks like a duck, it's a duck

In our example, if you can iterate over it, the function will work. We don't care what the type of the object is.
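
To illustrate, here is the same function called with several different kinds of object. The class Countdown is a made-up example of a user-defined class that supports iteration by providing an __iter__ method.

```python
def print_list(k):
    for x in k:
        print(x)

class Countdown:                      # hypothetical user-defined iterable
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Delegate to a range that counts down from start to 1
        return iter(range(self.start, 0, -1))

print_list((1, 2, 3))                 # a tuple
print_list('ab')                      # a string: prints one character per line
print_list(n * n for n in range(3))   # a generator: prints 0, 1, 4
print_list(Countdown(3))              # prints 3, 2, 1

countdown_values = list(Countdown(3))
```

None of these objects is a list, yet print_list handles each of them without any changes.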

Another part of the Python philosophy is EAFP:

It is easier to ask forgiveness than permission

This means that, rather than testing the object to try to detect whether you can loop over it, you just go ahead and run the for loop. If it works, it works. If not, it will raise an exception which your code can then catch and handle.
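
A minimal sketch of this approach is shown below. The wrapper name safe_print_list is made up for the example. It relies on the fact that trying to loop over a non-iterable object, such as an integer, raises a TypeError.

```python
def print_list(k):
    for x in k:
        print(x)

def safe_print_list(k):               # hypothetical wrapper, for illustration only
    try:
        print_list(k)                 # just try it
        return True
    except TypeError:
        # Raised when k does not support iteration
        print("cannot iterate over", repr(k))
        return False

ok = safe_print_list([1, 2, 3])       # prints 1, 2, 3
failed = safe_print_list(42)          # prints: cannot iterate over 42
print(ok, failed)                     # True False
```

One caveat with this sketch is that the except clause would also catch a TypeError raised by code inside the loop, so in larger functions it is worth keeping the try block as small as possible.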