Python informer

Improve your Python coding skills

CSV files

Introduction

In this lesson we will look at how to use CSV (Comma Separated Values) files in Python:

  • What is CSV?
  • Reading data
  • Writing data

What is CSV?

A CSV file, or comma separated values file, is a plain text file for storing structured data. It stores data records as the rows of a table, which makes it a common format for saving spreadsheet data.

The idea of a CSV file is that each line of text stores several values, like a single row of a spreadsheet. For example, here is a CSV containing information about capital cities - the name, the country and the approximate population in millions (in 2017):

Beijing, China, 21
New Delhi, India, 17
Tokyo, Japan, 13
Manila, Philippines, 15
Moscow, Russia, 12

Spreadsheet programs like Calc or Excel can load CSV files. If you load this information into a spreadsheet, each line becomes a row and each value appears in its own column.

You can also read and write CSV files using Python, as we will see in this section.

Reading data

To read data from a CSV file, you need to open the file in the normal way.

You then use csv.reader to read the data, and be sure to close the file afterwards. You also need to import the csv module:

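import csv

# newline='' lets the csv module handle the line endings itself
f = open('cities.csv', newline='')
# Note: from here on the name csv refers to the reader, not the module
csv = csv.reader(f)

for line in csv:
    print(line)

f.close()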
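As a variation, the same read can be written with a with statement, which closes the file automatically. The following is only a sketch: it writes a shortened three-line sample file first so that it runs on its own, and the names reader and rows are choices made for this example (reader is used so that the csv module name is not overwritten).

```python
import csv

# Create a short sample file first so the example runs on its own.
with open('cities.csv', 'w', newline='') as f:
    f.write('Beijing, China, 21\nNew Delhi, India, 17\nTokyo, Japan, 13\n')

# The with statement closes the file when the block ends,
# even if an error happens inside it.
rows = []
with open('cities.csv', newline='') as f:
    reader = csv.reader(f)
    for line in reader:
        rows.append(line)
        print(line)
```

With this form there is no f.close() to forget.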

Trying the code

To run this code, create a text file in any text editor (such as Windows Notepad or similar), and type in the following lines (or copy and paste them to avoid errors):

Beijing, China, 21
New Delhi, India, 17
Tokyo, Japan, 13
Manila, Philippines, 15
Moscow, Russia, 12

Save the file as cities.csv in your Python home folder (the same folder that your Python files are stored in).

Result

This part of the code loops through the CSV file, one line at a time:

for line in csv:
    print(line)

If you look at the output, you will see that each time through the loop, the variable line contains a list of that line’s values:

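['Beijing', ' China', ' 21']
['New Delhi', ' India', ' 17']
['Tokyo', ' Japan', ' 13']
['Manila', ' Philippines', ' 15']
['Moscow', ' Russia', ' 12']

The CSV reader reads the input file, and automatically splits each line when it sees a comma. It stores each line as a list of strings, so even the population is a string rather than a number. The reader keeps the space that follows each comma in the file, which is why values such as ' China' and ' 21' start with a space.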
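The space after each comma in the file ends up at the start of values such as ' China'. One way to drop it is the skipinitialspace option of csv.reader. The sketch below uses io.StringIO as a stand-in for an open file, with two lines of the sample data, so it runs without cities.csv; the names sample and cleaned are made up for this example.

```python
import csv
import io

# io.StringIO behaves like an open text file, which keeps the example self-contained.
sample = io.StringIO('Beijing, China, 21\nTokyo, Japan, 13\n')

# skipinitialspace=True ignores whitespace immediately after each comma.
reader = csv.reader(sample, skipinitialspace=True)
cleaned = list(reader)

print(cleaned)
# [['Beijing', 'China', '21'], ['Tokyo', 'Japan', '13']]
```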

Formatting the output

Rather than just printing the output as a Python list, we can access the individual elements to format the data.

  • line[0] is the name of the city
  • line[1] is the name of the country for that city
  • line[2] is the approximate population of the city

So we can change our print code to make it easier to read the values:

for line in csv:
    print('City:', line[0],
          'Country:', line[1],
          'Pop:', line[2])

Of course, you could put more effort into improving the display by lining up the columns etc.
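One possible way to line up the columns is to give each value a fixed width in an f-string. This is only a sketch: the widths 12, 14 and 4 are arbitrary choices that happen to fit the sample data, and io.StringIO stands in for the cities.csv file.

```python
import csv
import io

sample = io.StringIO(
    'Beijing, China, 21\n'
    'New Delhi, India, 17\n'
    'Manila, Philippines, 15\n'
)

formatted = []
for line in csv.reader(sample, skipinitialspace=True):
    city, country, pop = line
    # < pads on the right (left aligned), > pads on the left (right aligned)
    text = f'{city:<12}{country:<14}{int(pop):>4}'
    formatted.append(text)
    print(text)
```

Converting the population with int() also turns the string into a number, so it could be used in calculations.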

Writing data

To write data to a CSV file, you need to open the file for writing in the normal way.

You then use csv.writer to write rows of data to the file. Each row of data is stored as a list. The elements of the list are written out to file, separated by commas. Each new row is written on its own line.

import csv

f = open('output.csv', 'w', newline='')
csv = csv.writer(f)

for i in range(5):
    data = [i, i*2, i*10]
    csv.writerow(data)
   
f.close()

Understanding the code

We first open the file for writing:

f = open('output.csv', 'w', newline='')

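This is fairly standard, but notice that we use the parameter newline=''. This is important: the csv writer adds its own line endings, and without newline='' some systems (notably Windows) add an extra carriage return, which shows up as a blank line between each real line in your output file.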

We then set up a csv.writer to write data to the file:

csv = csv.writer(f)

In the loop, we generate some data records. Each record has 3 values, stored in a list. Each record creates one line in the file, and since the loop generates 5 records we will get a 5-line file. We use writerow to write out the data.

for i in range(5):
    data = [i, i*2, i*10]
    csv.writerow(data)

As always, we remember to close the file at the end.
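As a variation on the code above, the rows can be built first and written in one call with writerows, inside a with statement that closes the file automatically. This is a sketch rather than a replacement for the version explained here; the names rows and writer are choices made for this example.

```python
import csv

# Build all the rows first; each row is a list of three values.
rows = [[i, i * 2, i * 10] for i in range(5)]

# The with statement closes the file when the block ends.
with open('output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(rows)  # writes every row in a single call
```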

Trying the code

When you run the code, it will create a file called output.csv in your Python folder (the same place your Python source files are stored). You can open the file in a text editor, such as Windows Notepad. You should see something like this:

0,0,0
1,2,10
2,4,20
3,6,30
4,8,40

The lines are generated by the code:

data = [i, i*2, i*10]

This code just generates some artificial data so we can see the code working:

On the first line, i is zero, so i*2 is zero, and i*10 is zero.
On the second line, i is 1, so i*2 is 2, and i*10 is 10.
etc.
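To check the result, the same data can be written and then read back. The sketch below uses an in-memory io.StringIO buffer in place of output.csv so that it runs without touching any files; the names buffer, writer and records are made up for this example. Because the reader returns every value as a string, int() is used to turn them back into numbers.

```python
import csv
import io

# An in-memory text buffer stands in for the output file.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
for i in range(5):
    writer.writerow([i, i * 2, i * 10])

# Go back to the start of the buffer and read the rows again.
buffer.seek(0)
records = [[int(value) for value in row] for row in csv.reader(buffer)]

print(records)
# [[0, 0, 0], [1, 2, 10], [2, 4, 20], [3, 6, 30], [4, 8, 40]]
```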