Skip to main content

How to read data from excel or spreadsheet file with Python

We all are used to managing data using Excel sheets or spreadsheets, sometimes it becomes necessary for us to use the data stored in excel sheet for some computations using python

In this tutorial we will be reading the data in an excel file using python xlrd module.

According to official documentation at pypi xlrd is:
xlrd module is a library to extract data from Excel sheets or spreadsheet files.
Before we start we will have to install this module so that it can be used in our python script. If you have not yet installed this module on your computer then you can do the following:
$ pip install xlrd
You can also install using easy_install or by downloading the package from pypi, all of them have the same effect.

We will start with IDLE so that we can understand each and every command in the interpreter easily.

We will using roc.xls for this tutorial and the spreadsheet(roc.xls) contains the following data:
Roll NumberNameMarksRank
1Pearson864
2John893
3Habib645
4Venkat982
5Suri1001
First import xlrd module:
>>> import xlrd
Now we will open the workbook:
>>> wb = xlrd.open_workbook('roc.xls')
We will see the list of sheets present in the workbook:
>>> wb = xlrd.open_workbook('roc.xls')
>>> sh = wb.sheet_names()
>>> sh
[u'Sheet1']
Next we will open the sheet(or worksheet) in the spreadsheet.
Opening the sheet by name:
>>> sh = wb.sheet_by_name('Sheet1')
Opening the sheet by index:
>>> sh = wb.sheet_by_index(0)
After we have opened the sheet we will have to read the data as this is our main goal. You can do that in many ways:
Read data from a cell:
>>> sh.cell_value(0,2) #sh.cell_value(row,column)
u'Marks'
>>> sh.cell(1,2).value #sh.cell(row,column).value
86.0
Read data from each row:
>>> sh.row_values(2)
[2.0, u'John', 89.0, 3.0]
Read data from a column:
>>> sh.col_values(2)
[u'Marks', 86.0, 89.0, 64.0, 98.0, 100.0]
Finding the number of columns present in the spreadsheet:
>>> sh.ncols
4
Finding the number of rows present in the spreadsheet:
>>> sh.nrows
6
If we want to get the whole data, then we can use for loop to loop through cells or rows or columns. This can be done as follows:
For loop through cells:
>>> for i in range(sh.nrows):
 for j in range(sh.ncols):
  print sh.cell_value(i,j) #Or can be written as sh.cell(i,j.value)

  
Roll Number
Name
Marks
Rank
1.0
Pearson
86.0
4.0
2.0
John
89.0
3.0
3.0
Habib
64.0
5.0
4.0
Venkat
98.0
2.0
5.0
Suri
100.0
1.0
For loop through rows:
>>> for i in range(sh.nrows):
 print sh.row_values(i)

 
[u'Roll Number', u'Name', u'Marks', u'Rank']
[1.0, u'Pearson', 86.0, 4.0]
[2.0, u'John', 89.0, 3.0]
[3.0, u'Habib', 64.0, 5.0]
[4.0, u'Venkat', 98.0, 2.0]
[5.0, u'Suri', 100.0, 1.0]
For loop through columns:
>>> for i in range(sh.ncols):
 print sh.col_values(i)

 
[u'Roll Number', 1.0, 2.0, 3.0, 4.0, 5.0]
[u'Name', u'Pearson', u'John', u'Habib', u'Venkat', u'Suri']
[u'Marks', 86.0, 89.0, 64.0, 98.0, 100.0]
[u'Rank', 4.0, 3.0, 5.0, 2.0, 1.0]
Usually I will be using the for loop through rows. Now putting all together in a script, what we have learnt:
import xlrd

#Open the workbook
wb = xlrd.open_workbook('roc.xlsx')

#Open the sheet by index
sh = wb.sheet_by_index(0)

#Read data with for loop
for i in range(sh.nrows):
    print sh.row_values(i)
The output for the above code will be as follows:
>>> 
[u'Roll Number', u'Name', u'Marks', u'Rank']
[1.0, u'Pearson', 86.0, 4.0]
[2.0, u'John', 89.0, 3.0]
[3.0, u'Habib', 64.0, 5.0]
[4.0, u'Venkat', 98.0, 2.0]
[5.0, u'Suri', 100.0, 1.0]

I would like to end this tutorial here because this is a beginner tutorial and I don't want to confuse you by introducing you to more and more methods and classes. If you want to learn more then see the further tutorials in which I will be using a few more xlrd module functions.

As always I have tried to explain each and every thing in this post in such a way that it is easy for everyone to understand. But if you haven't understood anything or have any doubt then please do comment in the comment box below. You can also contact me if you are want to. I will reply to you in both the ways.

Please do comment on how I can improve this tutorial such that it will be useful even for very beginners.
"Let us all together improve by sharing our knowledge."
Thank you. Have a nice day :)

Popular posts from this blog

Making a quiz web app with python and flask

Edit : When you are creating a web app with h tml templates, then y ou will have to sa ve the html file in templates folder in the Current Wor ki ng Directory( CWD). If you save the file in the C W D directl y you will get a TemplateNotFound error. Thank you Udhay for pointing it out.   In this post we will create a quiz website using python . I will be using the flask framework . After reading this tutorial you will learn form submission , flask templates , python code in flask templates , shuffling the questions and options with the random module and few others.  Please note that this tutorial is not big as it seems to be. Some of the code has been rewritten to maintain consistency and also font size is somewhat big so that your eyes won't get stressed reading this tutorial. Also the content has not occupied the full width of the page. In this tutorial I am assuming that you are having a very basic understanding of the flask framework . Please refer the documentation

Problem 11 Project Euler Solution with python

Largest product in a grid In the 20×20 grid below, four numbers along a diagonal line have been marked in red. 08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08 49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00 81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65 52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91 22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80 24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50 32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70 67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21 24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72 21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95 78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92 16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57 86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58 19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40 04 52 08 83 97 35 99 16 07

Problem 60 Project Euler Solution with python

Prime pair sets The primes 3, 7, 109, and 673, are quite remarkable. By taking any two primes and concatenating them in any order the result will always be prime. For example, taking 7 and 109, both 7109 and 1097 are prime. The sum of these four primes, 792, represents the lowest sum for a set of four primes with this property. Find the lowest sum for a set of five primes for which any two primes concatenate to produce another prime. This problem is j u st a brute force problem. If you have come here because you don't know the limit upto which you will h ave to gener ate the prime numbers t hen go ahe ad and t r y with 10,000 . When I first start ed solving the problem I chose 1 million(beca use most of the problem s on project E uler have this limit ), but it took very long for the computer to fin d the solution. After searching on the internet then I found many people choosing 10, 000 so I have changed my in put f rom 1 million to 10000 and the output was f ast. He