In the previous post we have seen, how to install xlrd module. If you already have xlrd module installed on your computer then go ahead with this section other wise install the module having a look from here: The xlrd python module for reading excel spreadsheets - Installation.
Eager to get started?This post will give you a good introduction to get started with the xlrd.
First make sure that
A cell is a box which contains only one entry or a single piece of data.
To find the data types of the row:
To get the list of merged cells in the spreadsheet:
As always I have tried to write each and every part in this post so that it is easy for everyone to understand. But if you have any doubt or didn't understand anything then please do comment in the comment box below. I will reply to you as soon as possible. You can contact me if you want to from here: Contact me
Also please do comment on how I can improve this post so that it will be helpful even for beginners. Please don't hesitate to comment if I have made any typo or you want me to add anything to this post.
Eager to get started?This post will give you a good introduction to get started with the xlrd.
First make sure that
- xlrd is installed on your computer.
$ pip freeze
. Check ifxlrd
is present in the list. - xlrd is up-to-date. In IDLE
xlrd.info.__VERSION__
to see the version and check if you are up-to-date.
Lets get started with something simple.
Terminology
Excel files are also called as workbooks or simply books. Also there can be n number of sheets in a given workbook.
Sheets in Excel File or workbook |
Cell in Excel Worksheet |
Column in excel worksheet |
Working with workbook
Lets start with a small python script(script#1).
If you will run the above script then you will get the following output:
So what actually happened in the code?
import xlrd #Opening the Workbook workbook = xlrd.open_workbook('roc.xlsx') #Opening the workbook based on the sheet name sheet = workbook.sheet_by_name('Sheet1') #Reading the value of a cell #sheet.cell_value(row,col) cell_2_3 = sheet.cell_value(2,3) #Print the cell value print(cell_2_3)
3.0
So what actually happened in the code?
- First we have imported the
xlrd
module. - We have opened our workbook using
open_workbook
. - We then opened the sheet in the workbook using its name with the help of
sheet_by_name
- After we have opened the sheet we have accessed the value of the cell in row 2 and column 3 using
sheet.col_value(rowx,coly)
- Finally we have printed the value of the variable
cell_2_3
.
As simple as it looks.
Now we will see some functions in details so that by the end of this post you will be comfortable to program with xlrd.
Now we will see some functions in details so that by the end of this post you will be comfortable to program with xlrd.
open_workbook
When you will call this function with theon_demand
argument then the user will save time and memory by only loading the sheets which are required and releasing the sheets that are not required.- By default
on_demand = False
. - If you will use
on_demand = True
with BIFF <5.0 then you will get an error message andon_demand
is changed to the defaultFalse
- If you will use
on_demand = True
with BIFF version ≥ 5.0 then all the global data is loaded and when the users requests a required sheets, the required sheet is opened and all the other sheets are released, thus saving memory and time.
open_workbook
function as follows:
xlrd.open_workbook(filename='yourfilename.xls', ondemand=True)
An another important argument you need to know is the
Lets dig deep... Open your interpreter and we will see few other important things that we can do with the book object that we have created earlier. Let's start again from scratch.
Opening the workbook
To see who created the workbook(or the author of the workbook):
To see the number of sheets the workbook posses:
To see the names of the sheets:
To open the sheet based on the name of the sheet:
Sheets in excel sheets are indexed as 0,1,2,3... corresponding to first sheet, second sheet, third sheet... respectively. So to open the sheet based on the index:
To check if the sheet is loaded:
To Unload the sheet which we have loaded or opened:
Even though we have loaded the workbook all our data is stored in the form of sheets and thus we will work on sheets now.
But unlike just calling the function we will store the value returned by the function
See the name of the sheet:
Check if the sheet is visible:
1 means hidden
Find the number of columns(Total number of vertical columns in the sheet):
Find the number of rows in the sheet(Total number of horizontal rows in the sheet):
To get the length of a given row:
Get the cell value:
Get the cell value including the data type:
You can also access the value of the cell using
To find the data type of the cell:
To get all the column values:
Get the values of the column after slicing:
To get the column values along with the data type:
Find the data types of the column:
To get the row values:
To get the row values along with the data type:
Get the values of the row after slicing:
encoding_override
. If you have read the introduction, then we have discussed that xlrd
by default uses Unicode encoding. But when your excel file is not having the Unicode encoding then xlrd
will try to convert the encoding your excel file has to Unicode. But sometimes python(xlrd) will not be able to convert all the encoding to Unicode and it will raise an error. At that time you will have to add this argument to the open_workbook
function and you will be good to go.book = xlrd.open_workbook(filename='yourfilename.xls',encoding_override="cp1252")
Lets dig deep... Open your interpreter and we will see few other important things that we can do with the book object that we have created earlier. Let's start again from scratch.
>>> import xlrd
Opening the workbook
>>> book = xlrd.open_workbook(filename='roc.xlsx', on_demand=True)
To see who created the workbook(or the author of the workbook):
>>> book.user_name
To see the number of sheets the workbook posses:
>>> book.nsheets
To see the names of the sheets:
>>> book.sheet_names()
To open the sheet based on the name of the sheet:
>>> book.sheet_by_name('Name of sheet')
Sheets in excel sheets are indexed as 0,1,2,3... corresponding to first sheet, second sheet, third sheet... respectively. So to open the sheet based on the index:
>>> book.sheet_by_index(0/1/2....)
To check if the sheet is loaded:
>>> book.sheet_loaded(0/1/2..) #Check the loaded sheet by index >>> book.sheet_loaded('Sheet name') #Check the loaded sheet by its name
To Unload the sheet which we have loaded or opened:
>>> book.unload_sheet(0/1/2...) #Based on index >>> book.unload_sheet('Sheet name') #Based on its name
Even though we have loaded the workbook all our data is stored in the form of sheets and thus we will work on sheets now.
But unlike just calling the function we will store the value returned by the function
sheet_by_index
or sheet_by_name
after loading the sheet. See the following:>>> sheet = book.sheet_by_index(0) #Enter the index of the sheetor
>>> sheet = book.sheet_by_name('Sheet1') #Enter the name of the sheet
See the name of the sheet:
>>> sheet.name
Check if the sheet is visible:
>>> sheet.visibility0 means visible
1 means hidden
Find the number of columns(Total number of vertical columns in the sheet):
>>> sheet.ncols
Find the number of rows in the sheet(Total number of horizontal rows in the sheet):
>>> sheet.nrows
To get the length of a given row:
>>> sheet.row_len(row_number)
Get the cell value:
>>> sheet.cell_value(row_number, column_number)
Get the cell value including the data type:
>>> sheet.cell(row_num, col_num)
You can also access the value of the cell using
sheet.cell
method as follows:
>>> sheet.cell(row_num,col_num).value
To find the data type of the cell:
>>> sheet.cell_type(row_num,col_num)But if you will call the above function you will only get numbers which correspond to the following table:
Number | Data type |
---|---|
0 | Empty String |
1 | String which is not empty |
2 | float - number |
3 | Date as a float number |
4 | Boolean Where 1 - True; 0 - False |
5 | int - number |
To get all the column values:
>>> sheet.col_values(col_num)
Get the values of the column after slicing:
>>> sheet.col(col_num)
To get the column values along with the data type:
>>> sheet.col_slice(col_num, start_rowx=some-value, end_rowx=some-value)If the values of
start_row
and end_row
are not given the default values are assumed. The default values are start_rowx=0
, end_rowx=None
. So if you will call the function without passing all the parameters, then you will get all the values of the column.
>>> sheet.col_slice(col_num) #works sames as sheet.col_values(col_num)
Find the data types of the column:
>>> sheet.col_types(col_num)Running the above code will give you a list with numbers. See table 1 to find the data type of each and every element in the column.
To get the row values:
>>> sheet.row_values(row_num)
To get the row values along with the data type:
>>> sheet.row(row_num)
Get the values of the row after slicing:
>>> sheet.row_slice(row_num)
row_slice
works similar to col_slice
. But here the optional parameters are start_colx
and end_colx
which by default are equal to 0
and None
respectively.To find the data types of the row:
>>> sheet.row_types(row_num)
To get the list of merged cells in the spreadsheet:
>>> sheet.merged_cellsAccording to official documentation this will only work if
open_workbook()
which we call to open the workbook, is called with parameter formatting_info=True
. But when I tried it without passing the argument value to True
, it worked! Summary
We have seen all the basic functions and operations which we can do with the xlrd module. Using xlrd we can also go further and get all the UI info of the sheet. If you want me to add about the xf class for getting all the UI information, please do comment in the comment box below and I will be glad to add it in this post.As always I have tried to write each and every part in this post so that it is easy for everyone to understand. But if you have any doubt or didn't understand anything then please do comment in the comment box below. I will reply to you as soon as possible. You can contact me if you want to from here: Contact me
Also please do comment on how I can improve this post so that it will be helpful even for beginners. Please don't hesitate to comment if I have made any typo or you want me to add anything to this post.
"Gaining knowledge is the first step to wisdom,Thank you, have a nice day :)
Sharing knowledge is the first step to humanity. "
References: