Accessing Excel Data by using openpyxl
In previous tutorials, we learned how to create a Workbook in Excel by using openpyxl. We also learned to create Sheets and give them custom names.
In this tutorial, you are going to learn how to access Excel data by using openpyxl – a Python library for working with Excel.
Accessing Workbook and Sheet Example
In order to access data of the specific sheet, you first need to load the Workbook and specify the active sheet. This is important because if your Workbook has more than one sheet then how to tell which one to access data from?
For our examples, we have the following Workbook:
- Worksheet name: test_Excel.xlsx
- Sheet 1 Name: Sales
- Sheet 2 Name: Product Information
Image of Workbook:
This is how you may access the Workbook and then access the active sheet name:
The code:
# Accessing Excel Data # excel-python.py from openpyxl import load_workbook #Loading required workbook wb_access = load_workbook('test_Excel.xlsx') #Accessing Active Worksheet sheet = wb_access.active print (sheet)
Output:
You can see in the above image, that “Product Information” is our second sheet, however, as it is active so this is returned as the output.
Accessing all sheet names
Before accessing the specific cell data, let us first look at how to access all sheet names in the loaded Excel Workbook.
For that, simply use the sheetnames method as follows:
The code:
# Accessing Excel Data # excel-python.py from openpyxl import load_workbook #Loading required workbook wb_access = load_workbook('test_Excel.xlsx') #Accessing All Sheet names print (wb_access.sheetnames)
Output:
You can see, it matches the names in our sample Workbook above.
Accessing a specific cell data example
Now let us look at how to access specific cell(s) data in our specified sheet.
First, let’s access the A1 Cell data from our “Product Information” Sheet. Look at the code below:
The Code:
# Accessing Excel Data # excel-python.py from openpyxl import load_workbook wb_access = load_workbook('test_Excel.xlsx') sheet = wb_access.active print (sheet) print (sheet.title) print (sheet["A1"]) #Accessing A1 Cell Data print (sheet["A1"].value)
Output:
<Worksheet “Product Information”>
Product Information
<Cell ‘Product Information’.A1>
Product ID
So, to just access the value of the specific cell, you need to use the value i.e.
print (sheet[“A1”].value)
and it returned “Product ID” which is our first header in the “Product Information” worksheet, as shown below:
If you only specified cell name without value i.e.
print (sheet[“A1”])
It returned <Cell ‘Product Information’.A1> i.e. the main cell object rather than the cell value.
Using row and column numbers to access data
Rather than using A1, A2, B1, B2 etc. cell names, you may also provide row and column numbers to access cell data.
See the example below where we accessed B6 cell value. This is “Tea Set” and its column and row number are:
row=6,
column=2
Let us access the value in both ways and see the result:
Code:
# Accessing Excel Data # excel-python.py from openpyxl import load_workbook wb_access = load_workbook('test_Excel.xlsx') sheet = wb_access.active #Accessing A1 Cell Data by cell print (sheet["B6"].value) #Accessing A1 Cell Data by row and col numbers print (sheet.cell(row=6, column=2).value)
Output:
Tea Set
Tea Set
Now you can compare the result with our sample sheet below:
So the result is the same in both cases – you may use any of the ways to access data.
Accessing all header names example
The following example shows how to access all headers in our example Workbook. As you have seen above, our Product Information sheet contains four headers.
In the example below, we will access all four header names:
# Accessing Excel Data # excel-python.py from openpyxl import load_workbook wb_access = load_workbook('test_Excel.xlsx') sheet = wb_access.active for row in sheet.iter_rows(min_row=1, max_col=4, max_row=1, values_only=True): print(row)
Output:
If you omit the last values_only=True see the output:
The code:
from openpyxl import load_workbook wb_access = load_workbook('test_Excel.xlsx') sheet = wb_access.active for row in sheet.iter_rows(min_row=1, max_col=4, max_row=1): print(row)
Output:
So let us first look at what all these are in our above code:
- iter_rows: is the method of Worksheet that is used to iterate a row.
- min_row: is its parameter to tell the minimum row number. In our example, it means row 1.
- max_row: is also self-explanatory. We only wanted to fetch headers, so we gave 1
- max_col: is to specify up to which column we need to get data. As we have four columns, so we specified 4.
Fetching more rows with values example
As we have total of 11 rows in our sample Excel sheet, so we will give max_row=11 (as such index starts at 1).
So, it will return all data from our sample sheet. See the code and output below:
# Accessing Excel Data # excel-python.py from openpyxl import load_workbook wb_access = load_workbook('test_Excel.xlsx') sheet = wb_access.active #Fetching all data for row in sheet.iter_rows(min_row=1, max_col=4, max_row=11, values_only=True): print(row)
Output:
(‘Product ID’, ‘Product Name’, ‘Price’, ‘Status’)
(‘p-001’, ‘Wheat’, 3.5, ‘In Stock’)
(‘p-002’, ‘Rice’, 5.5, ‘In Stock’)
(‘p-003’, ‘Sugar’, 2, ‘In Stock’)
(‘p-004’, ‘Maze’, 1.5, ‘In Stock’)
(‘p-005’, ‘Tea Set’, 5.55, ‘In Stock’)
(‘p-006’, ‘Dinner Set’, 25.99, ‘Out of Stock’)
(‘p-007’, ‘Rice’, 1.2, ‘Out of Stock’)
(‘p-008’, ‘Plates’, 7, ‘In Stock’)
(‘p-009’, ‘Tea Spoon’, 8, ‘Out of Stock’)
(‘p-0010’, ‘Table Spoon’, 10, ‘In Stock’)
Iterating all rows
If you don’t know how much data an Excel sheet has, you may want to iterate through all rows in the sheet.
This is also a shortcut of iter_rows to get the whole dataset:
from openpyxl import load_workbook wb_access = load_workbook('test_Excel.xlsx') sheet = wb_access.active for row in sheet.rows: print(row)
Output:
(<Cell ‘Product Information’.A1>, <Cell ‘Product Information’.B1>, <Cell ‘Product Information’.C1>, <Cell ‘Product Information’.D1>)
(<Cell ‘Product Information’.A2>, <Cell ‘Product Information’.B2>, <Cell ‘Product Information’.C2>, <Cell ‘Product Information’.D2>)
(<Cell ‘Product Information’.A3>, <Cell ‘Product Information’.B3>, <Cell ‘Product Information’.C3>, <Cell ‘Product Information’.D3>)
(<Cell ‘Product Information’.A4>, <Cell ‘Product Information’.B4>, <Cell ‘Product Information’.C4>, <Cell ‘Product Information’.D4>)
(<Cell ‘Product Information’.A5>, <Cell ‘Product Information’.B5>, <Cell ‘Product Information’.C5>, <Cell ‘Product Information’.D5>)
(<Cell ‘Product Information’.A6>, <Cell ‘Product Information’.B6>, <Cell ‘Product Information’.C6>, <Cell ‘Product Information’.D6>)
(<Cell ‘Product Information’.A7>, <Cell ‘Product Information’.B7>, <Cell ‘Product Information’.C7>, <Cell ‘Product Information’.D7>)
(<Cell ‘Product Information’.A8>, <Cell ‘Product Information’.B8>, <Cell ‘Product Information’.C8>, <Cell ‘Product Information’.D8>)
(<Cell ‘Product Information’.A9>, <Cell ‘Product Information’.B9>, <Cell ‘Product Information’.C9>, <Cell ‘Product Information’.D9>)
(<Cell ‘Product Information’.A10>, <Cell ‘Product Information’.B10>, <Cell ‘Product Information’.C10>, <Cell ‘Product Information’.D10>)
(<Cell ‘Product Information’.A11>, <Cell ‘Product Information’.B11>, <Cell ‘Product Information’.C11>, <Cell ‘Product Information’.D11>)
(<Cell ‘Product Information’.A12>, <Cell ‘Product Information’.B12>, <Cell ‘Product Information’.C12>, <Cell ‘Product Information’.D12>)