How to fetch specific columns by Pandas read_excel method
- Python Pandas library has a read_excel method to load the Workbook in Excel.
- By default, it fetches the first sheet and all its data.
- By using usecols parameter, you may specify which columns to get data from the specified sheet.
- Also, you may specify ranges like B:E, or even scattered A,C,E:F.
- In this tutorial, we will show you how to pick up one or more columns among four columns from our sample sheet.
An example of fetching and displaying the Product Name column
This is a sample sheet that we will work on to show the examples of usecols parameter in read_excel method of Pandas:
Fetching the “Product Name” column only.
Code:
import pandas as pd_cols #Return only specific columns dt_cols = pd_cols.read_excel('test_Excel.xlsx', index_col=0, usecols=['Product Name']) #Display columns data print(dt_cols)
Output:
Empty DataFrame
Columns: []
Index: [Wheat, Rice, Sugar, Maze, Tea Set, Dinner Set, Rice, Plates, Tea Spoon, Table Spoon]
Fetch and display two columns example
In this example, we will fetch the Product Name and Status columns from our sample sheet.
Code:
import pandas as pd_cols # Return only specific columns dt_cols = pd_cols.read_excel('test_Excel.xlsx', usecols=['Product Name', 'Status']) #Display columns data print(dt_cols)
Output:
This time it also added an index column as we omitted index_col=0 parameter.
Referring column number example
You may also pass column numbers that are a list of int type to fetch specific columns:
In that case, it’s 0 based index that is:
- 0 = Product ID
- 1 = Product name
- 2= Price
- 3 = Status
The program below returns columns 1, 2, and 3 from our sample sheet:
import pandas as pd_cols #Return only columns by 0-based index dt_cols = pd_cols.read_excel('test_Excel.xlsx',index_col=0, usecols=[1,2,3]) #Display columns data print(dt_cols)
Output:
Using column headers like A, B, C example
You may also specify column header A, B, C, or column range A:E, etc. in the usecols parameter.
See an example below:
import pandas as pd_cols #Return columns range dt_cols = pd_cols.read_excel('test_Excel.xlsx',index_col=0, usecols="B:D") #Display columns data print(dt_cols)
Output:
Or non-adjacent range:
import pandas as pd_cols #Return columns range dt_cols = pd_cols.read_excel('test_Excel.xlsx',index_col=0, usecols="A:B,D") #Display columns data print(dt_cols)
Output: