Sorting Pandas Data Frame by sort_values method

How to Sort Data Frames in Pandas

Pandas data frame has a sort_values method that is used to sort the data frame’s data in ascending or descending order.

The sorting can be based on column labels.

The sort_values method has different parameters as shown in the syntax below.

Syntax

DataFrame.sort_values(by, *, axis=0, ascending=True, inplace=False, kind=’quicksort’, na_position=’last’, ignore_index=False, key=None)

Let us show you examples of using sort_values with its parameters below.

An example of sorting results by a column label in ascending order

In the Python program below, we created a data frame with three columns and five rows of data.

Then we displayed the data frame in its original form.

This is followed by using the sort_values method where we only specified the column label to sort the results.

Python program:

import pandas as pd

#A list to be used for Data Frame

emp_data = [  ['Emp1', "Mike", 5600],

              ['Emp2', "Michelle", 4750],

              ['Emp3', "Ben", 6500],

              ['Emp4', "Shabee", 3600],

              ['Emp5', "Mina", 3250]

            ]

#Creating data frame

df_emp = pd.DataFrame (emp_data, columns = ['ID', 'Name', 'Salary'])


#Display data frame before sorting

print("DF in Orginal Order")

print(df_emp)


#Display data frame after sorting

print("===================")

print("DF After Sorting")

print(df_emp.sort_values(by='Name'))

Output:

DF in Orginal Order

     ID      Name  Salary

0  Emp1      Mike    5600

1  Emp2  Michelle    4750

2  Emp3       Ben    6500

3  Emp4    Shabee    3600

4  Emp5      Mina    3250

===================

DF After Sorting

     ID      Name  Salary

2  Emp3       Ben    6500

1  Emp2  Michelle    4750

0  Emp1      Mike    5600

4  Emp5      Mina    3250

3  Emp4    Shabee    3600

You can see, the results are sorted by Employee names in the data frame.

Sorting data frame in descending order

As it can be seen in the syntax, the default value for sorting order is ascending i.e.

ascending=True

By using the False value, you may get the data sorted in descending order.

See the example below where we used the same data frame and sorted results in descending order.

Program:

import pandas as pd

#A list to be used for Data Frame

emp_data = [  ['Emp1', "Mike", 5600],

              ['Emp2', "Michelle", 4750],

              ['Emp3', "Ben", 6500],

              ['Emp4', "Shabee", 3600],

              ['Emp5', "Mina", 3250]

            ]

#Creating data frame

df_emp = pd.DataFrame (emp_data, columns = ['ID', 'Name', 'Salary'])

#Display data frame before sorting

print("DF in Orginal Order")

print(df_emp)


#Display data frame after ascending=False

print("===================")

print("DF After Sorting in Descending Order")

print(df_emp.sort_values(by='Name', ascending=False))

Output:

pandas-df-sort

The example of sorting by salary column

The example below sorts the result in descending order based on the salary column:

import pandas as pd

#A list to be used for Data Frame

emp_data = [  ['Emp1', "Mike", 5600],

              ['Emp2', "Michelle", 4750],

              ['Emp3', "Ben", 6500],

              ['Emp4', "Shabee", 3600],

              ['Emp5', "Mina", 3250]

            ]

#Creating data frame

df_emp = pd.DataFrame (emp_data, columns = ['ID', 'Name', 'Salary'])


#Sort results by Highest to loweset salary

print(df_emp.sort_values(by='Salary', ascending=False))

Result:

     ID      Name  Salary

2  Emp3       Ben    6500

0  Emp1      Mike    5600

1  Emp2  Michelle    4750

3  Emp4    Shabee    3600

4  Emp5      Mina    3250

Sort result by two columns example

You may also sort the results by providing two or more columns.

The example below sorts the results by name and salary columns in our example data frame:

import pandas as pd

#A list to be used for Data Frame

emp_data = [  ['Emp1', "Mike", 5600],

              ['Emp2', "Michelle", 4750],

              ['Emp3', "Ben", 6500],

              ['Emp4', "Shabee", 3600],

              ['Emp5', "Mina", 3250]

            ]

#Creating data frame

df_emp = pd.DataFrame (emp_data, columns = ['ID', 'Name', 'Salary'])


#Sort results by multiple columns

print(df_emp.sort_values(by=['Name', 'Salary'], ascending=False))

Output:

     ID      Name  Salary

3  Emp4    Shabee    3600

4  Emp5      Mina    3250

0  Emp1      Mike    5600

1  Emp2  Michelle    4750

2  Emp3       Ben    6500

Using ignore_index parameter example

By default, the ignore_index= False. That means the index column numbers remain in place after sorting the results. In the above example output, you can see 3, 4, 0, 1, 2, and the numbers that are for the original DF.

If you may ignore_index=True then the resulting axis after sorting is labeled as 0,1, 2…

See the difference in the example below:

import pandas as pd

#A list to be used for Data Frame

emp_data = [  ['Emp1', "Mike", 5600],

              ['Emp2', "Michelle", 4750],

              ['Emp3', "Ben", 6500],

              ['Emp4', "Shabee", 3600],

              ['Emp5', "Mina", 3250]

            ]

#Creating data frame

df_emp = pd.DataFrame (emp_data, columns = ['ID', 'Name', 'Salary'])

print("Sort results with ignore_index=False (default value)")

print(df_emp.sort_values(by=['Name', 'Salary'], ignore_index=False))


print("Sort results with ignore_index=True")

print(df_emp.sort_values(by=['Name', 'Salary'], ignore_index=True))

Output:

pandas-df-sort-index

Reference:

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_values.html

Author - Atiq Zia

Atiq is the writer at jquery-az.com, an online tutorial website started in 2014. With a passion for coding and solutions, I navigate through various languages and frameworks. Follow along as we solve the mysteries of coding together!