GROUP BY clause in SQL

Explore the power of grouping rows and performing aggregate calculations with our comprehensive infographic.

What is GROUP BY clause in SQL?

The GROUP BY is a clause in SQL that is used with the SELECT statement for grouping the rows. The purpose is generally performing aggregate operations like SUM, COUNT, AVG, etc. to the group data.

--Simple GROUP BY
SELECT emp_name, SUM(emp_sal_paid) As "Total Paid"
FROM sto_emp_salary_paid
GROUP BY emp_name

--With HAVING clause
SELECT emp_name, SUM(emp_sal_paid) As "Total Paid"
FROM sto_emp_salary_paid
GROUP BY emp_name
HAVING SUM(emp_sal_paid) >= 5000

A few important points about the GROUP BY clause:

  • The SQL GROUP BY is used with the select statement
  • If WHERE clause is used in the SELECT statement then GROUP BY is used after it.
  • The GROUP BY clause is used before the ORDER BY clause
  • The GROUP BY clause is often used with the aggregate functions.
  • The column(s) used for grouping data must appear in the SELECT statement.
  • You may use one or more columns in the GROUP BY clause for grouping data.

Syntax of using GROUP BY clause

SELECT column1, column2, ..., aggregate_function(column)
FROM table_name
WHERE condition
GROUP BY column1, column2, ...;

In the syntax:

Parameters Description
column1, column2,… Specify the columns there (separated by a comma) for which you want to group the data.
aggregate_function(column) The aggregate function (e.g., COUNT, SUM, AVG, MIN, MAX) is applied to each group.
table_name The name of the table from which to retrieve data.
WHERE Optional. Specifies the condition for filtering the rows before grouping.

A simple example of GROUP BY clause

For the GROUP BY examples, I will use the sto_emp_salary_paid table that stores the employees’ salaries paid every month. You can see the demo table with data in the graphic below.

The first query simply displays the complete table data and the second query uses the SQL GROUP BY query to get the sum of salary paid to each employee:

Query:

SELECT emp_name, SUM(emp_sal_paid) As "Total Paid"

FROM sto_emp_salary_paid

GROUP BY emp_name

SQL GROUP BY

Using WHERE clause with GROUP BY query

In this example, we set the criteria by using the WHERE clause in the above SELECT statement.

So, we will get the SUM of salaries for the employees who got paid 3500+. The records with less than this value will not be included in the SUM function.

See the query and table output and you may compare the result with the above output:

SQL query:

SELECT emp_name, SUM(emp_sal_paid) As "Total Paid"

FROM sto_emp_salary_paid

WHERE emp_sal_paid > 3500

GROUP BY emp_name

SQL GROUP BY WHERE

As mentioned earlier, the GROUP BY clause comes after the Where clause as shown in the above example.

Note: You may execute these example queries in MS SQL server and MySQL server as well. Just adjust a little bit.

Using the HAVING clause in GROUP BY

Let us say we require the records of those employees whose total paid amount is greater than or equal to 5000.

For that, we may calculate the SUM of each employee and then use a criterion in the SQL HAVING clause.

Query:

SELECT emp_name, SUM(emp_sal_paid) As "Total Paid"

FROM sto_emp_salary_paid

GROUP BY emp_name

HAVING SUM(emp_sal_paid) >= 5000

SQL GROUP BY HAVING

Did you know? The HAVING clause can only be used with the GROUP BY. Also, you may use an aggregate function in the HAVING clause. If you try using an aggregate function like SUM in the WHERE clause, it produces an error.

The example of using an ORDER BY clause with GROUP BY

The ORDER BY clause is used to sort the results in ascending or descending order. You may also sort the results fetched after using the SELECT with GROUP BY clause.

The ORDER BY clause is used after the GROUP BY as shown in the query below:

The query:

SELECT SUM(emp_sal_paid) As "Total Salary", emp_name

FROM sto_emp_salary_paid

GROUP BY emp_name

ORDER BY emp_name;

SQL GROUP BY ORDER

The example of using COUNT and GROUP BY clause

The above examples show using the SUM function with GROUP BY clause. As mentioned earlier, you may use any aggregate function with GROUP BY including the COUNT function.

The following query shows using the COUNT for getting the number of salaries paid to each employee in our example table. Also, I used the ORDER BY clause with the DESC keyword in the same query:

Query:

SELECT COUNT(*) As "No of Salaries", emp_name

FROM sto_emp_salary_paid

GROUP BY emp_name

ORDER BY emp_name DESC;

SQL GROUP BY COUNT

You may learn more about the SQL COUNT function in its tutorial.

Using MAX function with GROUP BY example

Let us extend the above example scenario while using the MAX function with the GROUP BY clause. The query will return the total number of salaries for each employee along with the maximum amount paid.

The MAX function is used in the same query; so you may use multiple aggregate functions as well.

Also, for the demo only, I used the ORDER BY clause with two aggregate columns; one with default ASC order and the other with DESC order:

SELECT MAX(emp_sal_paid) As "Maximum Salary",COUNT(*) As "Total Salaries", emp_name

FROM sto_emp_salary_paid

GROUP BY emp_name

ORDER BY MAX(emp_sal_paid), COUNT(*) DESC;

The resultant recordset:

SQL GROUP BY MAX

Using MIN function example

Just a little variation in the above query and this example shows using the MIN function with GROUP BY clause:

The Query:

SELECT MIN(emp_sal_paid) As "Minimum Salary",COUNT(*) As "Total Salaries", emp_name

FROM sto_emp_salary_paid

GROUP BY emp_name

ORDER BY MIN(emp_sal_paid) DESC, COUNT(*) ASC;

SQL GROUP BY MIN

The example of using two columns in GROUP BY

For explaining more than one column usage in the GROUP BY clause, I am going to use the Sales table. The sto_sales table stores the sales based on countries and regions (for the demo only).

In the SELECT – GROUP BY statement, we will get the sum of sales on the basis of:

  • Country
  • Regions

The query should sum the record where it finds the same Country and Regions, unlike the above examples where data were grouped on the basis of a single column. Have a look:

The two columns query:

SELECT Country_of_Sale, Region_of_Sale,SUM(No_of_Sales)

FROM sto_sales

GROUP BY Country_of_Sale,Region_of_Sale;

The output:

SQL GROUP BY Multi-Columns

The example of using GROUP BY ROLLUP

The following example shows using the GROUP BY ROLLUP where two columns are specified. By looking at the complete data and records returned by GROUP BY ROLLUP, you will understand how it works:

The Query:

SELECT Country_of_Sale, Region_of_Sale,SUM(No_of_Sales)

FROM sto_sales

GROUP BY ROLLUP (Country_of_Sale,Region_of_Sale);

SQL GROUP BY ROLLUP

Using GROUP BY CUBE example

The GROUP BY CUBE is used to create groups of all possible combinations of specified columns. This is what the official statement says:

“For GROUP BY CUBE (a, b) the results has groups for unique values of (a, b), (NULL, b), (a, NULL), and (NULL, NULL).”

Have a look at the resultset returned after using it in our demo sales table:

Query:

SELECT Country_of_Sale, Region_of_Sale,SUM(No_of_Sales) As "Sales Total"

FROM sto_sales

GROUP BY CUBE (Country_of_Sale,Region_of_Sale);

The output:

SQL GROUP BY CUBE

Author - Atiq Zia

Atiq is the writer at jquery-az.com, an online tutorial website started in 2014. With a passion for coding and solutions, I navigate through various languages and frameworks. Follow along as we unravel the mysteries of coding together!