The GROUP BY clause in SQL

The GROUP BY is a clause in SQL that is used with the SELECT statement for grouping the rows. The purpose is generally performing the aggregate operation like SUM, COUNT, AVG etc. to the group data.

A few important points about GROUP BY clause:

  • The SQL GROUP BY is used with the select statement
  • If WHERE clause is used in the SELECT statement then GROUP BY is used after it.
  • The GROUP BY clause is used before the ORDER BY clause
  • The GROUP BY clause is often used with the aggregate functions.
  • The column(s) used for grouping data must appear in the SELECT statement.
  • You may use one or more columns in the GROUP BY clause for grouping data.

A simple example of GROUP BY clause

For the GROUP BY examples, I will use the sto_emp_salary_paid table that stores the employee’s salaries paid every month. You can see the demo table with data in the graphic below.

The first query simply displays the complete table data and the second query uses the GROUP BY SQL query for getting the sum of salary paid to each employee:

Query:

SQL GROUP BY

Using Where clause with GROUP BY query

In this example, I set the criteria by using the where clause in above SELECT statement. So, we will get the SUM of salaries for the employees who got paid 3500+. The records with less than this value will not be included in the SUM function. See the query and table output and you may compare the result with above output:

SQL query:

SQL GROUP BY WHERE

As mentioned earlier, the GROUP BY clause comes after the Where clause as shown in above example.

Note: You may execute these example queries in MS SQL server and MySQL server as well. Just adjust a little bit.

Using HAVING clause in GROUP BY

Let us say we require the records of those employees whom total paid amount is greater than or equal to 5000.

For that, we may calculate the SUM of each employee and then use a criterion in HAVING clause.

Query:

SQL GROUP BY HAVING

Did you know? The HAVING clause can only be used with the GROUP BY. Also, you may use an aggregate function in the HAVING clause. If you try using an aggregate function like SUM in the WHERE clause, it produces an error.

The example of using ORDER BY clause with GROUP BY

The ORDER BY clause is used to sort the results in ascending or descending order. You may also sort the results fetched after using the SELECT with GROUP BY clause.

The ORDER BY clause is used after the GROUP BY as shown in the query below:

The query:

SQL GROUP BY ORDER

The example of using COUNT and GROUP BY clause

The above examples show using the SUM function with GROUP BY clause. As mentioned earlier, you may use any aggregate function with GROUP BY including the COUNT function.

The following query shows using the COUNT for getting the number of salaries paid to each employee in our example table. Also, I used the ORDER BY clause with DESC keyword in the same query:

Query:

SQL GROUP BY COUNT

You may learn more about the SQL COUNT function in its tutorial.

Using MAX function with GROUP BY example

Let us extend the above example scenario while using the MAX function with GROUP BY clause. The query will return the total number of salaries for each employee along with maximum amount paid.

The MAX function is used in the same query; so you may use multiple aggregate functions as well. Also, for demo only, I used ORDER BY clause with two aggregate columns; one with default ASC order and the other with DESC order:

The resultant record-set:

SQL GROUP BY MAX

Using MIN function example

Just a little variation in the above query and this example shows using the MIN function with GROUP BY clause:

The Query:

SQL GROUP BY MIN

The example of using two columns in GROUP BY

For explaining more than one column usage in the GROUP BY clause, I am going to use the Sales table. The sto_sales table stores the sales based on countries and regions (for the demo only).

In the SELECT – GROUP BY statement, we will get the sum of sales on the basis of:

  • Country
  • Regions

The query should sum the record where it finds the same Country and Regions, unlike the above examples where data were grouped on the basis of the single column. Have a look:

The two columns query:

The output:

SQL GROUP BY Multi-Columns

The example of using GROUP BY ROLLUP

The following example shows using the GROUP BY ROLLUP where two columns are specified. By looking at the complete data and records returned by GROUP BY ROLLUP, you will understand how it works:

The Query:

SQL GROUP BY ROLLUP

Using GROUP BY CUBE example

The GROUP BY CUBE is used to create groups of all possible combination of specified columns. This is what official statement says:

“For GROUP BY CUBE (a, b) the results has groups for unique values of (a, b), (NULL, b), (a, NULL), and (NULL, NULL).”

Have a look at the result-set returned after using it in our demo sales table:

Query:

The output:

SQL GROUP BY CUBE