In SQL, what’s the difference between the having clause and the group by statement?
In SQL, the having clause and the group by statement work together when using aggregate functions like SUM, AVG, MAX, etc. This is best illustrated by an example. Suppose we have a table called emp_bonus as shown below. Note that the table has multiple entries for employees A and B – which means that both employees A and B have received multiple bonuses.
If we want to calculate the total bonus amount that each employee has received, then we would write a SQL statement like this:
The Group By Clause
In the SQL statement above, you can see that we use the "group by" clause with the employee column. The group by clause allows us to find the sum of the bonuses for each employee – because each employee is treated as his or her very own group. Using the ‘group by’ in combination with the ‘sum(bonus)’ statement will give us the sum of all the bonuses for employees A, B, and C.
Running the SQL above would return this:
Now, suppose we wanted to find the employees who received more than $1,000 in bonuses for the year of 2012 – this is assuming of course that the emp_bonus table contains bonuses only for the year of 2012. This is when we need to use the HAVING clause to add the additional check to see if the sum of bonuses is greater than $1,000, and this is what the SQL look like:
And the result of running the SQL above would be this:
Difference between having clause and group by statement
So, from the example above, we can see that the group by clause is used to group column(s) so that aggregates (like SUM, MAX, etc) can be used to find the necessary information. The having clause is used with the group by clause when comparisons need to be made with those aggregate functions – like to see if the SUM is greater than 1,000, as in our example above. So, the having clause and group by statements are not really alternatives to each other – but they are used alongside one another!