SQL aggregation is the task of collecting a set of values to return a single value. It is done with the help of aggregate functions, such as SUM, COUNT, and AVG. For example, in a database of products, you might want to calculate the average price of the whole inventory.
How is it used?
Aggregation in SQL is, typically, used in conjunction with grouping. The Group By clause is used to arrange rows into groups in SQL. Aggregation, together with grouping, is key to generating quick reports and insights from a database. For example, an ecommerce company might want to see its highest spending customers over a given time period.
Syntax for Aggregation and Grouping
The syntax for GROUP BY clause is:
GROUP BY ColumnName1, ColumnName2;
Here, ColumnName is the name(s) of the column(s) you want to apply the Group By Clause to.
The syntax for aggregation in SQL is:
AggregateFunctionName(DISTINCT or ALL GroupName)
Here, AggregateFunctionName is the name of the aggregate function you want to apply, such as SUM, AVG etc. Within parentheses, you specify if you want to apply the function to a select group of values, or to all of them. If you don't specify anything, SQL considers ALL as default. GroupName is the name of the group you want to apply the aggregate function to.
Example of GROUP BY and Aggregate Functions
Typically, the two functions are used together to summarise a database. Let's consider a simple example of database of voters. The database is called Voter_List. For each voter, Voter_List has the following columns:
- Voter_ID
- City_Name
- State_Name
- Language_Spoken
For our example, let us count the number of voters in each city. To do so, we will run the following SQL command:
SELECT City_Name
COUNT (Voter_ID) Voter_Count
FROM Voter_List
GROUP BY City_Name
ORDER BY City_Name;
The command will first group unique voter IDs for each city, and then count the IDs to give us the number of voters in each city.
Commonly Used SQL Aggregate Functions
The following are some of the most commonly used SQL aggregate functions:
- AVG: This calculates the average of all values in a group.
- MIN: It returns the lowest value in a group.
- MAX: MAX aggregate function in SQL returns the largest value in a group
- COUNT: It is used to count the number of rows in a set. the COUNT function includes rows with NULL values.
- SUM: This is used to calculate the sum of all non-NULL values in a group
It is important to note here that all aggregate functions ignore NULL values in a set or group, except the COUNT function.
Use of HAVING Clause
In a lot of cases, we might not want to apply the GROUP BY clause on the entire data. For instance, we might want to calculate the average order value for a particular customer. In such cases, the GROUP BY command is used together with the conditional HAVING clause.
The syntax for HAVING Clause is:
HAVING ColumnName = Value
where ColumnName is the name of the column that the GROUP BY function is applied on, and Value is the particular value you want the results for. In our example of a customer database, where each customer has a unique numerical ID, Value will be the ID of the customer you want to calculate the average order value for.