The GROUP BY clause in SQL organizes rows into groups based on specified columns and applies aggregate functions to each group. When you use GROUP BY, the database engine sorts the result set into subsets where the values in the listed columns match. Aggregate functions like COUNT(), SUM(), AVG(), MAX(), or MIN() then compute a single value for each group. For example, if you group sales data by a “department” column, you can calculate the total sales per department. The output will contain one row per unique department, alongside the aggregated results for that group.
To illustrate, consider a table named Sales with columns Department, Employee, and SaleAmount. If you run SELECT Department, SUM(SaleAmount) FROM Sales GROUP BY Department
, the query will sum all SaleAmount values for rows sharing the same Department. Each department becomes a group, and the sum is computed within each group. You can also group by multiple columns, such as GROUP BY Department, Year
, to create subgroups (e.g., total sales per department per year). Importantly, every column in the SELECT clause that isn’t part of an aggregate function must be included in the GROUP BY clause. For instance, SELECT Department, Employee, SUM(SaleAmount)
would require GROUP BY Department, Employee
to avoid errors.
GROUP BY is commonly used for reporting or summarizing data, like calculating monthly revenue or counting user activity by region. A key consideration is filtering: the WHERE clause filters rows before grouping, while HAVING filters groups after aggregation. For example, HAVING SUM(SaleAmount) > 1000
returns only departments with total sales exceeding 1000. A common mistake is omitting non-aggregated columns from the GROUP BY clause, which causes errors in strict SQL modes. Additionally, grouping on columns with high cardinality (e.g., unique IDs) can lead to inefficient queries, as it creates many small groups. Proper indexing and thoughtful column selection help optimize performance.
Zilliz Cloud is a managed vector database built on Milvus perfect for building GenAI applications.
Try FreeLike the article? Spread the word