SQL Aggregations & GROUP BY

If you’ve ever needed to count, sum, or average things in SQL and felt like you were missing a piece, GROUP BY is what you were looking for. Aggregations are what turn a table full of raw transactions into the “daily revenue” number your boss actually cares about — and once you understand the GROUP BY / HAVING split, everything else clicks into place.

Aggregate Functions: The Five Essential

Function	Purpose	Example
`COUNT()`	Number of rows	How many orders?
`SUM()`	Add up a column	Total revenue?
`AVG()`	Average value	Average order size?
`MAX()`	Largest value	Highest price?
`MIN()`	Smallest value	Lowest price?

COUNT: Counting Rows

Count All Rows

SELECT COUNT(*) as total_users
FROM users;

Result: 5000 (5000 rows in users table)

Count Non-NULL Values

SELECT COUNT(phone) as users_with_phone
FROM users;

Result: 4200 (4200 users have a phone number, 800 are NULL)

Count Distinct Values

SELECT COUNT(DISTINCT country) as unique_countries
FROM users;

Result: 42 (users are from 42 different countries)

Multiple Counts

SELECT 
    COUNT(*) as total_orders,
    COUNT(customer_id) as orders_with_customer,
    COUNT(DISTINCT customer_id) as unique_customers
FROM orders;

SUM, AVG, MAX, MIN

Simple Aggregation

SELECT 
    SUM(total) as total_revenue,
    AVG(total) as avg_order_size,
    MAX(total) as largest_order,
    MIN(total) as smallest_order
FROM orders;

Result:

total_revenue	avg_order_size	largest_order	smallest_order
5,000,000	125.00	9,999.99	0.01

Arithmetic with Aggregates

SELECT 
    COUNT(*) as num_orders,
    AVG(total) as avg_order,
    AVG(total) * 1.1 as avg_order_plus_10_percent
FROM orders;

GROUP BY: Summarizing by Category

Basic GROUP BY

SELECT category, COUNT(*) as product_count
FROM products
GROUP BY category;

Result:

category	product_count
electronics	1200
books	800
clothing	950

How it works:

Divide all rows into groups by category
For each group, count the rows
Return one row per category

GROUP BY Multiple Columns

SELECT 
    category,
    supplier_id,
    COUNT(*) as product_count,
    AVG(price) as avg_price
FROM products
GROUP BY category, supplier_id;

Result:

category	supplier_id	product_count	avg_price
electronics	1	300	450.00
electronics	2	150	399.99
books	3	200	25.00

HAVING: Filtering Groups

WHERE filters individual rows. HAVING filters groups.

Basic HAVING

SELECT 
    category,
    COUNT(*) as product_count
FROM products
GROUP BY category
HAVING COUNT(*) > 100;

Result: Only categories with more than 100 products

category	product_count
electronics	1200
clothing	950

Key difference:

-- WHERE filters rows BEFORE grouping
WHERE price > 100
 
-- HAVING filters groups AFTER grouping
HAVING COUNT(*) > 5

Multiple HAVING Conditions

SELECT 
    customer_id,
    COUNT(*) as order_count,
    SUM(total) as lifetime_value
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5 
   AND SUM(total) > 1000;

Result: High-value customers (more than 5 orders, total > $1000)

Common Aggregation Patterns

Daily/Weekly/Monthly Totals

SELECT 
    DATE(order_date) as date,
    COUNT(*) as order_count,
    SUM(total) as daily_revenue
FROM orders
GROUP BY DATE(order_date)
ORDER BY date;

Result:

date	order_count	daily_revenue
2026-01-20	145	18,750.00
2026-01-21	152	19,200.00
2026-01-22	138	17,625.00

Top N Analysis

SELECT 
    product_id,
    product_name,
    SUM(quantity) as total_sold,
    SUM(quantity * price) as revenue
FROM order_items
GROUP BY product_id, product_name
ORDER BY revenue DESC
LIMIT 10;

Result: Top 10 products by revenue

Customer Cohorts

SELECT 
    EXTRACT(YEAR FROM created_at) as signup_year,
    EXTRACT(MONTH FROM created_at) as signup_month,
    COUNT(*) as new_users,
    SUM(lifetime_value) as cohort_value
FROM users
GROUP BY signup_year, signup_month
ORDER BY signup_year, signup_month;

Data Quality Checks

-- Find categories with suspiciously low average prices
SELECT 
    category,
    COUNT(*) as product_count,
    AVG(price) as avg_price,
    MIN(price) as min_price,
    MAX(price) as max_price
FROM products
GROUP BY category
HAVING AVG(price) < 10 OR MAX(price) < 100;

Subqueries with GROUP BY

Filter Groups in Subquery

SELECT * FROM (
    SELECT 
        user_id,
        COUNT(*) as order_count,
        SUM(total) as total_spent
    FROM orders
    GROUP BY user_id
) user_stats
WHERE order_count > 10;

Result: Users with more than 10 orders

Rank Groups

WITH category_stats AS (
    SELECT 
        category,
        COUNT(*) as product_count,
        AVG(price) as avg_price
    FROM products
    GROUP BY category
)
SELECT 
    category,
    product_count,
    avg_price,
    ROW_NUMBER() OVER (ORDER BY product_count DESC) as rank
FROM category_stats;

NULL Handling in Aggregates

SELECT 
    COUNT(*) as total_rows,
    COUNT(phone) as rows_with_phone,
    COUNT(email) as rows_with_email,
    AVG(age) as avg_age  -- NULLs are ignored
FROM users;

Key point: COUNT(*) counts all rows (including NULLs). COUNT(column) counts non-NULL values only.

Performance Tips

Pattern	Performance	Why
GROUP BY indexed column	🟢 Fast	Index helps
GROUP BY non-indexed	⚠️ Medium	Full scan, then sort
Many groups (1000+)	🔴 Slow	Large result set
HAVING before GROUP BY	🔴 Wrong	HAVING is for post-group

Common Gotchas

Can’t use non-aggregated columns in SELECT without GROUP BY. This is an error:

-- ❌ Error: category is not aggregated
SELECT category, product_name, COUNT(*) as count
FROM products
GROUP BY category;
 
-- ✅ Correct: all non-aggregated columns in GROUP BY
SELECT category, COUNT(*) as count
FROM products
GROUP BY category;

HAVING vs WHERE. You must use HAVING for aggregate conditions.

-- ❌ Error: WHERE doesn't work with aggregates
SELECT category, COUNT(*) as count
FROM products
WHERE COUNT(*) > 5
GROUP BY category;
 
-- ✅ Correct: use HAVING
SELECT category, COUNT(*) as count
FROM products
GROUP BY category
HAVING COUNT(*) > 5;

GROUP BY with ORDER BY. Order after grouping.

SELECT 
    category,
    COUNT(*) as count
FROM products
GROUP BY category
ORDER BY count DESC;  -- Orders the grouped results

SQL-Fundamentals — SELECT, WHERE, ORDER BY
SQL-Joins — GROUP BY with JOINs
SQL-Window-Functions — Advanced aggregations (running totals)
PostgreSQL-for-Data-Engineering — Production aggregation patterns
SQL-Learning-Roadmap — Your complete SQL learning path

Key Takeaway:
GROUP BY divides data into buckets. Aggregate functions (COUNT, SUM, AVG, MAX, MIN) summarize each bucket. HAVING filters the results. Use this for reporting, analysis, and data quality checks.

Explorer

Ben Warai Otoko

SQL Aggregations & GROUP BY

Table of Contents

Aggregate Functions: The Five Essential

COUNT: Counting Rows

Count All Rows

Count Non-NULL Values

Count Distinct Values

Multiple Counts

SUM, AVG, MAX, MIN

Simple Aggregation

Arithmetic with Aggregates

GROUP BY: Summarizing by Category

Basic GROUP BY

GROUP BY Multiple Columns

HAVING: Filtering Groups

Basic HAVING

Multiple HAVING Conditions

Common Aggregation Patterns

Daily/Weekly/Monthly Totals

Top N Analysis

Customer Cohorts

Data Quality Checks

Subqueries with GROUP BY

Filter Groups in Subquery

Rank Groups

NULL Handling in Aggregates

Performance Tips

Common Gotchas

Graph View

Backlinks

Recent Notes

Second Brain

Projects

Blog Posts

Learning Logs

Explorer

SQL Aggregations & GROUP BY

Table of Contents

Aggregate Functions: The Five Essential

COUNT: Counting Rows

Count All Rows

Count Non-NULL Values

Count Distinct Values

Multiple Counts

SUM, AVG, MAX, MIN

Simple Aggregation

Arithmetic with Aggregates

GROUP BY: Summarizing by Category

Basic GROUP BY

GROUP BY Multiple Columns

HAVING: Filtering Groups

Basic HAVING

Multiple HAVING Conditions

Common Aggregation Patterns

Daily/Weekly/Monthly Totals

Top N Analysis

Customer Cohorts

Data Quality Checks

Subqueries with GROUP BY

Filter Groups in Subquery

Rank Groups

NULL Handling in Aggregates

Performance Tips

Common Gotchas

Related

Graph View

Backlinks

Recent Notes

Second Brain

Projects

Blog Posts

Learning Logs