Window Functions
Window functions compute values across a set of rows related to the current row, without collapsing them into a single output. Examples use the sample data.
Syntax
function_name(args) OVER (
[PARTITION BY partition_expr]
[ORDER BY order_expr]
[frame_clause]
)- PARTITION BY - Divide rows into groups
- ORDER BY - Define order within each partition
- Frame clause - Specify which rows to include
ROW_NUMBER
Assign sequential numbers to rows within each partition.
Syntax:
ROW_NUMBER() OVER (ORDER BY ...)- Sequential number starting at 1ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...)- Reset for each partition
Example:
SELECT
order_id,
customer_id,
amount,
ROW_NUMBER() OVER (ORDER BY amount DESC) as overall_rank,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) as customer_rank
FROM orders| order_id | customer_id | amount | overall_rank | customer_rank |
|---|---|---|---|---|
| ord_8 | c5 | 450.00 | 1 | 1 |
| ord_5 | c2 | 300.00 | 2 | 1 |
| ord_3 | c1 | 200.00 | 3 | 1 |
| ord_1 | c1 | 150.00 | 4 | 2 |
| ord_6 | c4 | 125.00 | 5 | 1 |
| ord_7 | c1 | 89.99 | 6 | 3 |
| ord_2 | c2 | 75.50 | 7 | 2 |
| ord_4 | c3 | 50.00 | 8 | 1 |
RANK
Assign rank with gaps for ties. Rows with equal values get the same rank.
Syntax:
RANK() OVER (ORDER BY ...)- Rank with gaps
Example:
SELECT
order_id,
region,
amount,
RANK() OVER (ORDER BY region) as region_rank
FROM orders| order_id | region | amount | region_rank |
|---|---|---|---|
| ord_4 | eu-west | 50.00 | 1 |
| ord_8 | eu-west | 450.00 | 1 |
| ord_1 | us-east | 150.00 | 3 |
| ord_3 | us-east | 200.00 | 3 |
| ord_6 | us-east | 125.00 | 3 |
| ord_7 | us-east | 89.99 | 3 |
| ord_2 | us-west | 75.50 | 7 |
| ord_5 | us-west | 300.00 | 7 |
Note: Rank jumps from 1 to 3 (skips 2) because there are two eu-west orders tied at rank 1.
DENSE_RANK
Assign rank without gaps for ties.
Syntax:
DENSE_RANK() OVER (ORDER BY ...)- Rank without gaps
Example:
SELECT
order_id,
region,
amount,
RANK() OVER (ORDER BY region) as rank,
DENSE_RANK() OVER (ORDER BY region) as dense_rank
FROM orders| order_id | region | amount | rank | dense_rank |
|---|---|---|---|---|
| ord_4 | eu-west | 50.00 | 1 | 1 |
| ord_8 | eu-west | 450.00 | 1 | 1 |
| ord_1 | us-east | 150.00 | 3 | 2 |
| ord_3 | us-east | 200.00 | 3 | 2 |
| ord_6 | us-east | 125.00 | 3 | 2 |
| ord_7 | us-east | 89.99 | 3 | 2 |
| ord_2 | us-west | 75.50 | 7 | 3 |
| ord_5 | us-west | 300.00 | 7 | 3 |
Note: DENSE_RANK goes 1, 2, 3 (no gaps) while RANK goes 1, 3, 7 (has gaps).
NTILE
Divide rows into N equal buckets.
Syntax:
NTILE(n) OVER (ORDER BY ...)- Divide into n buckets
Example:
SELECT
order_id,
amount,
NTILE(4) OVER (ORDER BY amount DESC) as quartile
FROM orders| order_id | amount | quartile |
|---|---|---|
| ord_8 | 450.00 | 1 |
| ord_5 | 300.00 | 1 |
| ord_3 | 200.00 | 2 |
| ord_1 | 150.00 | 2 |
| ord_6 | 125.00 | 3 |
| ord_7 | 89.99 | 3 |
| ord_2 | 75.50 | 4 |
| ord_4 | 50.00 | 4 |
Quartile 1 = top 25% (highest amounts), Quartile 4 = bottom 25%.
LAG
Access a previous row's value.
Syntax:
LAG(expression) OVER (ORDER BY ...)- Previous row's valueLAG(expression, offset) OVER (...)- Value from n rows backLAG(expression, offset, default) OVER (...)- With default for first rows
Example:
SELECT
order_id,
customer_id,
timestamp,
amount,
LAG(amount) OVER (PARTITION BY customer_id ORDER BY timestamp) as prev_amount,
amount - LAG(amount) OVER (PARTITION BY customer_id ORDER BY timestamp) as amount_change
FROM orders
WHERE customer_id IN ('c1', 'c2')| order_id | customer_id | timestamp | amount | prev_amount | amount_change |
|---|---|---|---|---|---|
| ord_1 | c1 | 10:00:15 | 150.00 | NULL | NULL |
| ord_3 | c1 | 10:01:10 | 200.00 | 150.00 | 50.00 |
| ord_7 | c1 | 10:03:30 | 89.99 | 200.00 | -110.01 |
| ord_2 | c2 | 10:00:35 | 75.50 | NULL | NULL |
| ord_5 | c2 | 10:02:20 | 300.00 | 75.50 | 224.50 |
LEAD
Access a following row's value.
Syntax:
LEAD(expression) OVER (ORDER BY ...)- Next row's valueLEAD(expression, offset) OVER (...)- Value from n rows aheadLEAD(expression, offset, default) OVER (...)- With default for last rows
Example:
SELECT
event_id,
user_id,
timestamp,
page,
LEAD(page) OVER (PARTITION BY user_id ORDER BY timestamp) as next_page,
LEAD(timestamp) OVER (PARTITION BY user_id ORDER BY timestamp) - timestamp as time_to_next
FROM events
WHERE user_id = 'u1'| event_id | user_id | timestamp | page | next_page | time_to_next |
|---|---|---|---|---|---|
| e1 | u1 | 10:00:05 | /home | /products | 7 seconds |
| e2 | u1 | 10:00:12 | /products | /products/1 | 13 seconds |
| e4 | u1 | 10:00:25 | /products/1 | /checkout | 20 seconds |
| e6 | u1 | 10:00:45 | /checkout | NULL | NULL |
FIRST_VALUE
Get the first value in the window frame.
Syntax:
FIRST_VALUE(expression) OVER (PARTITION BY ... ORDER BY ...)- First value
Example:
SELECT
event_id,
user_id,
page,
timestamp,
FIRST_VALUE(page) OVER (
PARTITION BY user_id
ORDER BY timestamp
) as landing_page
FROM events| event_id | user_id | page | timestamp | landing_page |
|---|---|---|---|---|
| e1 | u1 | /home | 10:00:05 | /home |
| e2 | u1 | /products | 10:00:12 | /home |
| e4 | u1 | /products/1 | 10:00:25 | /home |
| e6 | u1 | /checkout | 10:00:45 | /home |
| e3 | u2 | /home | 10:00:18 | /home |
| e5 | u2 | /about | 10:00:32 | /home |
| e8 | u2 | /products | 10:01:15 | /home |
| e7 | u3 | /home | 10:01:00 | /home |
LAST_VALUE
Get the last value in the window frame.
Syntax:
LAST_VALUE(expression) OVER (... ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)- Last value
Note: Requires explicit frame to include all rows.
Example:
SELECT
event_id,
user_id,
page,
timestamp,
LAST_VALUE(page) OVER (
PARTITION BY user_id
ORDER BY timestamp
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) as exit_page
FROM events| event_id | user_id | page | timestamp | exit_page |
|---|---|---|---|---|
| e1 | u1 | /home | 10:00:05 | /checkout |
| e2 | u1 | /products | 10:00:12 | /checkout |
| e4 | u1 | /products/1 | 10:00:25 | /checkout |
| e6 | u1 | /checkout | 10:00:45 | /checkout |
| e3 | u2 | /home | 10:00:18 | /products |
| e5 | u2 | /about | 10:00:32 | /products |
| e8 | u2 | /products | 10:01:15 | /products |
| e7 | u3 | /home | 10:01:00 | /home |
NTH_VALUE
Get the Nth value in the window frame.
Syntax:
NTH_VALUE(expression, n) OVER (...)- Value at position n (1-indexed)
Example:
SELECT
order_id,
region,
amount,
NTH_VALUE(order_id, 1) OVER (
PARTITION BY region
ORDER BY amount DESC
) as top_order,
NTH_VALUE(order_id, 2) OVER (
PARTITION BY region
ORDER BY amount DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) as second_order
FROM orders| order_id | region | amount | top_order | second_order |
|---|---|---|---|---|
| ord_8 | eu-west | 450.00 | ord_8 | ord_4 |
| ord_4 | eu-west | 50.00 | ord_8 | ord_4 |
| ord_3 | us-east | 200.00 | ord_3 | ord_1 |
| ord_1 | us-east | 150.00 | ord_3 | ord_1 |
| ord_6 | us-east | 125.00 | ord_3 | ord_1 |
| ord_7 | us-east | 89.99 | ord_3 | ord_1 |
| ord_5 | us-west | 300.00 | ord_5 | ord_2 |
| ord_2 | us-west | 75.50 | ord_5 | ord_2 |
PERCENT_RANK
Calculate relative rank as a percentage (0 to 1).
Syntax:
PERCENT_RANK() OVER (ORDER BY ...)- (rank - 1) / (total rows - 1)
Example:
SELECT
order_id,
amount,
PERCENT_RANK() OVER (ORDER BY amount) as percentile
FROM orders| order_id | amount | percentile |
|---|---|---|
| ord_4 | 50.00 | 0.00 |
| ord_2 | 75.50 | 0.14 |
| ord_7 | 89.99 | 0.29 |
| ord_6 | 125.00 | 0.43 |
| ord_1 | 150.00 | 0.57 |
| ord_3 | 200.00 | 0.71 |
| ord_5 | 300.00 | 0.86 |
| ord_8 | 450.00 | 1.00 |
First row = 0.00, last row = 1.00.
CUME_DIST
Cumulative distribution: fraction of rows ≤ current row.
Syntax:
CUME_DIST() OVER (ORDER BY ...)- Cumulative distribution (0 to 1)
Example:
SELECT
order_id,
amount,
CUME_DIST() OVER (ORDER BY amount) as cumulative_dist
FROM orders| order_id | amount | cumulative_dist |
|---|---|---|
| ord_4 | 50.00 | 0.125 |
| ord_2 | 75.50 | 0.250 |
| ord_7 | 89.99 | 0.375 |
| ord_6 | 125.00 | 0.500 |
| ord_1 | 150.00 | 0.625 |
| ord_3 | 200.00 | 0.750 |
| ord_5 | 300.00 | 0.875 |
| ord_8 | 450.00 | 1.000 |
12.5% of orders are ≤ $50, 50% are ≤ $125, etc.
Aggregates as Window Functions
Any aggregate function can be used as a window function.
Example:
SELECT
order_id,
customer_id,
timestamp,
amount,
SUM(amount) OVER (ORDER BY timestamp) as running_total,
AVG(amount) OVER (ORDER BY timestamp) as running_avg,
COUNT(*) OVER (ORDER BY timestamp) as order_count
FROM orders| order_id | customer_id | timestamp | amount | running_total | running_avg | order_count |
|---|---|---|---|---|---|---|
| ord_1 | c1 | 10:00:15 | 150.00 | 150.00 | 150.00 | 1 |
| ord_2 | c2 | 10:00:35 | 75.50 | 225.50 | 112.75 | 2 |
| ord_3 | c1 | 10:01:10 | 200.00 | 425.50 | 141.83 | 3 |
| ord_4 | c3 | 10:01:45 | 50.00 | 475.50 | 118.88 | 4 |
| ord_5 | c2 | 10:02:20 | 300.00 | 775.50 | 155.10 | 5 |
| ord_6 | c4 | 10:02:55 | 125.00 | 900.50 | 150.08 | 6 |
| ord_7 | c1 | 10:03:30 | 89.99 | 990.49 | 141.50 | 7 |
| ord_8 | c5 | 10:04:05 | 450.00 | 1440.49 | 180.06 | 8 |
Frame Specifications
Control which rows are included in the window frame.
Frame Bounds
| Bound | Meaning |
|---|---|
UNBOUNDED PRECEDING | First row of partition |
n PRECEDING | n rows before current |
CURRENT ROW | Current row |
n FOLLOWING | n rows after current |
UNBOUNDED FOLLOWING | Last row of partition |
ROWS BETWEEN
Example: Rolling 3-row average
SELECT
order_id,
timestamp,
amount,
AVG(amount) OVER (
ORDER BY timestamp
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) as rolling_3_avg
FROM orders| order_id | timestamp | amount | rolling_3_avg |
|---|---|---|---|
| ord_1 | 10:00:15 | 150.00 | 150.00 |
| ord_2 | 10:00:35 | 75.50 | 112.75 |
| ord_3 | 10:01:10 | 200.00 | 141.83 |
| ord_4 | 10:01:45 | 50.00 | 108.50 |
| ord_5 | 10:02:20 | 300.00 | 183.33 |
| ord_6 | 10:02:55 | 125.00 | 158.33 |
| ord_7 | 10:03:30 | 89.99 | 171.66 |
| ord_8 | 10:04:05 | 450.00 | 221.66 |
Patterns
Top N per Group
WITH ranked AS (
SELECT
region,
order_id,
amount,
ROW_NUMBER() OVER (
PARTITION BY region
ORDER BY amount DESC
) as rank
FROM orders
)
SELECT region, order_id, amount
FROM ranked
WHERE rank <= 2| region | order_id | amount |
|---|---|---|
| eu-west | ord_8 | 450.00 |
| eu-west | ord_4 | 50.00 |
| us-east | ord_3 | 200.00 |
| us-east | ord_1 | 150.00 |
| us-west | ord_5 | 300.00 |
| us-west | ord_2 | 75.50 |
Percent of Total
SELECT
order_id,
region,
amount,
amount * 100.0 / SUM(amount) OVER () as pct_of_total,
amount * 100.0 / SUM(amount) OVER (PARTITION BY region) as pct_of_region
FROM orders| order_id | region | amount | pct_of_total | pct_of_region |
|---|---|---|---|---|
| ord_1 | us-east | 150.00 | 10.41 | 26.55 |
| ord_3 | us-east | 200.00 | 13.88 | 35.40 |
| ord_6 | us-east | 125.00 | 8.68 | 22.12 |
| ord_7 | us-east | 89.99 | 6.25 | 15.93 |
| ord_2 | us-west | 75.50 | 5.24 | 20.11 |
| ord_5 | us-west | 300.00 | 20.83 | 79.89 |
| ord_4 | eu-west | 50.00 | 3.47 | 10.00 |
| ord_8 | eu-west | 450.00 | 31.24 | 90.00 |
Change Detection
SELECT
metric_id,
name,
timestamp,
value,
value - LAG(value) OVER (
PARTITION BY name
ORDER BY timestamp
) as value_change,
CASE
WHEN value > LAG(value) OVER (PARTITION BY name ORDER BY timestamp) * 1.1
THEN 'spike'
WHEN value < LAG(value) OVER (PARTITION BY name ORDER BY timestamp) * 0.9
THEN 'drop'
ELSE 'normal'
END as status
FROM metrics
WHERE name = 'cpu_usage'| metric_id | name | timestamp | value | value_change | status |
|---|---|---|---|---|---|
| m1 | cpu_usage | 10:00:00 | 45.5 | NULL | normal |
| m2 | cpu_usage | 10:01:00 | 62.3 | 16.8 | spike |
| m6 | cpu_usage | 10:02:00 | 71.2 | 8.9 | spike |
Gap Detection
SELECT
event_id,
user_id,
timestamp,
LAG(timestamp) OVER (PARTITION BY user_id ORDER BY timestamp) as prev_time,
timestamp - LAG(timestamp) OVER (PARTITION BY user_id ORDER BY timestamp) as gap,
CASE
WHEN timestamp - LAG(timestamp) OVER (PARTITION BY user_id ORDER BY timestamp) > INTERVAL '20 seconds'
THEN 'gap_detected'
ELSE 'normal'
END as status
FROM events
WHERE user_id = 'u1'| event_id | user_id | timestamp | prev_time | gap | status |
|---|---|---|---|---|---|
| e1 | u1 | 10:00:05 | NULL | NULL | normal |
| e2 | u1 | 10:00:12 | 10:00:05 | 7 sec | normal |
| e4 | u1 | 10:00:25 | 10:00:12 | 13 sec | normal |
| e6 | u1 | 10:00:45 | 10:00:25 | 20 sec | normal |
Running Totals per Customer
SELECT
order_id,
customer_id,
timestamp,
amount,
SUM(amount) OVER (
PARTITION BY customer_id
ORDER BY timestamp
) as customer_running_total
FROM orders
WHERE customer_id IN ('c1', 'c2')| order_id | customer_id | timestamp | amount | customer_running_total |
|---|---|---|---|---|
| ord_1 | c1 | 10:00:15 | 150.00 | 150.00 |
| ord_3 | c1 | 10:01:10 | 200.00 | 350.00 |
| ord_7 | c1 | 10:03:30 | 89.99 | 439.99 |
| ord_2 | c2 | 10:00:35 | 75.50 | 75.50 |
| ord_5 | c2 | 10:02:20 | 300.00 | 375.50 |