Work in Progress: This page is under development. Use the feedback button on the bottom right to help us improve it.

Window Functions

Window functions compute values across a set of rows related to the current row, without collapsing them into a single output. Examples use the sample data.


Syntax

function_name(args) OVER (
  [PARTITION BY partition_expr]
  [ORDER BY order_expr]
  [frame_clause]
)
  • PARTITION BY - Divide rows into groups
  • ORDER BY - Define order within each partition
  • Frame clause - Specify which rows to include

ROW_NUMBER

Assign sequential numbers to rows within each partition.

Syntax:

  • ROW_NUMBER() OVER (ORDER BY ...) - Sequential number starting at 1
  • ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) - Reset for each partition

Example:

SELECT
  order_id,
  customer_id,
  amount,
  ROW_NUMBER() OVER (ORDER BY amount DESC) as overall_rank,
  ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) as customer_rank
FROM orders
order_idcustomer_idamountoverall_rankcustomer_rank
ord_8c5450.0011
ord_5c2300.0021
ord_3c1200.0031
ord_1c1150.0042
ord_6c4125.0051
ord_7c189.9963
ord_2c275.5072
ord_4c350.0081

RANK

Assign rank with gaps for ties. Rows with equal values get the same rank.

Syntax:

  • RANK() OVER (ORDER BY ...) - Rank with gaps

Example:

SELECT
  order_id,
  region,
  amount,
  RANK() OVER (ORDER BY region) as region_rank
FROM orders
order_idregionamountregion_rank
ord_4eu-west50.001
ord_8eu-west450.001
ord_1us-east150.003
ord_3us-east200.003
ord_6us-east125.003
ord_7us-east89.993
ord_2us-west75.507
ord_5us-west300.007

Note: Rank jumps from 1 to 3 (skips 2) because there are two eu-west orders tied at rank 1.


DENSE_RANK

Assign rank without gaps for ties.

Syntax:

  • DENSE_RANK() OVER (ORDER BY ...) - Rank without gaps

Example:

SELECT
  order_id,
  region,
  amount,
  RANK() OVER (ORDER BY region) as rank,
  DENSE_RANK() OVER (ORDER BY region) as dense_rank
FROM orders
order_idregionamountrankdense_rank
ord_4eu-west50.0011
ord_8eu-west450.0011
ord_1us-east150.0032
ord_3us-east200.0032
ord_6us-east125.0032
ord_7us-east89.9932
ord_2us-west75.5073
ord_5us-west300.0073

Note: DENSE_RANK goes 1, 2, 3 (no gaps) while RANK goes 1, 3, 7 (has gaps).


NTILE

Divide rows into N equal buckets.

Syntax:

  • NTILE(n) OVER (ORDER BY ...) - Divide into n buckets

Example:

SELECT
  order_id,
  amount,
  NTILE(4) OVER (ORDER BY amount DESC) as quartile
FROM orders
order_idamountquartile
ord_8450.001
ord_5300.001
ord_3200.002
ord_1150.002
ord_6125.003
ord_789.993
ord_275.504
ord_450.004

Quartile 1 = top 25% (highest amounts), Quartile 4 = bottom 25%.


LAG

Access a previous row's value.

Syntax:

  • LAG(expression) OVER (ORDER BY ...) - Previous row's value
  • LAG(expression, offset) OVER (...) - Value from n rows back
  • LAG(expression, offset, default) OVER (...) - With default for first rows

Example:

SELECT
  order_id,
  customer_id,
  timestamp,
  amount,
  LAG(amount) OVER (PARTITION BY customer_id ORDER BY timestamp) as prev_amount,
  amount - LAG(amount) OVER (PARTITION BY customer_id ORDER BY timestamp) as amount_change
FROM orders
WHERE customer_id IN ('c1', 'c2')
order_idcustomer_idtimestampamountprev_amountamount_change
ord_1c110:00:15150.00NULLNULL
ord_3c110:01:10200.00150.0050.00
ord_7c110:03:3089.99200.00-110.01
ord_2c210:00:3575.50NULLNULL
ord_5c210:02:20300.0075.50224.50

LEAD

Access a following row's value.

Syntax:

  • LEAD(expression) OVER (ORDER BY ...) - Next row's value
  • LEAD(expression, offset) OVER (...) - Value from n rows ahead
  • LEAD(expression, offset, default) OVER (...) - With default for last rows

Example:

SELECT
  event_id,
  user_id,
  timestamp,
  page,
  LEAD(page) OVER (PARTITION BY user_id ORDER BY timestamp) as next_page,
  LEAD(timestamp) OVER (PARTITION BY user_id ORDER BY timestamp) - timestamp as time_to_next
FROM events
WHERE user_id = 'u1'
event_iduser_idtimestamppagenext_pagetime_to_next
e1u110:00:05/home/products7 seconds
e2u110:00:12/products/products/113 seconds
e4u110:00:25/products/1/checkout20 seconds
e6u110:00:45/checkoutNULLNULL

FIRST_VALUE

Get the first value in the window frame.

Syntax:

  • FIRST_VALUE(expression) OVER (PARTITION BY ... ORDER BY ...) - First value

Example:

SELECT
  event_id,
  user_id,
  page,
  timestamp,
  FIRST_VALUE(page) OVER (
    PARTITION BY user_id
    ORDER BY timestamp
  ) as landing_page
FROM events
event_iduser_idpagetimestamplanding_page
e1u1/home10:00:05/home
e2u1/products10:00:12/home
e4u1/products/110:00:25/home
e6u1/checkout10:00:45/home
e3u2/home10:00:18/home
e5u2/about10:00:32/home
e8u2/products10:01:15/home
e7u3/home10:01:00/home

LAST_VALUE

Get the last value in the window frame.

Syntax:

  • LAST_VALUE(expression) OVER (... ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) - Last value

Note: Requires explicit frame to include all rows.

Example:

SELECT
  event_id,
  user_id,
  page,
  timestamp,
  LAST_VALUE(page) OVER (
    PARTITION BY user_id
    ORDER BY timestamp
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) as exit_page
FROM events
event_iduser_idpagetimestampexit_page
e1u1/home10:00:05/checkout
e2u1/products10:00:12/checkout
e4u1/products/110:00:25/checkout
e6u1/checkout10:00:45/checkout
e3u2/home10:00:18/products
e5u2/about10:00:32/products
e8u2/products10:01:15/products
e7u3/home10:01:00/home

NTH_VALUE

Get the Nth value in the window frame.

Syntax:

  • NTH_VALUE(expression, n) OVER (...) - Value at position n (1-indexed)

Example:

SELECT
  order_id,
  region,
  amount,
  NTH_VALUE(order_id, 1) OVER (
    PARTITION BY region
    ORDER BY amount DESC
  ) as top_order,
  NTH_VALUE(order_id, 2) OVER (
    PARTITION BY region
    ORDER BY amount DESC
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) as second_order
FROM orders
order_idregionamounttop_ordersecond_order
ord_8eu-west450.00ord_8ord_4
ord_4eu-west50.00ord_8ord_4
ord_3us-east200.00ord_3ord_1
ord_1us-east150.00ord_3ord_1
ord_6us-east125.00ord_3ord_1
ord_7us-east89.99ord_3ord_1
ord_5us-west300.00ord_5ord_2
ord_2us-west75.50ord_5ord_2

PERCENT_RANK

Calculate relative rank as a percentage (0 to 1).

Syntax:

  • PERCENT_RANK() OVER (ORDER BY ...) - (rank - 1) / (total rows - 1)

Example:

SELECT
  order_id,
  amount,
  PERCENT_RANK() OVER (ORDER BY amount) as percentile
FROM orders
order_idamountpercentile
ord_450.000.00
ord_275.500.14
ord_789.990.29
ord_6125.000.43
ord_1150.000.57
ord_3200.000.71
ord_5300.000.86
ord_8450.001.00

First row = 0.00, last row = 1.00.


CUME_DIST

Cumulative distribution: fraction of rows ≤ current row.

Syntax:

  • CUME_DIST() OVER (ORDER BY ...) - Cumulative distribution (0 to 1)

Example:

SELECT
  order_id,
  amount,
  CUME_DIST() OVER (ORDER BY amount) as cumulative_dist
FROM orders
order_idamountcumulative_dist
ord_450.000.125
ord_275.500.250
ord_789.990.375
ord_6125.000.500
ord_1150.000.625
ord_3200.000.750
ord_5300.000.875
ord_8450.001.000

12.5% of orders are ≤ $50, 50% are ≤ $125, etc.


Aggregates as Window Functions

Any aggregate function can be used as a window function.

Example:

SELECT
  order_id,
  customer_id,
  timestamp,
  amount,
  SUM(amount) OVER (ORDER BY timestamp) as running_total,
  AVG(amount) OVER (ORDER BY timestamp) as running_avg,
  COUNT(*) OVER (ORDER BY timestamp) as order_count
FROM orders
order_idcustomer_idtimestampamountrunning_totalrunning_avgorder_count
ord_1c110:00:15150.00150.00150.001
ord_2c210:00:3575.50225.50112.752
ord_3c110:01:10200.00425.50141.833
ord_4c310:01:4550.00475.50118.884
ord_5c210:02:20300.00775.50155.105
ord_6c410:02:55125.00900.50150.086
ord_7c110:03:3089.99990.49141.507
ord_8c510:04:05450.001440.49180.068

Frame Specifications

Control which rows are included in the window frame.

Frame Bounds

BoundMeaning
UNBOUNDED PRECEDINGFirst row of partition
n PRECEDINGn rows before current
CURRENT ROWCurrent row
n FOLLOWINGn rows after current
UNBOUNDED FOLLOWINGLast row of partition

ROWS BETWEEN

Example: Rolling 3-row average

SELECT
  order_id,
  timestamp,
  amount,
  AVG(amount) OVER (
    ORDER BY timestamp
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  ) as rolling_3_avg
FROM orders
order_idtimestampamountrolling_3_avg
ord_110:00:15150.00150.00
ord_210:00:3575.50112.75
ord_310:01:10200.00141.83
ord_410:01:4550.00108.50
ord_510:02:20300.00183.33
ord_610:02:55125.00158.33
ord_710:03:3089.99171.66
ord_810:04:05450.00221.66

Patterns

Top N per Group

WITH ranked AS (
  SELECT
    region,
    order_id,
    amount,
    ROW_NUMBER() OVER (
      PARTITION BY region
      ORDER BY amount DESC
    ) as rank
  FROM orders
)
SELECT region, order_id, amount
FROM ranked
WHERE rank <= 2
regionorder_idamount
eu-westord_8450.00
eu-westord_450.00
us-eastord_3200.00
us-eastord_1150.00
us-westord_5300.00
us-westord_275.50

Percent of Total

SELECT
  order_id,
  region,
  amount,
  amount * 100.0 / SUM(amount) OVER () as pct_of_total,
  amount * 100.0 / SUM(amount) OVER (PARTITION BY region) as pct_of_region
FROM orders
order_idregionamountpct_of_totalpct_of_region
ord_1us-east150.0010.4126.55
ord_3us-east200.0013.8835.40
ord_6us-east125.008.6822.12
ord_7us-east89.996.2515.93
ord_2us-west75.505.2420.11
ord_5us-west300.0020.8379.89
ord_4eu-west50.003.4710.00
ord_8eu-west450.0031.2490.00

Change Detection

SELECT
  metric_id,
  name,
  timestamp,
  value,
  value - LAG(value) OVER (
    PARTITION BY name
    ORDER BY timestamp
  ) as value_change,
  CASE
    WHEN value > LAG(value) OVER (PARTITION BY name ORDER BY timestamp) * 1.1
    THEN 'spike'
    WHEN value < LAG(value) OVER (PARTITION BY name ORDER BY timestamp) * 0.9
    THEN 'drop'
    ELSE 'normal'
  END as status
FROM metrics
WHERE name = 'cpu_usage'
metric_idnametimestampvaluevalue_changestatus
m1cpu_usage10:00:0045.5NULLnormal
m2cpu_usage10:01:0062.316.8spike
m6cpu_usage10:02:0071.28.9spike

Gap Detection

SELECT
  event_id,
  user_id,
  timestamp,
  LAG(timestamp) OVER (PARTITION BY user_id ORDER BY timestamp) as prev_time,
  timestamp - LAG(timestamp) OVER (PARTITION BY user_id ORDER BY timestamp) as gap,
  CASE
    WHEN timestamp - LAG(timestamp) OVER (PARTITION BY user_id ORDER BY timestamp) > INTERVAL '20 seconds'
    THEN 'gap_detected'
    ELSE 'normal'
  END as status
FROM events
WHERE user_id = 'u1'
event_iduser_idtimestampprev_timegapstatus
e1u110:00:05NULLNULLnormal
e2u110:00:1210:00:057 secnormal
e4u110:00:2510:00:1213 secnormal
e6u110:00:4510:00:2520 secnormal

Running Totals per Customer

SELECT
  order_id,
  customer_id,
  timestamp,
  amount,
  SUM(amount) OVER (
    PARTITION BY customer_id
    ORDER BY timestamp
  ) as customer_running_total
FROM orders
WHERE customer_id IN ('c1', 'c2')
order_idcustomer_idtimestampamountcustomer_running_total
ord_1c110:00:15150.00150.00
ord_3c110:01:10200.00350.00
ord_7c110:03:3089.99439.99
ord_2c210:00:3575.5075.50
ord_5c210:02:20300.00375.50