Introduction to SQL Window Functions

Malcolm Katzenbach
4 min readAug 29, 2021

In previous posts, I have written about aggregate functions such as MIN() and MAX() or the GROUP BY statement. The aggregate functions can help in the analysis of different data tables. Window functions is a more advanced way of working with aggregate data and is the main focus for this introductory post.

There is one major difference between using an aggregate function and using a window function and that is how each returns the information. The aggregate function returns the results into a single row, while a window function will perform its function on each individual row based on the observations that are related to that row. Let’s look at an example to make sure we understand this difference.

We return to a database I have used in previous posts: an online store. There are tables of customer information, orders, and products. Perhaps we want to know how many orders were made in specific geographic regions like specific States within the US. Using a normal aggregate function, the query may look like the following depending on the database schema.

SELECT state, COUNT(*) FROM clients
JOIN orders ON clients.id = orders.client_id
GROUP BY state;

This would return a single row for each State that had orders and a count of how many orders there were, such as 200 orders from New York and 130 orders from New Jersey. So all observations from each State are condensed into their own row.

Window functions work slightly differently in that they do not condense the observations into a single line, but instead there is an additional column where the results of the function are placed. Instead of using a GROUP BY statement to aggregate the data based on specific criteria, the window functions will partition the data into different “windows” based on the given criteria.

So in our example the window functions would create a window for each state in the joined table and perform the function on each window with the results added on to a new column in the table results. This is useful because it gives us the capability of looking at more information from each observation while still having the results of the function in a new column.

Let’s consider using another function with the same data. In the orders table, there might be a column representing the costs of each order. We want to know the average cost for orders in each State. Using normal aggregate functions, we would only have one row with the average order cost for each State. However, if we used the window function version instead, we could see each individual order cost and be able to compare it to the average order cost for that State.

Next we will go over the syntax so can we can write a query to complete our example.

Window Function Syntax

To use a window function we need to use the OVER clause after calling the function. Depending on how the windows are created there are two versions for using the OVER clause.

OVER(<window_criteria>)orOVER <window_name>

The <window_criteria> has several optional variables. There is <window_name> , <partition_clause> , <order_clause> , and <frame_clause> .

  1. The <window_name> is the name of the window that is declared elsewhere in the query.
  2. The <partition_clause> is the condition that separates the observations into windows. Under standard SQL, this clause can only use column names, however in MySQL you can also enter expressions. Something to note is that if the <partition_clause> is empty, the computer will act as if there is a single partition holding all the observations.
  3. The <order_clause> sorts the rows within each partition. If the <order_clause> remains empty, the rows will stay unordered.
  4. The <frame_clause> allows the creation of a subset frame within the partition.

Example Query

Let us return to our example using the average aggregate function as a window function to find the average cost of the orders in each State.

SELECT 
clients.state, orders.id, orders.cost,
AVG(orders.cost) OVER(PARTITION BY clients.state ORDER BY orders.cost DESC) AS st_avg_order_cost
FROM orders
LEFT JOIN clients
ON orders.client_id = clients.id

So going over this query, we have joined the clients and orders tables and are selecting the state, orders id and orders cost columns from the joined table. We declare the average aggregate function followed by the OVER clause to create a window function. The windows are partitioned by the State of residence for the client making the order and the resulting windows from partitioning are sorted from greatest cost to lowest.This query will allow us to compare each order made in a State with its average.

One final note is that there are a number of functions that are only available as window functions (such as DENSE_RANK(), FIRST_VALUE(), LAST_VALUE(), etc.) and that most aggregate functions can be used as window functions.

I hope this post helps build up your understanding of the useful tools known as window functions. Thanks for reading.

For more documentation on using window functions in MySQL:

--

--