MIN_BY and MAX_BY in Snowflake
Have you ever wanted to look up the minimum or maximum value that corresponds to another column? Common questions like this come up often:
- What is the latest order date per customer
- What is the latest price by product

What many engineers don't know is that Snowflake (and most modern DBMSs like DuckDB, Databricks, BigQuery, etc.) include two powerful aggregate functions, MAX_BY and MIN_BY, that drastically simply this query pattern.
In this blog we'll cover:
- What you used to do without these functions
- How
MIN_BYandMAX_BYwork
Let's begin by walking through an example, but feel free to skip to how it's used via the table of contents on the left.
Use Case
Consider the following table:
┌────────────┬──────────────┬────────────┬────────┬───────────────┐
│ product_id │ product_name │ date │ price │ has_promotion │
│ int64 │ varchar │ date │ double │ boolean │
├────────────┼──────────────┼────────────┼────────┼───────────────┤
│ 1 │ lettuce │ 2025-07-12 │ 3.12 │ true │
│ 1 │ lettuce │ 2025-11-03 │ 3.99 │ false │
│ 2 │ bread │ 2016-06-02 │ 5.99 │ false │
│ 3 │ milk │ 2023-03-23 │ 5.94 │ true │
│ 5 │ apple │ 2021-02-01 │ 0.54 │ true │
│ 5 │ apple │ 2023-01-29 │ 1.21 │ true │
│ 5 │ apple │ 2025-10-11 │ 0.99 │ false │
└────────────┴──────────────┴────────────┴────────┴───────────────┘We want to find the latest price for each product and the date in which that price came into effect:
┌──────────────┬─────────────┬──────────────┐
│ product_name │ latest_date │ latest_price │
│ varchar │ date │ double │
├──────────────┼─────────────┼──────────────┤
│ apple │ 2025-10-11 │ 0.99 │
│ bread │ 2016-06-02 │ 5.99 │
│ lettuce │ 2025-11-03 │ 3.99 │
│ milk │ 2023-03-23 │ 5.94 │
└──────────────┴─────────────┴──────────────┘The Two Old Ways
Normally, we would use a self join, ending up with a query that is unnecessarily complex.
WITH latest_dates AS (
-- find the latest date for each product
SELECT
product_id,
MAX(date) as latest_date
FROM test
GROUP BY 1
)
-- join each product to its latest date
SELECT
t.product_name,
ld.latest_date,
t.price as latest_price
FROM test AS t
JOIN latest_dates AS ld
ON t.product_id = ld.product_id
AND t.date = ld.latest_date;Alternatively, you could also have used a window function.
SELECT
DISTINCT
product_name,
MAX(date) OVER (PARTITION BY product_name ORDER BY date DESC) as latest_date,
FIRST_VALUE(price) OVER (PARTITION BY product_name ORDER BY date DESC) as latest_price
FROM test;The New Way
With Snowflake's MAX_BY, the query can be rewritten as:
SELECT
product_name,
MAX(date) as latest_date,
MAX_BY(price, date) as latest_price
FROM test
GROUP BY product_name;How to use MAX_BY and MIN_BY
For simplicity's sake, we will only cover MAX_BY as the descriptions below apply to MIN_BY as well.
In Snowflake and DuckDB, MAX_BY and MIN_BY accepts 3 parameters.
MAX_BY(
<column_to_return>,
<column_containing_maximum_value>,
[<number_of_values_to_return>]
)For example, if a table contains the columns product_id and date, MAX_BY(product_id, date) returns the value of the product_id column for the row that has the highest value in the date column.
Arguments
<column_to_return>
This specifies the column to return. In our previous example, we want the price associated with the latest date. If we instead wanted to determine whether the latest date includes a promotion, we would replace this with has_promotion.
SELECT
product_name,
MAX_BY(has_promotion, date) as has_promotion
FROM test
GROUP BY product_name;┌──────────────┬───────────────┐
│ product_name │ has_promotion │
│ varchar │ boolean │
├──────────────┼───────────────┤
│ apple │ false │
│ bread │ false │
│ lettuce │ false │
│ milk │ true │
└──────────────┴───────────────┘<column_containing_maximum_value>
This specifies how to choose the column to return above. This is the column whose maximum value determines which value is selected.
┌──────────────┬────────────┬────────┬───────────────┐
│ product_name │ date │ price │ has_promotion │
│ varchar │ date │ double │ boolean │
├──────────────┼────────────┼────────┼───────────────┤
│ lettuce │ 2025-07-12 │ 3.12 │ true │
│ lettuce │ 2025-11-03 │ 3.99 │ false │ <-- this is selected
│ bread │ 2016-06-02 │ 5.99 │ false │ <-- this is selected
│ milk │ 2023-03-23 │ 5.94 │ true │ <-- this is selected
│ apple │ 2021-02-01 │ 0.54 │ true │
│ apple │ 2023-01-29 │ 1.21 │ true │
│ apple │ 2025-10-11 │ 0.99 │ false │ <-- this is selected
└──────────────┴────────────┴────────┴───────────────┘<number_of_values_to_return>
This optional integer specifies how many values to pull from <column_to_return> based on the <column_containing_maximum_value>.
When this argument is provided, MAX_BY returns an array of values ordered by the <column_containing_maximum_value> in descending order.
For example, if we want to see last two product prices based on the most recent dates, we can write:
SELECT
product_name,
MAX_BY(price, date, 2) as latest_price
FROM test
GROUP BY product_name;┌──────────────┬──────────────┐
│ product_name │ latest_price │
│ varchar │ double[] │
├──────────────┼──────────────┤
│ apple │ [0.99, 1.21] │
│ bread │ [5.99] │
│ lettuce │ [3.99, 3.12] │
│ milk │ [5.94] │
└──────────────┴──────────────┘This means for apples, the last two prices, ordered by the latest dates, are $0.99 and $1.21.
Additional notes:
This function ignores NULLs in <column_containing_maximum_value> and returns NULL if all values in <column_containing_maximum_value> are NULL.
Wrapping Up
Whenever you're tempted to grab the latest, highest, lowest, or top N values per group, you can now do it in a single clean expression.
Give these functions a try in your own warehouse, they’re one of those features that make SQL feel surprisingly elegant.
Comments ()