Fetching the first N rows from a result set is easy with the fetch first clause. For example, this returns the ten most recent orders. It sorts the rows by order_tms
from newest to oldest and limiting these to the first 10:
select * from co.orders order by order_tms desc fetch first 10 rows only;
Or if you’re using an archaic version of Oracle Database you can use the rownum trick.
But what if you want the top-N rows in each group? For example, the three most recent orders for each customer?
Oracle AI Database 26ai enables this by extending the fetch first
clause. If you’re stuck on older releases you need to use a subquery. In this post we’ll explore how to:
- Find the top-N rows in each group
- Get the first row in each group
- How to handle ties
- Write a reusable top-N/group function with SQL macros
- Conclusion
To run the queries, we’ll use the Customer Orders schema, which you can get from GitHub. The scripts for this post are available in this Live SQL script.
How to find the top-N rows in each group
Photo by Romain Chollet on Unsplash
Starting in 23.4, you can use the partition by
clause of fetch first
to split the rows into groups and get the top N from each.
The basics of this syntax is:
fetch first M partition by <column>, N rows only;
This splits the rows up into groups for each value of <column>
, returning the first M
groups and N
rows within each. This query returns the first 999 billion customers and the three most recent orders for each:
select o.* from co.orders o order by customer_id, order_tms desc fetch first 999999999999 partition by customer_id, 3 rows only;
You need to specify the number of partitions you want. If you have more than one trillion customers, just keep adding 9s!
Getting the top-N rows for a group on earlier releases is a two-step process:
- Assign row numbers starting at one for each group
- Filter the result of this down to the rows less than or equal to the number you want
Assigning values from one for each group
To set the row numbers, use the handy row_number()
function. This assigns sequential integers starting at one to each row, according to the sort you specify. You define this order in the over clause.
This is also where you can get the row numbers to start at one for each group. Do this with the partition by
clause. This splits the rows into groups for each unique set of values in partitioning columns.
Like a regular query, to define the sort, use the order by
clause.
So to find the most recent orders for each customer, you should:
partition by customer_id
order by order_tms desc
Giving this query:
select o.*, row_number () over ( partition by customer_id order by order_tms desc ) rn from co.orders o;
Restrict the results to the first N in each group
After generating the row numbers, you need to place this logic in a subquery. This could be an inline view or with
clause; I find CTEs easier to read so tend to go with that.
Whichever method you choose, all that’s left to do is filter the output of the row_number
expression to the most rows you want in each group:
with rws as ( select o.*, row_number () over ( partition by customer_id order by order_tms desc ) rn from co.orders o ) select * from rws where rn <= 3 order by customer_id, order_tms desc;
Pretty easy, right?
Often this is what you want, but there are a few nuances to be aware of. Let’s start with a special case: getting one row per customer. For example, the most recent order for each customer.
How to find the first row in each group

Photo by Benjamin Davies on Unsplash
You can get one row per customer and the date of their most recent order using group by:
select customer_id, max ( order_tms ) from co.orders group by customer_id;
In Oracle AI Database 26ai, tweak the partition by
query above to return 1 row only
to get the last order for each customer:
select o.* from co.orders o order by customer_id, order_tms desc fetch first 999999999999 partition by customer_id, 1 row only;
On versions 21c and earlier, it’s more challenging to add other columns to the result.
If you add these unaggregated to the select
list, you also need to group by them. Doing this gives you a row for each combination of values, so you could have many rows for each customer instead of one.
Placing the other columns in max or min also gives the incorrect result, as this returns the highest or lowest value for each customer. Not the values from the row with the most recent date.
To get only the values for the most recent row, you could use the row numbering method above.
Or there is another trick you can use: the keep
clause.
How to get values for columns with the same value as the first or last
The keep clause – referred to as the first and last functions in the docs – enables you to choose which value from the group to show. The syntax for this is:
aggregate_function keep ( dense_rank { first | last } order by <sorting columns> )
This tells the database to get the top (first
) or bottom (last
) value for the column according to the sort. If two or more rows are tied for first or last position, you take the highest value with max and the lowest with min.
So to get the store where the most recent order was placed for a customer, you need to:
- Take the min/max store_id
- Keeping the last row
- Sorted by order_tms (ascending)
Which gives this query:
select customer_id, max ( order_tms ), max ( store_id ) keep ( dense_rank last order by order_tms ) last_purchase_store_id from co.orders group by customer_id;
Note that taking the first value from an ascending sort is logically the same as the last value for a descending sort. So could also write the query above by flipping first to last and ordering oldest to newest:
select customer_id, max ( order_tms ), max ( store_id ) keep ( dense_rank first order by order_tms desc ) last_purchase_store_id from co.orders group by customer_id;
Which combination of first/last and ascending or descending sorting you use is largely a matter of personal preference. Some clauses will more accurately reflect the question the query answers.
This is cool, but could lead to lots of extra typing. Particularly if you have hundreds of columns to return that aren’t part of the grouping or sorting.
So you may be wondering: why choose this over a generic top-N/group solution, filtered to just one row?
In general you won’t! But in rare cases you may need to get values from different rows in each group.
For example, say you’ve been asked to find for each customer:
- The date of their last order
- The store where they placed their first order
- The status of the order with the highest order_id
Using the top-N per group method, you need to run three separate queries, sorting by order_tms desc, order_tms, and order_id desc respectively.
Whereas using keep first/last, you can get all these values with this query:
select customer_id, max ( order_tms ) last_order_date, max ( store_id ) keep ( dense_rank first order by order_tms ) first_purchase_store_id, max ( order_status ) keep ( dense_rank last order by order_id ) last_order_id_status from co.orders group by customer_id;
While it’s unusual to have to answer these kinds of queries, it’s worth being aware of these functions when they do crop up.
There is one final case to consider when fetching one row per group. That’s where you just don’t care which value you get for the unaggregated, ungrouped columns.
Oracle Database 19c added an easy option to do this – the any_value
function.
How to get values from a random row in the group with any_value
Sometimes you want to write a query like this:
select customer_id, max ( order_tms ), <other columns> from co.orders group by customer_id;
In this case it doesn’t matter which values you fetch for the other columns. From Oracle Database 19c you can fetch a value from random row in each group using the any_value function:
select customer_id, max ( order_tms ), any_value ( store_id ), any_value ( order_status ), any_value ( order_id ) from co.orders group by customer_id;
So why choose this over the long-standing min
or max
functions?
Firstly it more accurately reflects your intent. You just want some value, rather than explicitly the highest or lowest in the group.
Secondly, it avoids any sorting needed by min
and max
functions. On large data sets this could lead to performance gains.
Like min
and max
this ignores null values. So it only returns a null value if the entire group is null.
This gives you several ways to get one row per group. Returning the first row for a customer highlights an important question:
What do you do if two or more rows have the same value for the sort key?
How to deal with ties

Image by TooMuchCoffeeMan from Pixabay
So far we’ve assumed a customer can only place one order at a time. While there’s no unique constraint in the schema to guarantee this, in practice it’s unlikely a customer will place two orders at exactly the same time.
But there are some scenarios where duplicates are much more likely. For example, for each store show the three customers that placed the most orders.
To do this, you need to group by store and customer, counting how many rows there are for each:
select store_id, customer_id, count (*) num_orders from co.orders group by store_id, customer_id;
While you can return tied values with fetch first
, this option is unsupported with partition by
. So you’ll have to adapt the row_number()
subquery method shown above.
You want to number the rows for each store, sorted by the number of rows for each customer, most to least. This is:
partition by store_id
order by count (*) desc
Plugging these into the query gives:
with rws as ( select store_id, customer_id, count (*) num_orders, row_number () over ( partition by store_id order by count(*) desc ) rn from co.orders group by store_id, customer_id ) select * from rws where rn <= 3 order by store_id, rn;
It’s likely many customers will make same number of orders at each location. For example, at store 1:
- 4 customers made 8 orders
- 13 customers made 7 orders
- 26 customers made 6 orders
So which customers should you show if you’re limiting to the top three per store?
There are few ways to you could handle this:
- Always return (at most) N rows.
- If the sort has duplicates for the Nth row, return a random row
- If the sort has duplicates for the Nth row, always return the same row
- Return all the rows with the same value as the Nth
- Return all the rows with the first N values for the sorting columns
The implementation so far returns the first option: at most N rows. Whether the Nth row chosen is deterministic depends on the order by clause. If the combination of this and the partition by clause contain all the columns in a primary key or unique constraint, the output is deterministic. So running the query many times will always give you the same result.
The sort key for the subquery is (store_id, count(*)
). Clearly there’s no unique constraint on this, so two or more rows can have the same value for store and count. Thus the current implementation is non-deterministic; running this query twice on the same data might give different results.
The easiest way to make this deterministic is to add the primary key to the end of the sort in row_number
. We’re dealing with an aggregation here, so this is impossible. To ensure predictable output, you need to put all the columns of the group by
clause in the partition by
or order by
of row_number
:
with rws as ( select store_id, customer_id, count (*) num_orders, row_number () over ( partition by store_id order by count(*) desc, customer_id ) rn from co.orders group by store_id, customer_id ) select * from rws where rn <= 3 order by store_id, rn;
But what if you want to show all the customers who made the same number of orders as the Nth too?
You can do this by switching the numbering function you use. There are three available:
row_number
: Sequential values starting at one. All rows have a unique value and there are no gapsrank
: Rows with the same sort value have the same value. After ties there is a gap, with the counting starting again at the row’s position in the results. This is also known as the Olympic rankdense_rank
: As with rank, tied values have the same values. Unlike rank, there are no gaps: all numbers in a group are sequential starting at one.
Connor McDonald explains these further in this video:
How to return all rows with the same value as the Nth
You may want to show all rows tied for position N. When getting top three customers per store, this means you’ll get all four that made eight orders at store one. To do this, you want the fourth row to have the same row number as the third. The rank
function does this, so all you need to do is change row_number
to this:
with rws as ( select store_id, customer_id, count (*) num_orders, rank () over ( partition by store_id order by count(*) desc ) rn from co.orders group by store_id, customer_id ) select * from rws where rn <= 3 order by store_id, rn, customer_id;
How to return all rows with the top-N values
Sometimes you need to see all the rows that have the first N distinct values for the sort. For store one, this means all customers than placed six, seven or eight orders.
To do this you need ties to have the same row number, with no gaps in the numbering sequence. This is what dense_rank
does, so slot it in:
with rws as ( select store_id, customer_id, count (*) num_orders, dense_rank () over ( partition by store_id order by count(*) desc ) rn from co.orders group by store_id, customer_id ) select * from rws where rn <= 3 order by store_id, rn, customer_id;
Note that when using row_number
, the maximum number of rows the query can return is:
number of groups * N
For the other methods it’s possible you could fetch all the rows in the table!
Subtleties like this can be lost when you’re in a hurry. To reduce the chance of mistakes, it’d be nice if you could define a function which accepts a table, the grouping and sorting columns, and the number of rows you want from each group.
The good news is – from 19.6 – you can!
Let’s see how.
How to write a generic top-N per group function

Photo by Shahadat Rahman on Unsplash
The general form of the top-N per group query on Oracle AI Database 26ai is:
select t.* from <tablename> t order by <sorting columns> fetch first M1 partition by <grouping column1>, [ M2 partition by <grouping column2>, ] N rows only;
On earlier releases it’s
with rws as ( select t.*, row_number () over ( partition by <grouping columns> order by <sorting columns> ) rn from <tablename> t ) select * from rws where rn <= :num_rows;
SQL macros enable you to take these templates and use them with any table you want. SQL macros do this by returning the text of a query. At parse time the database merges this with calling statement to give the final query.
You can reference the function’s parameters in the string this returns. When resolving the query, the database does a find/replace of the actual text of the arguments you pass in the text of the string.
The really cool part is you can pass tables and columns as arguments. In essence this enables you to create query templates you can pass any table to.
For example, you can create a generic top-N macro like this:
create or replace function top_n ( tab dbms_tf.table_t, num_rows integer ) return varchar2 sql_macro as begin return 'select * from tab fetch first num_rows rows only'; end top_n;
At parse time the database searches for tab
and num_rows
in the return string, swapping in the text of the parameters you pass. So if you call this like so:
select * from top_n ( co.orders, :num_rows );
The database resolves it into a query like:
select * from co.orders fetch first :num_rows only;
Note that the bind variable passed is still present in the final query. It’s not evaluated or peeked in any way. Similar logic applies to functions: the name of the function itself is still in the query, not the result of executing the function.
To see what the final query is, pass it to dbms_utility.expand_sql_text
:
declare l_clob clob; begin dbms_utility.expand_sql_text ( input_sql_text => q'!select * from top_n ( co.orders, :num_rows )!', output_sql_text => l_clob ); dbms_output.put_line(l_clob); end; /
To make generic top-N/group function, take the template query above and place it in a (table) SQL macro. To allow developers to change the direction of the sort, you’ll also want to add a list of columns to order descending.
Here’s an example macro using the row_number approach:
create or replace function top_n_per_group ( tab dbms_tf.table_t, group_cols dbms_tf.columns_t, order_cols dbms_tf.columns_t, num_rows integer, desc_cols dbms_tf.columns_t default null ) return clob sql_macro as grouping_cols clob; ordering_cols clob; stmt clob; begin for col in 1 .. group_cols.count loop grouping_cols := grouping_cols || group_cols ( col ) || ','; end loop; for col in 1 .. order_cols.count loop if order_cols ( col ) member of desc_cols then ordering_cols := ordering_cols || order_cols ( col ) || 'desc,' ; else ordering_cols := ordering_cols || order_cols ( col ) || ',' ; end if; end loop; stmt := 'with rws as ( select t.*, row_number () over ( partition by ' || rtrim ( grouping_cols, ',' ) || ' order by ' || rtrim ( ordering_cols, ',' ) || ' ) rn from tab t ) select * from rws where rn <= num_rows'; return stmt; end top_n_per_group; /
Note that you can’t reference the parameters for the partition by
and order by
clauses in the string. This is because the columns
pseudo-operator returns an array. So you need to iterate through these values, building up the comma-separated list of values and paste this into the find string.
This solution also assumes that the table has no existing column named rn
. If this will be an issue for you, add another dbms_tf.columns_t
parameter to override the default.
You can now call this like so:
select * from top_n_per_group ( co.orders, columns ( customer_id ), columns ( store_id, order_tms ), 3, columns ( order_tms ) );
The table arguments don’t have to be physical tables in the schema. They could also be views or named subqueries. So you can use this function to fetch the three customers that placed the most orders at each store. Calculate this in a CTE and pass it to the function:
with customer_orders_per_store as ( select store_id, customer_id, count(*) num_orders from co.orders group by store_id, customer_id ) select * from top_n_per_group ( customer_orders_per_store, columns ( store_id ), columns ( num_orders ), 3, columns ( num_orders ) );
Or show the three most expensive line-items per customer:
with paid_per_item as ( select customer_id, unit_price * quantity total_paid from co.order_items oi join co.orders o on o.order_id = oi.order_id ) select * from top_n_per_group ( paid_per_item, columns ( customer_id ), columns ( total_paid ), 3, columns ( total_paid ) );
If you’re currently on 19c, building a SQL macro like this has a couple of benefits:
- It simplifies your top-N/group queries now
- To use the new syntax when you upgrade to 26ai, all you need to do is change your macro!
Conclusion
If you want exactly one row per group and the highest or lowest values in the ungrouped columns, a plain group by will do the job.
Oracle AI Database 26ai simplifies these queries by extending the fetch first. If you’re on older releases you need to number the rows in a subquery and filter the result.
You need to know these trick though. With SQL macros you can make this a reusable component, making it easier for other developers to solve this problem in the future.
Using macros also enables you to standardize handling of edge cases, such as dealing with ties or nulls. Or change the implementation, for example by swapping the 19c subquery solution for the AI Database fetch first method when you upgrade.
Get the code from this Live SQL script.
Want to know more about analytic functions? Take Connor’s Analytic SQL course on Oracle Dev Gym.
UPDATED: 11 Feb 2021: Changing top_n_per_group to use clob instead of varchar2; adding note about column clashing column names in this function.
UPDATED: 10 Jan 2024: Fixing video embed
UPDATED: 24 July 2024: Adding fetch first partition by syntax
UPDATED: 17 Apr 2025: Changing Live SQL links to use new platform and SQL examples to use updated CO schema
UPDATED: 20 Oct 2025: Oracle AI Database 26ai replaces Oracle Database 23ai