[Oracle] On ROWNUM and Limiting Results

This is a very useful time-saving and essential knowlege when query Oracle

Source: http://www.oracle.com/technetwork/issue-archive/2006/06-sep/o56asktom-086197.html

ROWNUM is a magic column in Oracle Database that gets many people into trouble. When you learn what it is and how it works, however, it can be very useful. I use it for two main things: > To perform top-Nprocessing. This is similar to using the LIMIT clause, available in some other databases. > To paginate through a query, typically in a stateless environment such as the Web. I use this technique on the asktom.oracle.com Web site.

ROWNUM is a pseudocolumn (not a real column) that is available in a query. ROWNUM will be assigned the numbers 1, 2, 3, 4, …N, whereNis the number of rows in the set ROWNUM is used with.

Also confusing to many people is when a ROWNUM value is actually assigned. A ROWNUM value is assigned to a row after it passes the predicate phase of the query but before the query does any sorting or aggregation. Also, a ROWNUM value is incremented only after it is assigned, which is why the following query will never return a row: select *    from t  where ROWNUM > 1;

 Because ROWNUM > 1 is not true for the first row, ROWNUM does not advance to 2. Hence, no ROWNUM value ever gets to be greater than 1. Consider a query with this structure: select …, ROWNUM   from t  where <where clause>  group by <columns> having <having clause>  order by <columns>;

 Think of it as being processed in this order:

1.The FROM/WHERE clause goes first. 2.ROWNUM is assigned and incremented to each output row from the FROM/WHERE clause. 3.SELECT is applied. 4.GROUP BY is applied. 5.HAVING is applied. 6.ORDER BY is applied.

That is why a query in the following form is almost certainly an error: select *   from emp  where ROWNUM <= 5  order by sal desc;

The intention was most likely to get the five highest-paid people—a top-Nquery. What the query will return is five random records (the first five the query happens to hit), sorted by salary

* Top- N Query Processing with ROWNUM *

In a top-Nquery, you are generally interested in taking some complex query, sorting it, and then retrieving just the firstNrows (the topNrows). ROWNUM has a top-Noptimization that facilitates this type of query. You can use ROWNUM to avoid a massive sort of large sets

select *   from (select *    from t   order by unindexed_column)  where ROWNUM < :N;

 In this case, Oracle Database will take these steps:

1.Run a full-table scan on T, as before (you cannot avoid this step). 2.In an array of :Nelements (presumably in memory this time), sort only :Nrows.

The firstNrows will populate this array of rows in sorted order. When theN+1 row is fetched, it will be compared to the last row in the array. If it would go into slotN+1 in the array, it gets thrown out. Otherwise, it is added to this array and sorted and one of the existing rows is discarded. Your sort area holdsNrows maximum, so instead of sorting one million rows, you sortNrows.

This seemingly small detail of using an array concept and sorting justNrows can lead to huge gains in performance and resource usage. It takes a lot less RAM to sort 10 rows than it does to sort one million rows (not to mention TEMP space usage).

* Pagination with ROWNUM *

My all-time-favorite use of ROWNUM is pagination. In this case, I use ROWNUM to get rowsNthrough M of a result set. The general form is as follows: select *   from ( select /*+ FIRST_ROWS(n) */   a.*, ROWNUM rnum       from ( your_query_goes_here,       with order by ) a       where ROWNUM <=       :MAX_ROW_TO_FETCH ) where rnum  >= :MIN_ROW_TO_FETCH;

where

FIRST_ROWS(N) tells the optimizer, “Hey, I’m interested in getting the first rows, and I’ll get Nof them as fast as possible.”

:MAX_ROW_TO_FETCH is set to the last row of the result set to fetch—if you wanted rows 50 to 60 of the result set, you would set this to 60.

:MIN_ROW_TO_FETCH is set to the first row of the result set to fetch, so to get rows 50 to 60, you would set this to 50.

The concept behind this scenario is that an end user with a Web browser has done a search and is waiting for the results. It is imperative to return the first result page (and second page, and so on) as fast as possible. If you look at that query closely, you’ll notice that it incorporates a top-Nquery (get the first :MAX_ROW_TO_FETCH rows from your query) and hence benefits from the top-Nquery optimization I just described. Further, it returns over the network to the client only the specific rows of interest—it removes any leading rows from the result set that are not of interest.

One important thing about using this pagination query is that the ORDER BY statement should order by something unique. If what you are ordering by is not unique, you should add something to the end of the ORDER BY to make it so.

In this case, just use ROWID. ROWID is unique within a table, so if you use ORDER BY ID and then within ID you use ORDER BY ROWID, the rows will have a definite, deterministic order and the pagination query will deterministically return the rows as expected

 

One comment

  1. I couldn’t refrain from commenting. Well written!

Leave a comment