Ranking functions in SQL

Ranking functions introduced with SQL Server 2005 allow us to assign a number to each row returned from a query. They allow us to rank each row in respect to others in several different ways. Ranking functions can be used only with the SELECT and ORDER BY statements. They cannot be used directly in a WHERE or GROUP BY clause, but can be used in a CTE or derived table.

The syntax for ranking functions is shown as follows:

<function_name>() OVER([PARTITION BY <partition_by_list>]

ORDER BY <order_by_list>)

function_name: Can be one of ROW_NUMBER, RANK, DENSE_RANK, and NTILE
OVER: Defines the details of how the ranking should order or split the data
PARTITION BY: Details which data the column should use as the basis of the splits
ORDER BY: Details the ordering of the data

There are four ranking functions in SQL Server:

ROW_NUMBER – It allows us to provide incrementing sequential integer values to the rows in the result-set of a query based on logical order that is specified in the ORDER BY subclause of the OVER clause. The ROW_NUMBER function contains the OVER clause, which the function uses to determine the numbering behavior. The ORDER BY option, which determines the order in which the function applies the numbers, must be included in the query. We have the option of starting the numbers over whenever the values of a specified column change, called partitioning, with the PARTITION BY clause.

Example Query 1: Basic use of ROW_NUMBER()

SELECT CustomerID, FirstName + ' ' + LastName AS Name,

ROW_NUMBER() OVER (ORDER BY LastName, FirstName) AS Row

FROM Sales.Customer AS c

INNER JOIN Person.Person AS p ON c.PersonID = p.BusinessEntityID;

Example Query 2: Using ROW_NUMBER() in a CTE

WITH customers AS(
SELECT CustomerID, Name, Row

FROM customers

WHERE Row > 50

ORDER BY Row;

Example Query 3: Using PARTITION BY option in ROW_NUMBER()

SELECT CustomerID, FirstName + ' ' + LastName AS Name, c.TerritoryID,

ROW_NUMBER() OVER (PARTITION BY c.TerritoryID ORDER BY LastName, FirstName) AS Row

FROM Sales.Customer AS c

INNER JOIN Person.Person AS p ON c.PersonID = p.BusinessEntityID;
RANK – RANK assigns the same number to the duplicate rows and skips numbers not used. It assigns an ascending, nonunique ranking number to a set of rows, giving the same number to duplicate rows; numbers are skipped for the number of rows that have the same value.

If rows 2 and 3 are duplicates, RANK will supply the values 1, 3, 3, and 4.

Example Query 1: Using RANK

SELECT ROW_NUMBER() OVER (PARTITION BY SUBSTRING(LastName,1,2) ORDER BY LastName) AS RowNum,

RANK() OVER(ORDER BY SUBSTRING(LastName,1,2) ) AS Ranking, CONCAT(FirstName,' ',LastName) AS CustomerName, UnclearedBalance

FROM CustomerDetails.Customers

WHERE UnclearedBalance is not null

ORDER BY Ranking
DENSE_RANK - This is similar to RANK, but each row number returned will be one greater than the previous setting, no matter how many rows are the same. DENSE_RANK doesn’t skip numbers.

If rows 2 and 3 are duplicates, DENSE_RANK will supply the values 1, 2, 2, and 3.

Example Query:

SELECT ROW_NUMBER() OVER (PARTITION BY SUBSTRING(LastName,1,2) ORDER BY LastName) AS RowNum,

DENSE_RANK() OVER(ORDER BY SUBSTRING(LastName,1,2) ) AS Ranking, CONCAT(FirstName,'',LastName) AS CustomerName, UnclearedBalance

FROM CustomerDetails.Customers

WHERE UnclearedBalance is not null

ORDER BY Ranking

Note: Both RANK and DENSE_RANK are similar to the ROW_NUMBER function, but they produce the same ranking value in all rows that have the same logical ordering value.

The difference between RANK and DENSE_RANK is that RANK indicates how many rows have a lower ordering value, whereas DENSE_RANK indicates how many distinct ordering values are lower.

For example, a rank of 9 indicates eight rows with lower values. A dense rank of 9 indicates eight distinct lower values.
NTILE – The NTILE function assigns buckets to groups of rows. It allows us to associate the rows in the result with tiles (equally sized groups of rows) by assigning a tile number to each row. This takes the rows from the query and places them into an equal (or as close to equal as possible) number of specified numbered groups, where NTILE returns the group number the row belongs to. The value in parentheses after NTILE defines the number of groups to produce, so NTILE(25) would produce 25 groups of as close a split as possible of even numbers.

Example Query:

SELECT NTILE(10) OVER (ORDER BY LastName) AS BatchNumber, CONCAT(FirstName,' ',LastName) AS CustomerName, UnclearedBalance

FROM CustomerDetails.Customers

WHERE UnclearedBalance is not null

ORDER BY BatchNumber

The following SQL query uses all the ranking functions in a single query:

SELECT orderid, custid, val,

ROW_NUMBER() OVER(ORDER BY val) AS rownum,

RANK() OVER(ORDER BY val) AS rank,

DENSE_RANK() OVER(ORDER BY val) AS dense_rank,

NTILE(100) OVER(ORDER BY val) AS ntile

FROM Sales.OrderValues

ORDER BY val;

Pages

Ranking functions in SQL

No comments:

Post a Comment