Ranking
functions introduced with SQL Server 2005 allow us to assign a
number to each row returned from a query. They allow us to rank each
row in respect to others in several different ways. Ranking functions
can be used only with the SELECT and ORDER BY statements. They
cannot be used directly in a WHERE or GROUP BY clause, but can be
used in a CTE or derived table.
The
syntax for ranking functions is shown as follows:
<function_name>()
OVER([PARTITION BY <partition_by_list>]
ORDER
BY <order_by_list>)
- function_name: Can be one of ROW_NUMBER, RANK, DENSE_RANK, and NTILE
- OVER: Defines the details of how the ranking should order or split the data
- PARTITION BY: Details which data the column should use as the basis of the splits
- ORDER BY: Details the ordering of the data
There
are four ranking functions in SQL Server:
- ROW_NUMBER – It allows us to provide incrementing sequential integer values to the rows in the result-set of a query based on logical order that is specified in the ORDER BY subclause of the OVER clause. The ROW_NUMBER function contains the OVER clause, which the function uses to determine the numbering behavior. The ORDER BY option, which determines the order in which the function applies the numbers, must be included in the query. We have the option of starting the numbers over whenever the values of a specified column change, called partitioning, with the PARTITION BY clause.Example Query 1: Basic use of ROW_NUMBER()SELECT CustomerID, FirstName + ' ' + LastName AS Name,ROW_NUMBER() OVER (ORDER BY LastName, FirstName) AS RowFROM Sales.Customer AS cINNER JOIN Person.Person AS p ON c.PersonID = p.BusinessEntityID;Example Query 2: Using ROW_NUMBER() in a CTEWITH customers AS(SELECT CustomerID, FirstName + ' ' + LastName AS Name,ROW_NUMBER() OVER (ORDER BY LastName, FirstName) AS RowFROM Sales.Customer AS cINNER JOIN Person.Person AS p ON c.PersonID = p.BusinessEntityID)SELECT CustomerID, Name, RowFROM customersWHERE Row > 50ORDER BY Row;Example Query 3: Using PARTITION BY option in ROW_NUMBER()SELECT CustomerID, FirstName + ' ' + LastName AS Name, c.TerritoryID,ROW_NUMBER() OVER (PARTITION BY c.TerritoryID ORDER BY LastName, FirstName) AS RowFROM Sales.Customer AS cINNER JOIN Person.Person AS p ON c.PersonID = p.BusinessEntityID;
- RANK – RANK assigns the same number to the duplicate rows and skips numbers not used. It assigns an ascending, nonunique ranking number to a set of rows, giving the same number to duplicate rows; numbers are skipped for the number of rows that have the same value.If rows 2 and 3 are duplicates, RANK will supply the values 1, 3, 3, and 4.Example Query 1: Using RANKSELECT ROW_NUMBER() OVER (PARTITION BY SUBSTRING(LastName,1,2) ORDER BY LastName) AS RowNum,RANK() OVER(ORDER BY SUBSTRING(LastName,1,2) ) AS Ranking, CONCAT(FirstName,' ',LastName) AS CustomerName, UnclearedBalanceFROM CustomerDetails.CustomersWHERE UnclearedBalance is not nullORDER BY Ranking
- DENSE_RANK - This is similar to RANK, but each row number returned will be one greater than the previous setting, no matter how many rows are the same. DENSE_RANK doesn’t skip numbers.If rows 2 and 3 are duplicates, DENSE_RANK will supply the values 1, 2, 2, and 3.Example Query:SELECT ROW_NUMBER() OVER (PARTITION BY SUBSTRING(LastName,1,2) ORDER BY LastName) AS RowNum,DENSE_RANK() OVER(ORDER BY SUBSTRING(LastName,1,2) ) AS Ranking, CONCAT(FirstName,'',LastName) AS CustomerName, UnclearedBalanceFROM CustomerDetails.CustomersWHERE UnclearedBalance is not nullORDER BY RankingNote: Both RANK and DENSE_RANK are similar to the ROW_NUMBER function, but they produce the same ranking value in all rows that have the same logical ordering value.The difference between RANK and DENSE_RANK is that RANK indicates how many rows have a lower ordering value, whereas DENSE_RANK indicates how many distinct ordering values are lower.For example, a rank of 9 indicates eight rows with lower values. A dense rank of 9 indicates eight distinct lower values.
- NTILE – The NTILE function assigns buckets to groups of rows. It allows us to associate the rows in the result with tiles (equally sized groups of rows) by assigning a tile number to each row. This takes the rows from the query and places them into an equal (or as close to equal as possible) number of specified numbered groups, where NTILE returns the group number the row belongs to. The value in parentheses after NTILE defines the number of groups to produce, so NTILE(25) would produce 25 groups of as close a split as possible of even numbers.Example Query:SELECT NTILE(10) OVER (ORDER BY LastName) AS BatchNumber, CONCAT(FirstName,' ',LastName) AS CustomerName, UnclearedBalanceFROM CustomerDetails.CustomersWHERE UnclearedBalance is not nullORDER BY BatchNumber
The
following SQL query uses all the ranking functions in a single query:
SELECT
orderid, custid, val,
ROW_NUMBER()
OVER(ORDER BY val) AS rownum,
RANK()
OVER(ORDER BY val) AS rank,
DENSE_RANK()
OVER(ORDER BY val) AS dense_rank,
NTILE(100)
OVER(ORDER BY val) AS ntile
FROM
Sales.OrderValues
ORDER
BY val;
No comments:
Post a Comment