May 26, 2016

MS SQL Server - Window Functions - Ranking Functions

A window function is one that can be applied to a set of rows, The word ‘window’ is used to refer to the set of rows that the function works on. It provides a way to work with data grouped into logical windows.

The OVER clause used to define a window or set of rows within a query result set. Then we use a window function to computes a value for each row in the subset. OVER clause can be used to compute aggregated values such as average, sum, min, max, or a top N per group results.

The window function can be of two types:

  • Ranking function
  • Aggregate function.

In this post, I will discuss Ranking Window Functions.

Depending of on whether you’re using a ranking window function or aggregate window function, the OVER clause will need to be implemented accordingly. If we use the OVER clause with a ranking function, we can specify an optional PARTITION BY clause and a required ORDER BY clause. However, if we are working with with an aggregate function, then a PARTITION BY clause must be used, and should not have an ORDER BY clause.

Working with Ranking Window Functions

Ranking functions return an integer ranking value for each row or group in a partition/window.

Transact-SQL provides the following ranking functions:

  • ROW_NUMBER
  • RANK
  • DENSE_RANK
  • NTILE

Let see each function in action with an example.

  • ROW_NUMBER:

    Returns a sequential integer number for each row in the window.

    Example: Create a sample table and insert some records to query.

    declare @tblItems table (
     ItemCode varchar(5),
     ItemCategory varchar(50),
     StockQty int
    )
    
    insert into @tblItems(ItemCode, ItemCategory, StockQty)
    values 
    ('I001', 'Books', 10),
    ('I002', 'Books', 20),
    ('I003', 'Laptops', 25),
    ('I004', 'Laptops', 30),
    ('I005', 'Laptops', 20),
    ('I006', 'SmartPhones', 50),
    ('I007', 'SmartPhones', 30),
    ('I008', 'Books', 10)
    

    The following example calculates a row number for @tblTest based on ItemCode.

    SELECT 
     Itemcode
    ,ROW_NUMBER() OVER(ORDER BY itemcode) AS 'ROW_NUMBER()'
    FROM @tblItems
    

    Here is the output:

  • RANK:

    Define an integral rank value for each row in the window. It returns a same rank value for repeated column values, however, the next number in the ranking sequence is skipped by the number of repeating rows (also known as gap). It does not always return consecutive integers.

    The following example ranks the @tblItems with sort by ItemCategory. Since the ItemCodes I001, I002, I008 have same ItemCategory, these are tied, these are ranked one, and the next rank for item I003 is returned as 4, i.e. skipped the values 2 and 3 from the previous tied items.

    SELECT 
     ItemCode
    ,ItemCategory
    ,RANK() OVER (ORDER BY ItemCategory) AS 'RANK()'
    FROM @tblItems
    

    Here is the output:

  • DENSE_RANK:

    Define an integral rank value for each row in the window, without any gaps in the ranking. It also returns a same rank value for repeated column values. If values in the ranking column are the same, they receive the same rank. The next number in the ranking sequence is the next number in the sequence. The numbers returned by the this function do not have gaps and always have consecutive ranks.

    The following example ranks the @tblItems with sort by ItemCategory. Since the ItemCodes I001, I002, I008 have same ItemCategory, these are tied, these are ranked one, and the next rank for item I003 is returned as 2, i.e. it does not skipped the gap values (2 and 3) from the previous tied items.

    SELECT 
     ItemCode
    ,ItemCategory
    ,DENSE_RANK() OVER (ORDER BY ItemCategory) AS 'DENSE_RANK()'
    FROM @tblItems
    

    Here is the output:

  • NTILE:

    Divides the result set into the number of groups specified as an argument to the function. The groups are numbered, starting at one. For each row, NTILE returns the number of the group to which the row belongs.

    The following example divides rows into 3 groups of items based on ItemCategory. Because the total number of rows is not divisible by the number of groups, the first two groups have 3 rows and the remaining groups have 2 rows each.

    SELECT 
     ItemCode
    ,ItemCategory
    ,NTILE(3) OVER(ORDER BY ItemCategory) AS 'NTILE(3)' 
    FROM @tblItems
    

    Here is the output:

    Let see another example of NTILE with group argument as 4. The following example divides rows into 4 groups of items based on ItemCategory. Now the total number of rows is divisible by the number of groups, so each group has equal number of rows (2 rows in each group).

    SELECT 
     ItemCode
    ,ItemCategory
    ,NTILE(4) OVER(ORDER BY ItemCategory) AS 'NTILE(4)' 
    FROM @tblItems
    

    Here is the output:

References:

No comments:

Post a Comment