ThinkingCog

Articles written by Parakh Singhal

Pivoting in SQL Server

Introduction

The meaning of the word “Pivot” in the English language is “The central point, pin or shaft on which a mechanism turns or swivels.”. And that’s what the Pivot function in SQL Server does for you. It is used to convert rows to columns and in the process aggregate data.

Pivot function is used in OLAP queries to power reporting needs. Generally, you will find Pivot queries aggregating data in temporal reports with time-spanning in years or spread over months, or in reports that span a spectrum of an entity like sales regions or sales employees.

Some important notes about Pivot:

  1. Three elements form the foundation of a pivot query:
    1. The aggregating element, accepted by the pivot operator in conjunction with an aggregation function to aggregate the data. This will appear at the intersection of the rows and columns in the pivoted result-set
    2. The grouping element, which appears on rows in the pivoted result-set
    3. The spreading element, which appears as columns in the pivoted result-set
  2. Pivot requires an aggregator function along with a spreading element which would be used to spread data in the form of columns
  3. You have to know every value that would be used as a spreading element and appear as a column header
  4. Pivot uses data from a CTE (Common Table Expression) or a sub-query for processing

Let’s dive into a working example of the Pivot function.

In our example, we will create a simple table storing customer orders. Specifically, it will store customer ids, the value of the transactions they made, and the dates on which the transactions were done.

 

Create Database Temp;
Go
 
Use Temp;
Go
 
Create Table dbo.CustomerOrders
(
    Id int IDENTITY(1,1) NOT NULL,
    CustomerId int NULL,
    OrderId int NULL,
    OrderValue money NULL,
    DateOfOrder date NULL
);
Go
 
Alter Table dbo.CustomerOrders
Add Constraint PK_CustomerOrders_Id Primary Key (Id);
Go
 
Insert Into CustomerOrders 
(CustomerId, OrderId, OrderValue, DateOfOrder) 
Values
(1,210,300,'2021-01-01'),
(1,205,30,'2021-01-05'),
(1,220,200,'2021-02-01'),
(1,230,900,'2021-03-01'),
(1,240,1300,'2021-04-01'),
(1,245,6700,'2021-04-10'),
(1,250,3000,'2021-05-01'),
(1,260,100,'2021-06-01'),
(1,270,700,'2021-07-01'),
(1,275,1700,'2021-07-21'),
(1,280,1200,'2021-08-01'),
(1,290,350,'2021-09-01'),
(1,290,500,'2021-10-01'),
(1,290,600,'2021-11-01'),
(1,290,850,'2021-12-01'),
(1,295,85,'2021-12-25'),
(2,300,850,'2021-01-03'),
(2,310,85,'2021-02-25'),
(2,320,90,'2021-03-15'),
(3,330,850,'2021-02-25'),
(3,335,300,'2021-02-15'),
(3,340,80,'2021-03-29'),
(3,350,450,'2021-04-13'),
(3,360,850,'2021-05-25'),
(3,370,8500,'2021-06-23'),
(3,380,67,'2021-08-16'),
(3,390,550,'2021-12-25'),
(3,395,578,'2021-12-20'),
(4,400,850,'2021-02-25'),
(4,410,85,'2021-03-29'),
(4,420,50,'2021-05-10'),
(4,430,400,'2021-07-30'),
(4,440,900,'2021-08-19'),
(4,440,450,'2021-09-23'),
(4,450,185,'2021-10-30'),
(4,455,250,'2021-11-12'),
(4,460,25,'2021-11-29'),
(4,470,670,'2021-12-29');
Go

 

Pivoting

With the table created and data keyed in, we will now run a pivot query which will help us aggregate the data in a manner that will enable us to see the month-wise spending by each customer in a given year. If you run the CTE code in isolation then you will observe that the query gives you the result with the customer id, order value, month, and year in which the transaction was done in a columnar fashion. With the help of the pivot function, we will spread the data across months and in the process of doing so, aggregate the transactional value. So, customer id and year of transaction form our grouping elements, the month of transaction form our spreading element, and the order-value form or aggregating element.

With RawData As
(
    Select    
    CustomerId,
    OrderValue,
    Year(DateofOrder) as OrderYear,
    Month(DateOfOrder) as OrderMonth
    From CustomerOrders
)
 
Select * From RawData
Pivot(Sum(OrderValue) For OrderMonth 
in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])) 
as PivotData
Order By CustomerId Asc, OrderYear Asc;
Go

 

The result of the query when run would be:

Fig. 01 Pivoting Result

Fig. 01 Pivoting result

Now understand a couple of points here:

  1. We can replace the CTE with a subquery. Though, I would recommend a CTE for better clarity in your entire SQL structure.
  2. We can use all the columns returned from the CTE in our select clause of the pivot query, except for the column that needs to be aggregated. That means the "Select" clause can use the grouping and spreading elements, barring the aggregating element.
  3. We have to know the values of the spreading element. The spreading element itself is used in the “For” clause of the pivot query, and its distinct values are used in the “In” clause.
  4. We can use the pivoted columns in the “Select” clause of our pivot query, but only if we know their values and use them as column headers. Generally, to prevent the hassle, developers use the wild-card character to select all the values automatically.

The same query mentioned above can also be run as

With RawData As
(
    Select    
    CustomerId,
    OrderValue,
    Year(DateofOrder) as OrderYear,
    Month(DateOfOrder) as OrderMonth
    From CustomerOrders
)
 
Select 
CustomerId,
OrderYear,
[1] as January,
[2] as February,
[3] as March,
[4] as April,
[5] as May,
[6] as June,
[7] as July,
[8] as August,
[9] as September,
[10] as October,
[11] as November,
[12] as December
From RawData
Pivot(Sum(OrderValue) For OrderMonth 
in ([1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[11],[12])) as PivotData
Order By CustomerId, OrderYear;
Go

 

Now that we have mentioned the spreading elements with our choice of column headers, the output of the query would look like:

Fig. 02 Pivoting Result with Specific Spreading Elements

Fig. 02 Pivoting Result with Specific Spreading Elements

Here’s the general format from Microsoft that you can use to run a pivot query on your data.

SELECT <non-pivoted column>,  
    [first pivoted column] AS <column name>,  
    [second pivoted column] AS <column name>,  
    ...  
    [last pivoted column] AS <column name>  
FROM  
    (<SELECT query that produces the data>)   
    AS <alias for the source query>  
PIVOT  
(  
    <aggregation function>(<column being aggregated>)  
FOR   
[<column that contains the values that will become column headers>]   
    IN ( [first pivoted column], [second pivoted column],  
    ... [last pivoted column])  
) AS <alias for the pivot table>  
<optional ORDER BY clause>;  

 

Doing Pivoting with Dynamic SQL

The biggest challenge that you will come across when using the pivot function in SQL Server will be that many times you may come across new entrants in the spreading elements that will force you to upgrade the SQL query. An example of that is, say, a new geographical region came up that needs to be accounted for in the spreading element, or the spreading element is required to include the last 5 years and it will require an update every year as time goes by. To deal with this issue, we have to take advantage of dynamic SQL.

We will leverage a function available in SQL Server to aid in such scenarios. Note that the values of the spreading element are required to be delimited in square brackets, so if there’s any white space between the characters constituting the values of the spreading elements, it can be taken into account.

The way such a query will work would be:

1. Extract all the distinct values to be included in the spreading element and concatenate them into a SQL parameter.

2. While concatenating, use the Quotename function to encapsulate all the distinct spreading element values into square brackets used as delimiters, separated by commas.

3. Eliminate the last trailing comma from the SQL parameter with all the values for the spreading element.

4. Capture the SQL to be run into a parameter with a spreading element captured appropriately.

Example:

-- Declare the SQL parameters to capture
-- values for the spreading element and sql query
Declare 
@SpreadingElement nvarchar(max) = '',
@SQL nvarchar(max) = '';
 
-- Capture distinct spreading element values
With MonthData As
(Select distinct Month(DateofOrder) As Months From CustomerOrders)
 
Select @SpreadingElement += QUOTENAME(Months,'[]') + ',' From MonthData
 
-- Eliminating the trailing comma from the spreading element values
Set @SpreadingElement = LEFT(@SpreadingElement, Len(@SpreadingElement) - 1);
 
-- Dynamic SQL
Set @SQL = 
'With RawData As
(
    Select    
    CustomerId,
    OrderValue,
    Year(DateofOrder) as OrderYear,
    Month(DateOfOrder) as OrderMonth
    From CustomerOrders
)
 
Select 
* From RawData
Pivot(Sum(OrderValue) For OrderMonth 
in (' + @SpreadingElement + ')) as PivotData
Order By CustomerId Asc, OrderYear Asc;'
 
Execute sp_executesql @SQL;

 

The query gives the same result as given by the previous queries.

Fig. 03 Pivoting Result Dynamic SQLFig. 03 Pivoting Result Dynamic SQL

Hope this was useful.

References:

1. T-SQL Window Functions: For data analysis and beyond (Developer Reference) by Itzik Ben-Gan

2. Common Table Expressions in SQL Server

3. Window Functions in SQL Server

4. Pivot, Multi Pivot and Dynamic Pivot in SQL Server – SQLChitChat.com

blog comments powered by Disqus