Saturday, February 28, 2009

Unpivoting Multiple Columns

Unpivoting is the process of normalizing data. In earlier post I discussed unpivoting a single column. This demonstration is to show how to unpivot multiple columns.

The task is to normalize the following denormalized data, which represents product sales volumes by quarter:

product_nbr qtr1   sales_qtr1  qtr2   sales_qtr2  qtr3   sales_qtr3  qtr4   sales_qtr4

----------- ------ ----------- ------ ----------- ------ ----------- ------ -----------

1           2008Q1 100         2008Q2 20          2008Q3 15          2008Q4 10

2           2008Q1 80          2008Q2 15          2008Q3 20          2008Q4 10

3           2008Q1 70          2008Q2 5           2008Q3 10          2008Q4 15

Normalized data set should look like this:

product_nbr qtr    sales

----------- ------ -----------

1           2008Q1 100

1           2008Q2 20

1           2008Q3 15

1           2008Q4 10

2           2008Q1 80

2           2008Q2 15

2           2008Q3 20

2           2008Q4 10

3           2008Q1 70

3           2008Q2 5

3           2008Q3 10

3           2008Q4 15

The first method uses CROSS JOIN with table with numbers (needs one number for each quarter) and CASE expressions to select the appropriate value (quarter or sales volume) for each quarter.

SELECT product_nbr

       
CASE n

            
WHEN THEN qtr1

            
WHEN THEN qtr2

            
WHEN THEN qtr3

            
WHEN THEN qtr4

       
END AS qtr,

       
CASE n

            
WHEN THEN sales_qtr1

            
WHEN THEN sales_qtr2

            
WHEN THEN sales_qtr3

            
WHEN THEN sales_qtr4

       
END AS sales

FROM QuarterlySales AS S

CROSS JOIN 

(SELECT UNION 

 SELECT 
UNION

 SELECT 
UNION

 SELECT 
4AS Nums(n);

Alsmost seems natural that we should be able to do the same much easier using the UNPIVOT operator (introduced in SQL Server 2005). However, one of the limitations of the UNPIVOT operator is that it works only with a single column. But because SQL Server allows multiple table operators in the FROM clause, we can use two UNPIVOT operators. The catch is that the second UNPIVOT operator applies to the virtual table results from the first unpivot operator. That requires using a little trick to extract and match the quarter from the results of each UNPIVOT operator in order to produce the final result. Here is the query to unpivot using the UNPIVOT operator.

SELECT product_nbrqtrsales

FROM 

 
(SELECT product_nbr

         
qtr1sales_qtr1

         
qtr2sales_qtr2

         
qtr3sales_qtr3,

         
qtr4sales_qtr4

  
FROM QuarterlySalesAS S

UNPIVOT

 (qtr FOR qtrx IN (qtr1qtr2qtr3qtr4)) AS U1

UNPIVOT

 (sales FOR sales_qtrx IN (sales_qtr1sales_qtr2

                           
sales_qtr3sales_qtr4)) AS U2

WHERE RIGHT(sales_qtrx1RIGHT(qtrx1);

15 comments:

steve said...

Thanks for the great example. I used it to unpivot 4 columns and it worked like a charm. Just added a few additional conditions in the WHERE statement:

SELECT userID, sch_code, degree, gpa, major
FROM
(SELECT userID, school_1, degree_1, gpa_1, major1,
school_2, degree_2, gpa_2, major2,
school_3, degree_3, gpa_3, major3,
school_4, degree_4, gpa_4, major4,
school_5, degree_5, gpa_5, major5,
school_6, degree_6, gpa_6, major6

FROM employee_school) p
UNPIVOT
(sch_code FOR sch_codex IN (school_1, school_2, school_3, school_4, school_5, school_6)) AS U1

UNPIVOT
(degree FOR degreex IN (degree_1, degree_2, degree_3, degree_4, degree_5, degree_6)) AS U2

UNPIVOT
(gpa FOR gpax IN (gpa_1, gpa_2, gpa_3, gpa_4, gpa_5, gpa_6)) AS U3

UNPIVOT
(major FOR majorx IN (major1, major2, major3, major4, major5,major6)) AS U4

WHERE RIGHT(sch_codex, 1) = RIGHT(degreex, 1)

AND RIGHT(sch_codex, 1) = RIGHT(gpax, 1)

AND RIGHT(sch_codex, 1) = RIGHT(majorx, 1);
GO

Anonymous said...

Hi Plamen,

Great post. Thanks a ton for the example.

Quick question...

How can I put a WHERE clause into your first non-UNPIVOT example? In my situation, I only have one value I'm trying to unpivot, (e.g. a row looks like: id, terp_id, terp2_id, terp3_id). I can't seem to get a where clause to work.

It works on you second UNPIVOT example, but I can't use this in a View, since SQL Studio tells me the UNPIVOT is unsupported in Views.

Thanks for any help!

Nick

Plamen Ratchev said...

Hi Nick,

You place the WHERE clause after the CROSS JOIN:

SELECT ...
FROM QuarterlySales AS S
CROSS JOIN (...) AS Nums(n)
WHERE ...

Anonymous said...

SELECT f.id, CASE n WHEN 1 THEN terp_id WHEN 2 THEN terp2_id WHEN 3 THEN terp3_id END AS terp
FROM dbo.requests AS f CROSS JOIN
(SELECT 1 AS Expr1
UNION ALL
SELECT 2 AS Expr1
UNION ALL
SELECT 3 AS Expr1) AS Nums(n)
WHERE f.terp IS NOT NULL
ORDER BY f.id

This gives me the error:

Invalid column name 'terp'.

I've tried many variants with no success...

Plamen Ratchev said...

You have to use a derived table:

SELECT id, terp
FROM (
SELECT f.id,
CASE n
WHEN 1 THEN terp_id
WHEN 2 THEN terp2_id
WHEN 3 THEN terp3_id
END AS terp
FROM dbo.requests AS f CROSS JOIN
(
SELECT 1 AS Expr1
UNION ALL
SELECT 2 AS Expr1
UNION ALL
SELECT 3 AS Expr1) AS Nums(n)) AS T
WHERE terp IS NOT NULL
ORDER BY id;

Mohammad said...

Also a simple solution with grouping sets:

SELECT product_nbr,
COALESCE(sales_qtr1, sales_qtr1 ,sales_qtr1, sales_qtr1) AS qtr,
COALESCE(qtr1, qtr2, qtr3, qtr4) AS sales
FROM QuarterlySales
GROUP BY GROUPING SETS
(
(product_nbr, sales_qtr1, qtr1),
(product_nbr, sales_qtr2, qtr2),
(product_nbr, sales_qtr3, qtr3),
(product_nbr, sales_qtr4, qtr4)
);

Mohammad said...

Avoid UNPIVOT!
Seems in this case this solution is more efficient than the multiple UNPIVOT.

SELECT product_nbr, qtr, sales
FROM QuarterlySales
CROSS APPLY
(SELECT qtr1, sales_qtr1
UNION ALL
SELECT qtr2, sales_qtr2
UNION ALL
SELECT qtr3, sales_qtr3
UNION ALL
SELECT qtr4, sales_qtr4
) D(qtr, sales);

You can execute following statements for testing.
CREATE TABLE QuarterlySales
(product_nbr INTEGER NOT NULL PRIMARY KEY,
qtr1 CHAR(6),
sales_qtr1 INT,
qtr2 CHAR(6),
sales_qtr2 INT,
qtr3 CHAR(6),
sales_qtr3 INT,
qtr4 CHAR(6),
sales_qtr4 INT);

INSERT INTO dbo.QuarterlySales (product_nbr, qtr1, sales_qtr1, qtr2, sales_qtr2, qtr3, sales_qtr3, qtr4, sales_qtr4)
VALUES
(1, '2008Q1', 100, '2008Q2', 20, '2008Q3', 15, '2008Q4', 10),
(2, '2008Q1', 80, '2008Q2', 15, '2008Q3', 20, '2008Q4', 10),
(3, '2008Q1', 70, '2008Q2', 5, '2008Q3', 10, '2008Q4', 15);

Mohammad said...

No need multiple UNPIVOT. Here is my solution, mixture of traditional and native.

SELECT product_nbr,
qtr,
CASE qtrx WHEN 'qtr1'
THEN sales_qtr1
WHEN 'qtr2'
THEN sales_qtr2
WHEN 'qtr3'
THEN sales_qtr3
WHEN 'qtr4'
THEN sales_qtr4
END AS sales
FROM QuarterlySales
UNPIVOT
(qtr FOR qtrx IN (qtr1, qtr2, qtr3, qtr4)) AS U1

Mohammad said...

Here is simplified version of grouping set with using grouping sets algebra:

SELECT product_nbr,
COALESCE(sales_qtr1, sales_qtr2 ,sales_qtr3, sales_qtr4) AS qtr,
COALESCE(qtr1, qtr2, qtr3, qtr4) AS sales
FROM QuarterlySales
GROUP BY GROUPING SETS
(
(product_nbr)
),
GROUPING SETS
(
(sales_qtr1, qtr1),
(sales_qtr2, qtr2),
(sales_qtr3, qtr3),
(sales_qtr4, qtr4)
);

Mohammad said...

Also another native approach no traditional, just one UNPIVOT operator. (This solution will work when data type of column in unpivot list be same)
I want to know what your opinion about this is. Is this practical or not?

SELECT product_nbr,
MAX(CASE WHEN LEFT(attribute, 1) = 's' THEN value END) AS sales,
MAX(CASE WHEN LEFT(attribute, 1) = 'q' THEN value END) AS qtr
FROM QuarterlySales
UNPIVOT
(value FOR attribute IN (sales_qtr1, sales_qtr2, sales_qtr3, sales_qtr4, qtr1, qtr2, qtr3, qtr4)) AS U1
GROUP BY product_nbr, RIGHT(attribute, 1)
ORDER BY product_nbr, RIGHT(attribute, 1);

Plamen Ratchev said...

Hi Mohammad,

I have used this approach on a limited scale. As you noted the data type compatibility between columns can be an issue. Other than that it can be handy and efficient solution.

Mohammad said...

Here id simplified hybrid version.

SELECT product_nbr,
qtr,
PARSENAME(sales_qtr4 + '.' + sales_qtr3 + '.' + sales_qtr2 + '.' + sales_qtr1, RIGHT(qtrx, 1) * 1) AS sales
FROM QuarterlySales
UNPIVOT
(qtr FOR qtrx IN (qtr1, qtr2, qtr3, qtr4)) AS U1

Mohammad Salimabadi said...

Hi Plamen,
The multiple UNPIVOT is really nice and wonderful but it is so complex and difficult for understanding its logic for me. So I was looking for a method with just single UNPIVOT, at last I found it. Concatenating and splitting is my new approach.
select product_nbr,
parsename(value, 2) as qtr,
parsename(value, 1) * 1 as sales
from (select
product_nbr,
grp_1 = qtr1 +'.'+ cast(sales_qtr1 as varchar(5)),
grp_2 = qtr2 +'.'+ cast(sales_qtr2 as varchar(5)),
grp_3 = qtr3 +'.'+ cast(sales_qtr3 as varchar(5)),
grp_4 = qtr4 +'.'+ cast(sales_qtr4 as varchar(5))
from QuarterlySales) d
unpivot
(value for attibute in ([grp_1], [grp_2], [grp_3], [grp_4])
) u

solomish said...

First, I'd like to say thank you for your post. I used a portion of your cross join code, specifically the "select from union select from union..." and received the results I needed however, would like to "insert into" these results into a separate table. I am using this in an Access 2007 database and was wondering if you have experience with inserting the results of the "select from union" query into a separate table.

Alexander said...

Thank you for the post. Could you give an advice. If I have the same problem, but I have not 4 pairs as in the example, but 25 pairs to unpivot.
Moreover, I have to do group by first of the resulting column of the dynamic table and then order by sum(second resulting column)

Also initial table contains more then 1 000 000 of rows. Is it good scenario for this task ? SQL has to manage group by and order by on the huge dynamic table without indexes.

May be it is exist another approach to get same result ?