The task is to normalize the following denormalized data, which represents product sales volumes by quarter:
product_nbr qtr1 sales_qtr1 qtr2 sales_qtr2 qtr3 sales_qtr3 qtr4 sales_qtr4
----------- ------ ----------- ------ ----------- ------ ----------- ------ -----------
1 2008Q1 100 2008Q2 20 2008Q3 15 2008Q4 10
2 2008Q1 80 2008Q2 15 2008Q3 20 2008Q4 10
3 2008Q1 70 2008Q2 5 2008Q3 10 2008Q4 15
Normalized data set should look like this:
product_nbr qtr sales
----------- ------ -----------
1 2008Q1 100
1 2008Q2 20
1 2008Q3 15
1 2008Q4 10
2 2008Q1 80
2 2008Q2 15
2 2008Q3 20
2 2008Q4 10
3 2008Q1 70
3 2008Q2 5
3 2008Q3 10
3 2008Q4 15
The first method uses CROSS JOIN with table with numbers (needs one number for each quarter) and CASE expressions to select the appropriate value (quarter or sales volume) for each quarter.
SELECT product_nbr,
CASE n
WHEN 1 THEN qtr1
WHEN 2 THEN qtr2
WHEN 3 THEN qtr3
WHEN 4 THEN qtr4
END AS qtr,
CASE n
WHEN 1 THEN sales_qtr1
WHEN 2 THEN sales_qtr2
WHEN 3 THEN sales_qtr3
WHEN 4 THEN sales_qtr4
END AS sales
FROM QuarterlySales AS S
CROSS JOIN
(SELECT 1 UNION
SELECT 2 UNION
SELECT 3 UNION
SELECT 4) AS Nums(n);
Alsmost seems natural that we should be able to do the same much easier using the UNPIVOT operator (introduced in SQL Server 2005). However, one of the limitations of the UNPIVOT operator is that it works only with a single column. But because SQL Server allows multiple table operators in the FROM clause, we can use two UNPIVOT operators. The catch is that the second UNPIVOT operator applies to the virtual table results from the first unpivot operator. That requires using a little trick to extract and match the quarter from the results of each UNPIVOT operator in order to produce the final result. Here is the query to unpivot using the UNPIVOT operator.
SELECT product_nbr, qtr, sales
FROM
(SELECT product_nbr,
qtr1, sales_qtr1,
qtr2, sales_qtr2,
qtr3, sales_qtr3,
qtr4, sales_qtr4
FROM QuarterlySales) AS S
UNPIVOT
(qtr FOR qtrx IN (qtr1, qtr2, qtr3, qtr4)) AS U1
UNPIVOT
(sales FOR sales_qtrx IN (sales_qtr1, sales_qtr2,
sales_qtr3, sales_qtr4)) AS U2
WHERE RIGHT(sales_qtrx, 1) = RIGHT(qtrx, 1);

5 comments:
Thanks for the great example. I used it to unpivot 4 columns and it worked like a charm. Just added a few additional conditions in the WHERE statement:
SELECT userID, sch_code, degree, gpa, major
FROM
(SELECT userID, school_1, degree_1, gpa_1, major1,
school_2, degree_2, gpa_2, major2,
school_3, degree_3, gpa_3, major3,
school_4, degree_4, gpa_4, major4,
school_5, degree_5, gpa_5, major5,
school_6, degree_6, gpa_6, major6
FROM employee_school) p
UNPIVOT
(sch_code FOR sch_codex IN (school_1, school_2, school_3, school_4, school_5, school_6)) AS U1
UNPIVOT
(degree FOR degreex IN (degree_1, degree_2, degree_3, degree_4, degree_5, degree_6)) AS U2
UNPIVOT
(gpa FOR gpax IN (gpa_1, gpa_2, gpa_3, gpa_4, gpa_5, gpa_6)) AS U3
UNPIVOT
(major FOR majorx IN (major1, major2, major3, major4, major5,major6)) AS U4
WHERE RIGHT(sch_codex, 1) = RIGHT(degreex, 1)
AND RIGHT(sch_codex, 1) = RIGHT(gpax, 1)
AND RIGHT(sch_codex, 1) = RIGHT(majorx, 1);
GO
Hi Plamen,
Great post. Thanks a ton for the example.
Quick question...
How can I put a WHERE clause into your first non-UNPIVOT example? In my situation, I only have one value I'm trying to unpivot, (e.g. a row looks like: id, terp_id, terp2_id, terp3_id). I can't seem to get a where clause to work.
It works on you second UNPIVOT example, but I can't use this in a View, since SQL Studio tells me the UNPIVOT is unsupported in Views.
Thanks for any help!
Nick
Hi Nick,
You place the WHERE clause after the CROSS JOIN:
SELECT ...
FROM QuarterlySales AS S
CROSS JOIN (...) AS Nums(n)
WHERE ...
SELECT f.id, CASE n WHEN 1 THEN terp_id WHEN 2 THEN terp2_id WHEN 3 THEN terp3_id END AS terp
FROM dbo.requests AS f CROSS JOIN
(SELECT 1 AS Expr1
UNION ALL
SELECT 2 AS Expr1
UNION ALL
SELECT 3 AS Expr1) AS Nums(n)
WHERE f.terp IS NOT NULL
ORDER BY f.id
This gives me the error:
Invalid column name 'terp'.
I've tried many variants with no success...
You have to use a derived table:
SELECT id, terp
FROM (
SELECT f.id,
CASE n
WHEN 1 THEN terp_id
WHEN 2 THEN terp2_id
WHEN 3 THEN terp3_id
END AS terp
FROM dbo.requests AS f CROSS JOIN
(
SELECT 1 AS Expr1
UNION ALL
SELECT 2 AS Expr1
UNION ALL
SELECT 3 AS Expr1) AS Nums(n)) AS T
WHERE terp IS NOT NULL
ORDER BY id;
Post a Comment