Sunday, June 10, 2007

Shredding XML in SQL Server 2005

Using XML data has many applications. In databases in particular it can be used for passing parameters from client applications, exchange data between SQL modules, or storing details in XML columns. SQL Server 2005 offers new capabilities and native support for XML. Below is one example of utilizing some of the new XML methods in XML to shred XML data to table format.

-- Declare XML variable

DECLARE @data XML;

 

-- Element-centered XML

SET @data =

N'<data>

    <customer>

      <id>1</id>

      <name>Allied Industries</name>

    </customer>

    <customer>

      <id>2</id>

      <name>Trades International</name>

    </customer>

  </data>';

 

-- Using the query() method

SELECT T.customer.query('id').value('.', 'INT') AS customer_id,

       T.customer.query('name').value('.', 'VARCHAR(20)') AS customer_name

FROM @data.nodes('data/customer') AS T(customer);

 

-- Using the value() method

SELECT T.customer.value('(id)[1]', 'INT') AS customer_id,

       T.customer.value('(name)[1]', 'VARCHAR(20)') AS customer_name

FROM @data.nodes('data/customer') AS T(customer);

 

-- Select only customer which id equals 2 using the exist() method

SELECT T.customer.value('(id)[1]', 'INT') AS customer_id,

       T.customer.value('(name)[1]', 'VARCHAR(20)') AS customer_name

FROM @data.nodes('data/customer') AS T(customer)

WHERE T.customer.exist('id/text()[. = "2"]') = 1;

 

-- Attribute-centered XML

SET @data =

N'<data>

    <customer id="1" name="Allied Industries"/>

    <customer id="2" name="Trades International"/>

  </data>';

 

-- Using the value() method

SELECT T.customer.value('@id', 'INT') AS customer_id,

       T.customer.value('@name', 'VARCHAR(20)') AS customer_name

FROM @data.nodes('data/customer') AS T(customer);

 

 

-- Results

customer_id customer_name

----------- --------------------

1           Allied Industries

2           Trades International



Testing the processing for element-centered XML versus attribute-centered XML does show that attribute-centered XML is processed faster.

Resources:

XML Support in Microsoft SQL Server 2005
http://msdn2.microsoft.com/en-us/library/ms345117.aspx

7 comments:

shh said...

This (attribute based) select statement works also.
Any signficant difference?
I stumbled upon it, converting a element based statement to attribute based.


SELECT T.customer.value('@id[1]', 'INT') AS customer_id, T.customer.value('@name[1]', 'VARCHAR(20)') AS customer_name FROM @data.nodes('data/customer') AS T(customer);


Don't forget about this Sql Server 2005 (SP2) issue.

It seems to still be an issue in 2008, despite "Tomer"'s remarks to the contrary.

shh said...

This (attribute based) select statement works also.
Any signficant difference?
I stumbled upon it, converting a element based statement to attribute based.


SELECT T.customer.value('@id[1]', 'INT') AS customer_id, T.customer.value('@name[1]', 'VARCHAR(20)') AS customer_name FROM @data.nodes('data/customer') AS T(customer);


Don't forget about this Sql Server 2005 (SP2) issue.
http://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=250407

It seems to still be an issue in 2008, despite "Tomer"'s remarks to the contrary.

Plamen Ratchev said...

Hi shh,

Although your query produces the same results, it is not technically equivalent path expression to the one I posted. The query that you posted uses the position predicate [1] to pull the first occurrence. However, in attribute-centered XML you cannot have duplicate attributes, so essentially the results will be the same.

Yes, I am aware of the feedback item posted by Erland Sommarskog and voted long time ago. Hope we see that resolved.

Scott E. Detweiler said...
This comment has been removed by the author.
Anonymous said...

Is this the best approach

CREATE TABLE [dbo].[UserAuditLog](
[UserAuditLogGUID] [varchar](128) NOT NULL,default newid()
[UserPersonGUID] [varchar](128) NOT NULL,
[UserID] [varchar](25) NOT NULL,
[UserName] [varchar](128) NOT NULL,
[RoleID] [varchar](200) NULL,
[RoleName] [varchar](4000) NULL,
[ActionTaken] [varchar](1000) NOT NULL,
[CreatedBy] [varchar](128) NOT NULL,
[CreatedByPersonGUID] [varchar](128) NOT NULL,
[DateRecorded] [datetime] NOT NULL, getdate()
) ON [PRIMARY]





CREATE PROCEDURE [dbo].[InsertUserAuditLog]
(
@UserAuditLog xml
)
AS
BEGIN
SET NOCOUNT ON
DECLARE @ErrMsg varchar(4000)
,@ErrSeverity int = 16
,@v_UserID varchar(25) = NULL
,@v_UserPersonGUID varchar(128) = NULL
,@v_UserName varchar(128) = NULL
,@v_CreatedByPersonGUID varchar(128) = NULL
,@v_CreatedBy varchar(128) = NULL
,@v_RoleID varchar(200) = NULL
,@v_RoleName varchar(2000)= NULL
,@v_ActionTaken varchar(1000)= NULL

BEGIN TRY
-- Retrieving XMLvalues into Local Variables
SELECT
@v_UserID = LogInfo.Item.query('./UserID') .value('.','varchar(25)') ,
@v_UserPersonGUID = LogInfo.Item.query('./UserPersonGUID') .value('.','varchar(128)') ,
@v_UserName = LogInfo.Item.query('./UserName') .value('.','varchar(128)'),
@v_CreatedByPersonGUID= LogInfo.Item.query('./CreatedByPersonGUID').value('.','varchar(128)'),
@v_CreatedBy = LogInfo.Item.query('./CreatedBy') .value('.','varchar(128)'),
@v_RoleID = LogInfo.Item.query('./RoleID') .value('.','varchar(200)'),
@v_RoleName = LogInfo.Item.query('./RoleName') .value('.','varchar(2000)'),
@v_ActionTaken = LogInfo.Item.query('./ActionTaken') .value('.','varchar(1000)')
FROM @UserAuditLog.nodes('/UserAuditLog') AS LogInfo(Item)

--Validating input parameters.

INSERT INTO CHR.dbo.UserAuditLog(
UserAuditLogGUID
,UserPersonGUID
,UserID
,UserName
,RoleID
,RoleName
,ActionTaken
,CreatedBy
,CreatedByPersonGUID
,DateRecorded
)
values
(@v_UserPersonGUID
,@v_UserID
,@v_UserName
,@v_RoleID
,@v_RoleName
,@v_ActionTaken
,@v_CreatedBy
,@v_CreatedByPersonGUID
)
END TRY
BEGIN CATCH
-- Raise an error with the details of the exception
SELECT @ErrMsg = ERROR_MESSAGE()
RAISERROR(@ErrMsg, @ErrSeverity, 1)
END CATCH
END

Plamen Ratchev said...

Instead of storing the XML data in variables you can directly insert it, like this:

INSERT INTO CHR.dbo.UserAuditLog(
UserPersonGUID
,UserID
,UserName
,RoleID
,RoleName
,ActionTaken
,CreatedBy
,CreatedByPersonGUID)
SELECT
LogInfo.Item.query('./UserPersonGUID') .value('.','varchar(128)') ,
LogInfo.Item.query('./UserID') .value('.','varchar(25)') ,
LogInfo.Item.query('./UserName') .value('.','varchar(128)'),
LogInfo.Item.query('./RoleID') .value('.','varchar(200)'),
LogInfo.Item.query('./RoleName') .value('.','varchar(2000)'),
LogInfo.Item.query('./ActionTaken') .value('.','varchar(1000)'),
LogInfo.Item.query('./CreatedBy') .value('.','varchar(128)'),
LogInfo.Item.query('./CreatedByPersonGUID').value('.','varchar(128)')
FROM @UserAuditLog.nodes('/UserAuditLog') AS LogInfo(Item);

Kevin Forte said...

This is such great information. Even over 6 years old this article has just made my programming life so much easier! Thank you Thank you