Friday, November 14, 2008

Import XML File to SQL Table

Here is a brief example of importing an XML file into SQL Server table. This is accomplished by using the BULK option of OPENROWSET to load the file, and then utilizing the XQuery capabilities of SQL Server to parse the XML to normalized table format. This example requires SQL server 2005 or SQL Server 2008.

First, the following XML is saved to XML file C:\Products.xml.

<Products>

  <Product>

    <SKU>1</SKU>

    <Desc>Book</Desc>

  </Product>

  <Product>

    <SKU>2</SKU>

    <Desc>DVD</Desc>

  </Product>

  <Product>

    <SKU>3</SKU>

    <Desc>Video</Desc>

  </Product>

</Products>


Next, a table named Products is created to store the XML data.

CREATE TABLE Products(

 sku INT PRIMARY KEY,

 product_desc VARCHAR(30));


Finally, the following statement will load the XML file, parse the XML elements to columns, and insert into the Products table:

INSERT INTO Products (sku, product_desc)

SELECT X.product.query('SKU').value('.', 'INT'),

       X.product.query('Desc').value('.', 'VARCHAR(30)')

FROM (

SELECT CAST(x AS XML)

FROM OPENROWSET(

    BULK 'C:\Products.xml',

    SINGLE_BLOB) AS T(x)

    ) AS T(x)

CROSS APPLY x.nodes('Products/Product') AS X(product);


Here are the results:

SELECT sku, product_desc

FROM Products;

 

/*

 

Results:

 

sku         product_desc

----------- -------------

1           Book

2           DVD

3           Video

 

*/

99 comments:

Anonymous said...

Great article. Exactly what I was looking for. Thanks!

Anonymous said...

Been hunting for an example like this all afternoon. Thanks so much!

Anonymous said...

Very helpful, thanks! How would you handle extracting out attributes from the xml?

Plamen Ratchev said...

Here is example of extracting attributes from the XML:
http://pratchev.blogspot.com/2007/06/shredding-xml-in-sql-server-2005.html

Jake Parker said...

That's great, however one thing I really need is to have sql server just read the file and create all the necessary columns. Similar to how excel does it. For example I can open an xml file in excel and it will prompt me to view it as a table and it does all the work. Isn't there a way for sql server 2008 to do the same? I haven't found it so far, any ideas? Thanks.

Plamen Ratchev said...

Jake, you can use SELECT...INTO to create table on the fly. Have in mind this may not be a good solution because data types may not be interpreted correctly.

KIRAN said...

Its great. I got exactly what i am looking for. Thanks

Paul said...

This works, but is excruciatingly slow on my system. It takes 50 secs to extract 500 records with two fields. I've tried on another system with the same result. Why so slow?

Plamen Ratchev said...

Paul,

Did you test if the upload is slow or the shredding of XML is the problem? You can do it on two steps and see what takes most of the time. There was an issue with early versions of SQL Server 2005 where shredding of elements was slow, but that has been corrected in the latest service packs, and in SQL Server 2008/R2.

Anonymous said...

a REALLY big help, this post is.

Anonymous said...

hi
i have a bulk xml data base, i want this import in sql server table.
using procedure, how can i do this pls explain with example for sampl table


thanks
vinothraja

Plamen Ratchev said...

Hello Vinothraja,

I am not very clear what you are trying to do. If you are looking to parse the XML to normalized table format then please take a look at this article: http://pratchev.blogspot.com/2007/06/shredding-xml-in-sql-server-2005.html

Anonymous said...

hi plamen,
i send that procedure, u refer that and send to with correct format pls.

create PROCEDURE [dbo].[DIL_SALES_PIZZA]
@filepath nvarchar(100)
AS
BEGIN
SET @filepath='E:\piz.xml';
INSERT INTO pizza
(LOC_CODE,
CUST_CODE,
INV_TYPE,
INV_NO,
INV_DT,
INV_QTY_IN_INV_UOM,
INV_RATE_IN_BS_CURR,
INV_VU_IN_INV_CURR,
TOTAL_DISC_IN_BS_CURR,
NET_INV_VU_IN_INV_CURR,
NET_INV_VU_IN_BS_CURR,
TOTAL_TXES_IN_BS_CURR
)
select x.pizzaxml.query('LOC_CODE').value('.','int'),
x.pizzaxml.query('CUST_CODE').value('.','int'),
x.pizzaxml.query('INV_TYPE').value('.','nvarchar(50)'),
x.pizzaxml.query('INV_NO').value('.','nvarchar(100)'),
x.pizzaxml.query('INV_DT').value('.','datetime'),
x.pizzaxml.query('INV_QTY_IN_INV_UOM').value('.','numeric(28,8)'),
x.pizzaxml.query('INV_RATE_IN_BS_CURR').value('.','numeric(28,8)'),
x.pizzaxml.query('INV_VU_IN_INV_CURR').value('.','numeric(28,8)'),
x.pizzaxml.query('TOTAL_DISC_IN_BS_CURR').value('.','numeric(28,8)'),
x.pizzaxml.query('NET_INV_VU_IN_INV_CURR').value('.','numeric(28,8)'),
x.pizzaxml.query('NET_INV_VU_IN_BS_CURR').value('.','numeric(28,8)'),
x.pizzaxml.query('TOTAL_TXES_IN_BS_CURR').value ('.','numeric(28,8)')


FROM ( SELECT CAST(x AS XML)
FROM OPENROWSET( BULK '"+ @filepath +"',
SINGLE_BLOB) AS T(x)
) AS T(x)
CROSS APPLY x.nodes('pizza/pizzaxml') AS X(pizzaxml);

END
Vinohraja


This is my procedure, i compile this working but not execute. pls help me very urgent.

main concept is set xml data file path, if i give another file , want to execute.
pls pls pls

Anonymous said...

hi plamen,
again vinoth
the following script for another method but i am run this it shows following error(Msg 201, Level 16, State 4, Procedure sales_pizza, Line 0
Procedure or function 'sales_pizza' expects parameter '@xml', which was not supplied.
)





Declare @xml xml
set @xml ='


001
002
None
INV
C0109102246
02-Jan-2010
6.5
112.21
729.365
729.365
816.8888
87.5238000000001


001
002
None
INV
C0109102247
02-Jan-2010
5
35.861
179.305
179.305
200.8216
21.5166


001
002
None
INV
C0109102248
03-Jan-2010
7.5
35.861
268.9575
268.9575
301.2324
32.2749000000001

'


alter procedure sales_pizza
(
@xml xml
)
as
begin
INSERT INTO pizza
(LOC_CODE,
CUST_CODE,
INV_TYPE,
INV_NO,
INV_DT,
INV_QTY_IN_INV_UOM,
INV_RATE_IN_BS_CURR,
INV_VU_IN_INV_CURR,
-- TOTAL_DISC_IN_BS_CURR,
NET_INV_VU_IN_INV_CURR,
NET_INV_VU_IN_BS_CURR,
TOTAL_TXES_IN_BS_CURR
)
select
table1.column1.value('@LOC_CODE','int'),
table1.column1.value('@CUST_CODE','int'),
table1.column1.value('@INV_TYPE','nvarchar(50)'),
table1.column1.value('@INV_NO','nvarchar(100)'),
table1.column1.value('@INV_DT','datetime'),
table1.column1.value('@INV_QTY_IN_INV_UOM','numeric(28,8)'),
table1.column1.value('@INV_RATE_IN_BS_CURR','numeric(28,8)'),
table1.column1.value('@INV_VU_IN_INV_CURR','numeric(28,8)'),
--table1.column1.value('@TOTAL_DISC_IN_BS_CURR','numeric(28,8)'),
table1.column1.value('@NET_INV_VU_IN_INV_CURR','numeric(28,8)'),
table1.column1.value('@NET_INV_VU_IN_BS_CURR','numeric(28,8)'),
table1.column1.value('@TOTAL_TXES_IN_BS_CURR','numeric(28,8)')

from
@xml.nodes('pizza/pizzaxml')as table1(column1)

end



pls very urgent help me plssssss..............

Anonymous said...

hi plamen,
Sorry for disturb to u again vinoth.
very urgent for me so only i send to u,
following script i create for read xml data and to import to table. i got result for reading xml and rows are added to table but the values are showing null pls clarify this problem.

thanks


alter procedure pizza_sales
(
@xml xml
)
as
begin
INSERT INTO pizza_xml
(LOC_CODE,
CUST_CODE,
PRODUCT_ID,
INV_TYPE,
INV_NO,
INV_DT,
INV_QTY_IN_INV_UOM,
INV_RATE_IN_BS_CURR,
INV_VU_IN_INV_CURR,
-- TOTAL_DISC_IN_BS_CURR,
NET_INV_VU_IN_INV_CURR,
NET_INV_VU_IN_BS_CURR,
TOTAL_TXES_IN_BS_CURR
)
select
table1.column1.value('@LOC_CODE','nvarchar(75)'),
table1.column1.value('@CUST_CODE','nvarchar(75)'),
table1.column1.value('@PROD_CODE','nvarchar(75)'),
table1.column1.value('@INV_TYPE','nvarchar(50)'),
table1.column1.value('@INV_NO','nvarchar(100)'),
table1.column1.value('@INV_DT','datetime'),
table1.column1.value('@INV_QTY_IN_INV_UOM','numeric(28,8)'),
table1.column1.value('@INV_RATE_IN_BS_CURR','numeric(28,8)'),
table1.column1.value('@INV_VU_IN_INV_CURR','numeric(28,8)'),
--table1.column1.value('@TOTAL_DISC_IN_BS_CURR','numeric(28,8)'),
table1.column1.value('@NET_INV_VU_IN_INV_CURR','numeric(28,8)'),
table1.column1.value('@NET_INV_VU_IN_BS_CURR','numeric(28,8)'),
table1.column1.value('@TOTAL_TXES_IN_BS_CURR','numeric(28,8)')

from
@xml.nodes('pizza/pizzaxml')as table1(column1)

end


declare @idoc int
declare @doc varchar(1000)
DECLARE @FileName varchar(255)
DECLARE @ExecCmd VARCHAR(255)
DECLARE @y INT
DECLARE @x INT
DECLARE @xml VARCHAR(8000)

CREATE TABLE #temp_XML(PK INT NOT NULL IDENTITY(1,1), ThisLine VARCHAR(255))

SET @FileName = 'E:\piz.xml'
SET @ExecCmd = 'type ' + @FileName
SET @xml = ''

INSERT INTO #temp_XML EXEC master.dbo.xp_cmdshell @ExecCmd
SELECT @y = count(*) from #temp_XML

SET @x = 0
WHILE @x <> @y
BEGIN
SET @x = @x + 1
SELECT @xml = @xml + ThisLine + char(10) from #temp_XML WHERE PK = @x

END

print @xml
DROP TABLE #temp_XML

exec sp_xml_preparedocument @idoc output , @doc
exec pizza_sales @xml

XMl Reading Result:
17 rows affected
Jan-2010




1 row affected


after execute to select a table

: 12 NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL L

the records are not set showing null

pls clarify this problem.

Plamen Ratchev said...

Hello Vinothraja,

Please post a sample of your XML file. It is not easy to say what is wrong until there is XML data to test with.

Also, in the future if you need urgent help it may be best to post to the public MSDN forums:
http://social.msdn.microsoft.com:80/Forums/en-US/transactsql/threads

D3rFuc45 said...

God bless u! :D Thx

Anonymous said...

I find this query script very useful on small xml files but it seems to underperform and gets bog down with larger files. Query time was about a few sec to a min with 10-200 kb files but on larger files, say 2-10 mb or larger, the query sames to run forever. I had to kill the query after a couple of hours. Any idea on how I can get this query to run larger files?

Thanks

seal

Plamen Ratchev said...

Seal,

I do not think you can optimize this query to run faster. A better approach may be to look at using SSIS to import the Excel files: http://www.mssqltips.com/tip.asp?tip=1393

Anonymous said...

if, like me, you are looking to compare 2 XML documents whose data originated from SQL Server, and your first thought was "I'll shred the XML into a table, sort it then compare", then read on...

Use the SSIS Xml Task, Diff operation.

Very easy to use if all you want is a straight yay or nay - including various useful options such as Ignore Child order

Now if I only knew how to read the damn DiffGram it outputs...

Nayan said...

Hi I have 100 columns/tags in my XML/Table....

So in the select statement where you have two rows for each column, I cant have 100 statements like below:

X.product.query('SKU').value('.',
'INT'),

is there any other way like Select * (All)

Plamen Ratchev said...

Nayan,

There is no SELECT * shortcut, you have to list all columns explicitly.

Anonymous said...

Excelent!!! Exactly what I was looking for!!

Anonymous said...

Your sample shows how to pull data from an xml file. What is I have an XML variable, how can I pull all rows for this var?

Thanks

Plamen Ratchev said...

For extracting XML from variable please refer to this article: http://pratchev.blogspot.com/2007/06/shredding-xml-in-sql-server-2005.html

Anonymous said...

This method is way too slow. On SQL 2008 R2 it is only loading 2 recs/sec from 1,000 recs total.

RAHUL said...

I have bulk data stored in a xml file. I need to load the file in a sql column and need to run query for getting the values. How can i proceed, since im new to xml + sql it was too hard for me to continue with the xml files. I found samples like creating table and inserting datas via query but i do not see extracting values from a particular column which has xml file.

Plamen Ratchev said...

Rahul, please see my article on how to extract XML from a column and parse it to individual data elements: http://pratchev.blogspot.com/2007/06/shredding-xml-in-sql-server-2005.html

Anonymous said...

Thanks Plamen! This post was very helpful today.

Sandro said...

Hi

How i can import XML attribut like this

<*lesson_subject id="SB_DHF2010.2a_G"/>

into a table colum?

i try it like this, it dosn't work!

X.lesson.query('lesson_subject id').value('.', 'VARCHAR(100)'),

SQL server print this message:
Msg 2209, Level 16, State 1, Line 11
XQuery [T.x.query()]: Syntax error near 'lesson_subject'

can you help me?

nate808 said...
This comment has been removed by the author.
Anonymous said...

Ha Sandro:

For attributes use data(@)
so for your @id X.lesson.query('data(@id)').value('.', 'VARCHAR(100)'),

for nodes, just use
query('nodename').value

Hope this helps
Vincent
YELtv Your Emotion Live

Anonymous said...

Hi thanks...easier way to retrieve and store into db...thanks a lot!!!

Plamen Ratchev said...

Hi Sandro,

Please see my post on how to handle attributes in XML:
http://pratchev.blogspot.com/2007/06/shredding-xml-in-sql-server-2005.html

Sandro said...

Thanks Plamen and Vincent! This post was very helpful.

victor said...

Great article. but actually i was looking for something different." i want to create a Table from XML file. but i also want to create a identity column with increment of 1.". please help me. thanks in advance.

Plamen Ratchev said...

Hi Victor,

You can import the XML, then shred it (http://pratchev.blogspot.com/2007/06/shredding-xml-in-sql-server-2005.html) and insert into a table that has an IDENTITY column defined. If I understand correctly your requirements that will do it.

Jack said...

Grat article, but I have a question. Let's say I my XML file looks like this'


1
Book


2
DVD

Red
Rectangular



3
Video

ABC
XYZ




How should I modify the statement so I could insert all the xml data into the Products table? The columns of the table should be:
SKU / PRODUCT_DESC / PROPERTIES_COLOR / PROPERTIES_SHAPE / LOCATION_LOC1 / LOCATION_LOC2
And the rows in the table:
1 / Book / NULL / NULL / NULL / NULL
2 / DVD / Red / Rectangular / NULL / NULL
3 / Video / NULL / NULL / ABC / XYZ

Could you please help me with this?

Jack said...

Nodes are not visible in the previos post :(.
The xml should look like this:
<*Products*>
<*Product*>
<*SKU>1<*/SKU*>
<*Desc*>Book<*/Desc*>
<*/Product*>
<*Product*>
<*SKU*>2<*/SKU*>
<*Desc*>DVD<*/Desc*>
<*Properties*>
<*Color*>Red<*/Color*>
<*Shape*>Rectangular<*/Shape*>
<*/Properties*>
<*/Product*>
<*Product*>
<*SKU*>3<*/SKU*>
<*Desc*>Video<*/Desc*>
<*Location*>
<*Loc1*>ABC<*/Loc1*>
<*Loc2*>XYZ<*/Loc2*>
<*/Location*>
<*/Product*>
<*/Products*>

jaffa said...

Thx a lot for the article.
I tried this:
convert(datetime, x.vebra.query('deliverydate'), 126).value('.', 'datetime')
on one of the table column which is in datetime data type but got this error:
Explicit conversion from data type xml to datetime is not allowed.

Any idea on how i could import this xml and still insert a string in 'dd/mm/yyyy' format to a datetime column in my sql table.

Once again! Thx a lot

jaffa said...

Thx a lot for the article.
I tried this:
convert(datetime, x.vebra.query('deliverydate'), 126).value('.', 'datetime')
on one of the table column which is in datetime data type but got this error:
Explicit conversion from data type xml to datetime is not allowed.

Any idea on how i could import this xml and still insert a string in 'dd/mm/yyyy' format to a datetime column in my sql table.

Once again! Thx a lot

Nicholas Stanford said...
This comment has been removed by the author.
Nicholas Stanford said...

Hi Plamen

Thanks for the great article.

Do you know how to handle NULL values in an XML file when importing?

%3Camount%2F%3E

Thanks

Nick

Plamen Ratchev said...

Nick,

You can use COALESCE to convert the NULLs to some default value, like:

SELECT COALESCE(X.product.query('SKU').value('.', 'INT'), 0)...

Plamen Ratchev said...

Jaffa,

Try this:
CONVERT(DATETIME, x.vebra.query('deliverydate'), 126).value('.', 'VARCHAR(35)')

In essence extract the XML date as string and then cast it to DATETIME.

Nicholas Stanford said...

Hi Plamen

Thanks for your reply.

Sorry I was not specific enough when describing my problem.

I tried what you suggested and when importing my XML file it still fails on the NULL amount field with the following message:
'Error converting data type nvarchar to numeric.'

Here is what the field looks like in my query.
X.enquiry.query('amount').value('.', 'DECIMAL(10,2)')

The amount field in my database is DECIMAL(10,2)

If I change the amount field in my database from DECIMAL(10,2) to NVARCHAR(20) and change the following line in my query to:
X.enquiry.query('amount').value('.', 'NVARCHAR(20)') it works.

However I dont want to use my amount field as an NVARCHAR. I could always convert it (post import) but it would be much easier if I could sort everything out upon import.

Is there any way that I can get around this?

Thanks

Nick

Plamen Ratchev said...

Nick,

You can extract as NVARCHAR and then CAST:

CAST(X.enquiry.query('amount').value('.', 'NVARCHAR(20)') AS DECIMAL(10, 2))

Or:

CAST(COALESCE(X.enquiry.query('amount').value('.', 'NVARCHAR(20)'), '0') AS DECIMAL(10, 2))

Nicholas Stanford said...

Thanks for your reply Plamen but what you suggested did not help.

I still get the error message: 'Error converting data type nvarchar to numeric.'

The error seems to happen whenever I try to extract the value (when the field is NULL) and the destination data type is DECIMAL(10,2)

grant said...

Hi,

I have tried this and it works well with samples. However in my real life need the root element has an 'xmlns' attribute and despite hour and hours of Google and trial and error I cannot get this query to read the xml data - if I remove the attribute it works perfectly. Can anyone help?

Grant

Grant said...

OK - I have now learned about Namespaces and solved my problem. Here is how..

DECLARE @xml XML

SELECT @xml = BULKCOLUMN
FROM OPENROWSET(BULK 'C:\GR_Testlog.xml', SINGLE_BLOB)
AS xmlData ;WITH XMLNAMESPACES (N'http://schemas.microsoft.com/win/2004/08/events/event' as ns)

INSERT INTO Event (EventID, Task, SSID, Logon)
(
SELECT
c.value('ns:System[1]/ns:EventID[1]', 'int') AS EventID,
c.value('ns:System[1]/ns:Task[1]', 'int') AS Task,
c.value('ns:EventData[1]/ns:Data[1]', 'nvarchar(20)') AS SSID,
c.value('ns:EventData[1]/ns:Data[6]', 'nvarchar(20)') AS Logon
FROM @xml.nodes('//ns:Event') t(c)
)

Bill said...
This comment has been removed by the author.
Bill said...

Your example has been a great help, however I can't get it to work with my xml file, which is in a different format. I can import it into MS Excel and XML notepad view it correctly, but SQL is another thing. I'm receiving a xml file containing the following:-

VESSELS>
vessel TIME="2011-11-17 23:45:12 GMT" LONGITUDE="139.82357" LATITUDE="35.37236" />
vessel TIME="2011-11-17 23:45:15 GMT" LONGITUDE="7.65413" LATITUDE="47.54576" />
vessel TIME="2011-11-17 23:45:18 GMT" LONGITUDE="4.27159" LATITUDE="51.34895" />
/VESSELS>

I removed the first < from each line so the xml will display!

Any help welcome.

stedtswer said...
This comment has been removed by the author.
Samson said...

Hello! Can you help me with xml:CREATE TABLE CurrentWeather(
Location varchar (250),
[Time] varchar (250),
Wind varchar (250),
Visibility varchar (250),
Temperature varchar (250),
DewPoint varchar (250),
RelativeHumidity varchar (250),
Pressure varchar (250),
[Status] varchar (250));


INSERT INTO CurrentWeather
(Location, [Time], Wind, Visibility, Temperature, DewPoint, RelativeHumidity, Pressure, [Status])

SELECT X.[string].query('Location').value('.','VARCHAR(250)'),
X.[string].query('Time').value('.','VARCHAR(250)'),
X.[string].query('Wind').value('.','VARCHAR(250)'),
X.[string].query('Visibility').value('.','VARCHAR(250)'),
X.[string].query('Temperature').value('.','VARCHAR(250)'),
X.[string].query('DewPoint').value('.','VARCHAR(250)'),
X.[string].query('RelativeHumidity').value('.','VARCHAR(250)'),
X.[string].query('Pressure').value('.','VARCHAR(250)'),
X.[string].query('Status').value('.','VARCHAR(250)')

FROM (

SELECT CAST(x AS XML)

FROM OPENROWSET(

BULK 'C:\Airport\Weather.xml',

SINGLE_BLOB) AS T(x)

) AS T(x)

CROSS APPLY x.nodes('String/CurrentWeather') AS X(CurrentWeather);

my xml:


Berlin-Tegel, Germany (EDDT) 52-34N 013-19E 37M
Nov 08, 2011 - 05:20 PM EST / 2011.11.08 2220 UTC
from the NE (050 degrees) at 2 MPH (2 KT):0
3 mile(s):0
42 F (6 C)
39 F (4 C)
86%
30.18 in. Hg (1022 hPa)
Success

Anonymous said...

Nice. I've been looking for an example like this. Thanks :)

Cy Downez said...

many thanks, this is exactly what I was looking for

Willy said...
This comment has been removed by the author.
Willy said...
This comment has been removed by the author.
Willy said...

I have a question about how to pull just a section of the file for instance My files would be like this
<"Catelog">
<"List_Products_Types">
<"Product_types">
<"Product_type">Dolls<"/Product_type">
<"Products">
<"Product">
<"ProductID">1<"/ProductID">
<"ProductName">barbie<"/ProductName">
<"/Product">
<"Product">
<"ProductID">2<"/ProductID">
<"ProductName">ken<"/ProductName">
<"/Product">
<"/Products">
<"Product_type">Toys<"/Product_type">
<"Products")
<"Product">
<"ProductID">10<"/ProductID">
<"ProductName">wagon<"/ProductName">
<"/Product">
<"Product">
<"ProductID">20<"/ProductID">
<"ProductName">bike<"/ProductName">
<"/Product">
<"/Products">
<"/List_Products_Types">
<"/Catelog">

And I only what to pull a list of the dolls I have tried a where clause but I muse be missing somthing

Anonymous said...

I'm trying to get nested attributes to work - can anyone give me an idea?

This is the XML:
<*Document*>
<*Database name="cpData"*>
<*PortID value="4946"*>
<*ClientInformation*>
<*Resultset*>
<*Group name="GroupDataResultRoot"*>
<*Group name="GroupSameChildren"*>
<*Group name="GroupPortfolioIDIndividualGroupOnly"*>
<*Field name="Portfolio_PortfolioID">4946
<*Field name="Portfolio_PortType">Group
<*Field name="Portfolio_Objective">P56BEANFR3

This is my test query:
Use Test_KTC
Insert Into PCConfig (PortfolioID,PortType,Objective)
Select X.pcconfig.query('Portfolio_PortfolioID').value('.', 'int'),
X.pcconfig.query('Portfolio_PortType').value('.','nchar(10)'),
X.pcconfig.query('Portfolio_Objective').value('.','int')

From(
Select Cast(x AS XML)
From OPENROWSET(
Bulk 'F:\test.xml',
Single_Blob) AS T(x)
) as T(x)
CROSS APPLY x.nodes('Document/cpData/PortID/ClientInformation/Resultset/Group/Group/Group') AS X(pcconfig);

SELECT PortfolioID,PortType,Objective
From PCCONFIG

If I try to use the data(@name) as mentioned earlier (Select X.pcconfig.query('Portfolio_PortfolioID(@name)').value('.', 'int'),), and it gives me this error: XQuery [T.x.query()]: There is no function '{http://www.w3.org/2004/07/xpath-functions}:Portfolio_PortfolioID()'

The XML can't be changed and clearly I'm missing something...

Thanks!

Anonymous said...

Hi,
I am trying to load an XML File using XML Source task in SSIS.
But its throwing the following error:
[XML Source [1]] Error: The component "XML Source" (1) was unable to process the XML data. Derived types are not allowed to be declared in source Xml documents. See element "Risk" with the attribute "xsi:type=House".
I don't how to add this attribute to the XSD Schema.
Can you please help me resolve it?

Sam said...

Is it possible for this to work pulling from an http file? I need to automate this process and pull from an xml file on the web.

LadyZ said...

Very useful ! Thank you :)

Anonymous said...

It works, but I can not find any books that talk about it in more detail. I guess its something you have learn yourself from the MSDN library?

Plamen Ratchev said...

Yes, this is documented in MSDN and SQL Server BOL:
http://msdn.microsoft.com/en-us/library/ms190312.aspx
http://msdn.microsoft.com/en-us/library/ms191184.aspx

Pradeep Thorat said...
This comment has been removed by the author.
Pradeep Thorat said...

Hi,

We have xml in which we do have hierarchy like: -

topics
topic
term
subtopic
term
subtopic
term
topic
topics
What I need to do is extract data from term under topic and put it under one column and then extract data from all subtopic/term under one topic and append them (separated by |) and put them under another column.

Topic Subtopics
----- ----------


Can anyone please let us know how to achieve same with the help of Stored Procedure.
Thanks in advance.

Pradeep Thorat said...

Hi,

Ignore previous post. Here we can see data hierarchy properly.

We have xml in which we do have hierarchy like: -

topics
--topic
----term
----subtopic
------term
----subtopic
------term
--topic
topics

What I need to do is extract data from term under topic and put it under one column and then extract data from all subtopic/term under one topic and append them (separated by |) and put them under another column.

Topic Subtopics
----- ----------


Can anyone please let us know how to achieve same with the help of Stored Procedure.
Thanks in advance.

Anonymous said...

Thanks Plamen for your great article!

I appreciate your feedback on what to modify in the query to parse an XML in the following format, where node names repeat, and are distinguished by their attribute values, yet also include the inline values I want to pull:



1
Book


2
DVD


3
Video



I realize that you have addressed the extraction of attribute values in the Shredding XML article, but I couldn't locate details on how to handle my particular case.

Thanks again for all your help!
Khaled

Anonymous said...

Please find below the complete XML format I was referencing in last comment:

<*Products*>
<*Product*>
<*entry colname="SKU"*>1<*/entry*>
<*entry colname="Desc"*>Book<*/entry*>
<*/Product*>
<*Product*>
<*entry colname="SKU"*>2<*/entry*>
<*entry colname="Desc"*>DVD<*/entry*>
<*/Product*>
<*Product*>
<*entry colname="SKU"*>3<*/entry*>
<*entry colname="Desc"*>Video<*/entry*>
<*/Product*>
<*/Products*>

Thanks again!
Khaled

hussain rizvi said...

Hi I have two questions:
1. How do you change your below example into creating the table as well. I saw a post above stating select *... into, but I cant get it to work

INSERT INTO
Products
(sku, product_desc)
SELECT X.product.query('SKU').value('.', 'INT'),
X.product.query('Desc').value('.', 'VARCHAR(30)')
FROM (
SELECT CAST(x AS XML)
FROM
OPENROWSET
(
BULK 'C:\Products.xml', SINGLE_BLOB
) AS T(x)
) AS T(x)CROSS APPLY x.nodes('Products/Product') AS X(product);


2. if I have an xml file like below:

car

1
01/02/12
2
05/04/12



which as you can see has different hierarchy of data, how can the price history information be stored in another table called price_history which links to the product

Anonymous said...

Great!!! But I have some performance issues with my sqlexpress. It takes 30 seconds to import 10 records. (there are 116 fields in my table). Is it normal? Thx!!

Dhinakaran S said...

Hi ..
I have a serious issues.. we are getting data's daily in a XML format and we need to insert all these data's into SQL SERVER. The problem is, we have a column called 'Amount' and same name we are maintaining in our SQL server.

Sometimes, we are receiving XML files, with 'Amount' column name changed as 'Amt' and without checking if we load it, we are inserted with Null values.

So I need to solution, that my table should accept AMOUNT as well as AMT as column name.

Can you let me know the solution please.

Thanks and regards,
Dhinakaran

Plamen Ratchev said...

Hi Dhinakaran,

One solution is to import the XML to a staging table that has both the AMOUNT and AMT columns.Then when inserting to your production table you can use a CASE expression to pick the NON NULL value.

Dhinakaran S said...

Thank you for your kind quick response. I will make it use of it in a great way.. It helped me a lot.

Regards,
Dhinakaran

Dhinakaran S said...

Hi,

Can u let us know what can be done if I have XML file which is been created by joining 2 or more table.
In this case, How will I query to load it into SQL server database.

Regards,
Dhinakaran

Anonymous said...

Awesome, thanks for sharing the knowledge. exactly what I was looking for!

Barbacan said...

Nice post! But it's normal that for import 1680 xml nodes it takes more than 5 minutes? I read about a bug of SQL SERVER 2008 while using bulk import, that can be "solved" with OPTION (OPTIMIZE FOR( @x = NULL ))

Joe Fischer said...
This comment has been removed by the author.
Joe Fischer said...

Great Article,it got me thru most of my pain :) I have a bit of an issues with a file. I can get the data between the <> like AUTH_GIUD, but can't get the tran_nbr right after DETAIL I have an XML file similar to this:

"<"BATCH batch_id="20120713">
"<"DETAIL tran_nbr="2">
"<"AUTH_GUID>00FAXZEJ2KRR65FWFHU"<"/AUTH_GUID>
"<"AUTH_RESP>00"<"/AUTH_RESP>
"<"AUTH_CODE>000013"<"/AUTH_CODE>"

Anonymous said...

Really helpful,Thanks

Bhaumik Vyas said...

It's giving error like "Right Parenthesis missing." Now What to do ??????????

Anonymous said...

How might I insert only those entries that are not already in the table? I only want to insert those records whose id attribute is not found in the existing table.

Anonymous said...

Figured it out :
WHERE NOT EXISTS (SELECT * FROM table
WHERE column = X.product.value('@id', 'UNIQUEIDENTIFIER'))

Anonymous said...

I might be in love,,, tx

Josep said...

Hi Plamen,
I have a field in an XML file that contains HTML-like markup.

How can I setup this query so that it returns the markup?

I tried using different data types in .query('X').value('.','DATATYPE') but had no luck.

Thanks in advance.

Plamen Ratchev said...

Hi Josep,

Here is an example extracting HTML from XML file:

-- file with XML containing HTML in the elements


<1>abc]]>


-- query to import and extract

SELECT X.data.query('X').value('.', 'NVARCHAR(200)') AS data
FROM (
SELECT CAST(x AS XML)
FROM OPENROWSET(
BULK 'C:\test.xml',
SINGLE_BLOB) AS T(x)
) AS T(x)
CROSS APPLY x.nodes('XML_data') AS X(data);

-- the result

data
--------------------
<1>abc

Anonymous said...

Hi,

Since I've tried to load an XML file of 10000 records and on my machine was importing at a rate of about 200 records per minute, I decided to find out a quicker way of importing XML data into a table. The code below is working on SQL 2012 but I think it should work on 2005 and later without modifications. Also I am importing into a temporary table but you are free to import it into a regular table.

DECLARE @FileContents XML

SELECT @FileContents=BulkColumn
FROM OPENROWSET(BULK'C:\yourxmlinputfile.xml',SINGLE_BLOB) x;

SELECT
t.c.value('(SKU_id/text())[1]', 'INT') ID,
t.c.value('(Desc/text())[1]', 'VARCHAR(30)') Name
INTO #temp
FROM @FileContents.nodes('Products/Product') t(c);

SELECT * FROM #temp;

DROP TABLE #temp;

I hope you find it useful!!

George

Anonymous said...

How can i modify this script to make it work with SQL Server 2000?

Jannet Davalos said...

Thank you very much, I applied the latest recommendation and makes imports too fast!

Low-Key said...

simply awesome !!! you r a star !!

greg.fenton said...

Any comments on how to import an XML file to a set of normalized tables?

For example:

<people>
<person>
<name>Alice</name>
<city>Seattle</city>
</person>
<person>
<name>Bob</name>
<city>Seattle</city>
</person>
<person>
<name>Charlie</name>
<city>New York</city>
<person>
</people>

I want to pull this into 2 normalized tables (no duplicated rows):

city (id INT, name VARCHAR)
person (id INT, name VARCHAR, cityId INT)

Thanks!

Vojtech said...

It is indeed terribly slow, but a very small adjustment for speed up:

declare @xml table (x xml)
insert @xml
select x FROM OPENROWSET(BULK 'XXX.xml', SINGLE_BLOB) AS T(x)

and then use

SELECT ...
FROM @xml
CROSS APPLY x.nodes('....

mebel jati said...

Hi there,I enjoy reading through your article post, I wanted to write a little comment to support you and wish you a good continuationAll the best for all your blogging efforts.


Toko Furniture Online
Mebel Jepara

Anonymous said...

You are the man! Thank you!! your article is so useful.

Ben Damick said...

I'm only returning the first character of each node in my XML file with this code, using sql server 2008 and pyodbc to submit the query using python. I'm I missing something?

Satavahana G said...
This comment has been removed by the author.
Anonymous said...

I already have the XML is a column in my table. In the xml document I have example and this repeats several times, not always the same number of times. How would I go about getting each bit of text within each tag?

Anonymous said...

I already have the XML is a column in my table. In the xml document I have 'text>'example''</text' tag?