Home > Back-end >  XML file to PostgreSQL with <row> tag
XML file to PostgreSQL with <row> tag

Time:05-22

I'm trying to import data from an XML file to PostgreSQL. I would like to create columns from the attributes in the <row> tag. The problem is when I use xpath to get text, I get nothing.

Here is a sample input testinput.xml file

<tags>
  <row Id="1" TagName=".net" Count="316293" ExcerptPostId="3624959" WikiPostId="3607476" />
  <row Id="2" TagName="html" Count="1116853" ExcerptPostId="3673183" WikiPostId="3673182" />
  <row Id="3" TagName="javascript" Count="2343663" ExcerptPostId="3624960" WikiPostId="3607052" />
</tags>

Here is my myquery.sql

DO $$

DECLARE xml_string xml;

BEGIN
xml_string := XMLPARSE(DOCUMENT convert_from(pg_read_binary_file('/home/me/data/testinput.xml'), 'UTF8'));

DROP TABLE IF EXISTS tags;

CREATE TABLE tags AS
SELECT
    (xpath('//ID', x))[1]::text as id,
    (xpath('//TagName', x))[1]::text as tagName,
    (xpath('//Count', x))[1]::text as count,
    (xpath('//ExcerptPostId', x))[1]::text as excerptPostId,
    (xpath('//WikiPostId', x))[1]::text as wikiPostId
FROM unnest(xpath('//row', xml_string)) x;
END$$;

SELECT * FROM tags;

I got this

 id | tagname | count | excerptpostid | wikipostid 
---- --------- ------- --------------- ------------
    |         |       |               | 
    |         |       |               | 
    |         |       |               | 
(3 rows)

By the way, my .xml file is around 100GB, so any fast solution would help. Thank you!

CodePudding user response:

Because the values you intend to parse are attributes (not elements), you need to parse with @ keyword:

SELECT
    (xpath('//@Id', x))[1]::text as id,
    (xpath('//@TagName', x))[1]::text as tagName,
    (xpath('//@Count', x))[1]::text as count,
    (xpath('//@ExcerptPostId', x))[1]::text as excerptPostId,
    (xpath('//@WikiPostId', x))[1]::text as wikiPostId
FROM unnest(xpath('//row', xml_string)) x;

Online Demo

  • Related