Home > OS >  SQL Server - parse GPX file
SQL Server - parse GPX file

Time:08-04

I'm trying to parse a GPX file within SQL Server 2019, but I'm hitting a snag with namespaces, I think.

From what I can see - if the GPX file contains :

xmlns="http://www.topografix.com/GPX/1/1"

SQL returns a NULL. But if I remove that from the GPX file, the SQL returns a string of coords - as expected.

SQL code :

DECLARE @XML TABLE (XML_COLUMN XML)

DECLARE @sqlstmt NVARCHAR(255)
DECLARE @file NVARCHAR(255) = 'd:\demo_2.gpx'

SET @sqlstmt= 'SELECT * FROM OPENROWSET (BULK '''   @file   ''', SINGLE_CLOB) AS xmlData'

INSERT INTO @XML 
    EXEC (@sqlstmt)

;WITH XMLNAMESPACES ('http://www.topografix.com/GPX/1/1' AS ns), X_CTE AS
(
    SELECT
        T1.Name.query('.') AS Name,
        T2.X_Content.query('.') AS X_Content
    FROM   
        @XML AS X
    CROSS APPLY 
        XML_Column.nodes('/gpx/trk') AS T1(Name)
    CROSS APPLY 
        XML_Column.nodes('/gpx/trk/trkseg/trkpt') AS T2(X_Content)
),
XML_Data AS
(
    SELECT
        X_Content.value('(/trkpt/@lat)[1]', 'VARCHAR(50)') AS LAT,
        X_Content.value('(/trkpt/@lon)[1]', 'VARCHAR(50)') AS LON
    FROM 
        X_CTE
)
SELECT 
    STUFF((SELECT '['   LON   ','   LAT   ']'   ','
           FROM XML_Data
           WHERE 1 = 1
           FOR XML PATH('')), 1, 0, '') AS mapString;

GPX file content (demo_2.gpx)

<?xml version="1.0" encoding="UTF-8"?>
<gpx creator="Garmin Connect" version="1.1"
  xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/11.xsd"
  xmlns:ns3="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
  xmlns="http://www.topografix.com/GPX/1/1"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ns2="http://www.garmin.com/xmlschemas/GpxExtensions/v3">
  <metadata>
    <link href="connect.garmin.com">
      <text>Garmin Connect</text>
    </link>
    <time>2022-05-29T08:37:21.000Z</time>
  </metadata>
  <trk>
    <name>My Route</name>
    <type>e_bike_mountain</type>
    <trkseg>
      <trkpt lat="54.37033147551119327545166015625" lon="-3.075514398515224456787109375">
        <ele>65.8000030517578125</ele>
        <time>2022-05-29T11:37:02.000Z</time>
        <extensions>
          <ns3:TrackPointExtension>
            <ns3:atemp>17.0</ns3:atemp>
            <ns3:hr>155</ns3:hr>
          </ns3:TrackPointExtension>
        </extensions>
      </trkpt>
      <trkpt lat="54.37033147551119327545166015625" lon="-3.075514398515224456787109375">
        <ele>65.8000030517578125</ele>
        <time>2022-05-29T11:37:03.000Z</time>
        <extensions>
          <ns3:TrackPointExtension>
            <ns3:atemp>17.0</ns3:atemp>
            <ns3:hr>155</ns3:hr>
          </ns3:TrackPointExtension>
        </extensions>
      </trkpt>
    </trkseg>
  </trk>
</gpx>

Really pulling the last bits of remaining hair out with this one, if anyone can assist, that would be totally awesome!

CodePudding user response:

Your XML defines a default namespace, that is applied to all XML nodes - as long as no other namespace is defined explicitly, by means of a prefix:

<gpx creator="Garmin Connect" version="1.1"
     ...
     xmlns="http://www.topografix.com/GPX/1/1"

The xmlns= declaration, without an alias (like xmlns:ns=...), is the default XML namespace for your XML document.

Now if you actually define your XML namespace in the query, with an alias ns like so:

;WITH XMLNAMESPACES ('http://www.topografix.com/GPX/1/1' AS ns)

then you must also use that alias in all your relevant XPath queries:

SELECT
    ...
FROM   
    @XML AS X
CROSS APPLY 
    XML_Column.nodes('/ns:gpx/ns:trk') AS T1(Name)
CROSS APPLY 
    XML_Column.nodes('/ns:gpx/ns:trk/ns:trkseg/ns:trkpt') AS T2(X_Content)

Alternatively, and much simpler - define the XML namespace as your default namespace in the XQuery in T-SQL; then you do not need to apply the namespace alias everywhere:

;WITH XMLNAMESPACES (DEFAULT 'http://www.topografix.com/GPX/1/1')

And in the end - you could write your XQuery much simpler - try this:

WITH XMLNAMESPACES (DEFAULT 'http://www.topografix.com/GPX/1/1')
    SELECT
        LAT = XC.value('@lat', 'VARCHAR(50)'),
        LON = XC.value('@lon', 'VARCHAR(50)')
    FROM   
        @Xml AS X
    CROSS APPLY 
        XML_Column.nodes('/gpx/trk/trkseg/trkpt') AS XT(XC)

This should return the same value - with much less code and indirections....

CodePudding user response:

Sorted, with many thanks to marc_s - my final working code ...

DECLARE @XML TABLE (XML_COLUMN XML)

    DECLARE @sqlstmt NVARCHAR(255)
    DECLARE @file NVARCHAR(255) = 'd:\erc\gpx\demo_4.gpx'

    SET @sqlstmt= 'SELECT * FROM OPENROWSET ( BULK '''   @file   ''', SINGLE_CLOB) AS xmlData'
 
INSERT INTO @XML 
    EXEC (@sqlstmt)

;WITH XMLNAMESPACES (DEFAULT 'http://www.topografix.com/GPX/1/1'), X_CTE AS
    (   
    SELECT
        LAT = XC.value('@lat', 'VARCHAR(50)'),
        LON = XC.value('@lon', 'VARCHAR(50)')
    FROM   
        @XML AS X
    CROSS APPLY 
        XML_Column.nodes('/gpx/trk/trkseg/trkpt') AS XT(XC)
        ),
XML_Data AS
(
    SELECT * FROM X_CTE
)
    SELECT Stuff
    ((
        SELECT
        '['   LON   ','   LAT   ']'   ','
        FROM XML_Data
        WHERE 1 = 1
    FOR XML PATH('')), 1, 0, '') AS mapString;
  • Related