Home > Net >  parse an xml file with vba excel getting specific values to excel sheet
parse an xml file with vba excel getting specific values to excel sheet

Time:02-19

Hello I am new to programming and i'm studying it. I don't understand xml and VBA but this task has been assigned to me at work. I tried to search all over the internet to find a solution but i can't figure out because of my low knowledge.

My problem is that i need to extract specific values from a file xml in a file excel

<NODE >
    <ANOTHER-NODE>
      <XXX name1="value" name2="value2" name3="value3"/>
      <XXX name1="value" name2="value2" name3="value3"/>
      <XXX name1="value" name2="value2" name3="value3"/>
      <XXX name1="value" name2="value2" name3="value3"/>
      <XXX name1="value" name2="value2" name3="value3"/>
      <XXX name1="value" name2="value2" name3="value3"/>
    <ANOTHER-NODE2>
      <File N="xyxyxyxy" N1="xyxyxyxy1" N2="xyxyxyxy2" N3="xyxyxyxy3" N4="xyxyxyxy4"/>
      <File N="xyxyxyxy" N1="xyxyxyxy1" N2="xyxyxyxy2" N3="xyxyxyxy3" N4="xyxyxyxy4"/>
      <File N="xyxyxyxy" N1="xyxyxyxy1" N2="xyxyxyxy2" N3="xyxyxyxy3" N4="xyxyxyxy4"/>
      <File N="xyxyxyxy" N1="xyxyxyxy1" N2="xyxyxyxy2" N3="xyxyxyxy3" N4="xyxyxyxy4"/>
      <File N="xyxyxyxy" N1="xyxyxyxy1" N2="xyxyxyxy2" N3="xyxyxyxy3" N4="xyxyxyxy4"/>
      <File N="xyxyxyxy" N1="xyxyxyxy1" N2="xyxyxyxy2" N3="xyxyxyxy3" N4="xyxyxyxy4"/>
      <File N="xyxyxyxy" N1="xyxyxyxy1" N2="xyxyxyxy2" N3="xyxyxyxy3" N4="xyxyxyxy4"/>
    

This is like the structure that i have, and what i need to extract in the excel are the "name1="value" from all and the N="xyxyxyxy" from all

Private Sub CommandButtonImport_Click()

Dim xmlr As Office.FileDialog
Set xmlr = Application.FileDialog(msoFileDialogFilePicker)
With xmlr
        .Filters.Clear
        .Title = "Seleziona un File XML"
        .Filters.Add "XML File", "*.xml", 1
        .AllowMultiSelect = False
        
   If .Show = True Then
        XmlFileName = .SelectedItems(1)
        
        Dim xmlDoc As MSXML2.DOMDocument60
        
        
       Set xmlDoc = CreateObject("Msxml2.DOMDocument.6.0")
        
        If xmlDoc.Load(XmlFileName) = True Then

This is the structure that i made, to bring the xml file but i don't know if its right.

P.S this is my first post here i hope it's understandable.

UPDATE

Thanks a lot for the answers, sorry for the xml file, i made it on my own cause the original have sensitive data that i cant share, at the end i found hyour tips reaally helpfull and i wrote the code that do the work, i share

Private Sub CommandButtonImport_Click()

Dim xmlr As Office.FileDialog
Set xmlr = Application.FileDialog(msoFileDialogFilePicker)
With xmlr
        .Filters.Clear
        .Title = "Seleziona un File XML"
        .Filters.Add "XML File", "*.xml", 1
        .AllowMultiSelect = False
        
   If .Show = True Then
        XmlFileName = .SelectedItems(1)
        
        
        Dim xmlDoc As MSXML2.DOMDocument60
        Dim ECU As MSXML2.IXMLDOMNodeList
        Dim File As MSXML2.IXMLDOMNodeList
        Dim Feature As MSXML2.IXMLDOMNodeList
        
        Dim NodoLista As MSXML2.IXMLDOMNode
        Dim NodoLista1 As MSXML2.IXMLDOMNode
        Dim NodoLista2 As MSXML2.IXMLDOMNode
        
        Dim i As Integer
        Dim k As Integer
        Dim l As Integer
        
        Set xmlDoc = CreateObject("Msxml2.DOMDocument.6.0")
        
        If xmlDoc.Load(XmlFileName) = True Then

        Set ECU = xmlDoc.SelectNodes("//XXX")
        On Error Resume Next
        For Each NodoLista In ECU
        
        i = i   1
                
                With ThisWorkbook.Sheets("Foglio1").Rows(i)
                    .Cells(1).Value = NodoLista.Attributes(0).NodeValue
                End With
        Next NodoLista

        Set File = xmlDoc.SelectNodes("//File")
        On Error Resume Next
        For Each NodoLista1 In File
        
        k = k   1
                With ThisWorkbook.Sheets("Foglio1").Rows(k)
                    .Cells(3).Value = NodoLista1.Attributes(0).NodeValue
                End With
        Next NodoLista1


        Set Feature = xmlDoc.SelectNodes("//Feature")
        On Error Resume Next
        For Each NodoLista2 In Feature
        
        l = l   1
                With ThisWorkbook.Sheets("Foglio1").Rows(l)
                    .Cells(5).Value = NodoLista2.Attributes(0).NodeValue
                End With
        Next NodoLista2

        
        End If
        
   End If

End With

End Sub

If i can ask another thing i would love to update this code, doing another control on the xml, i have values on theattributes that appear more then once, there is a command line to dont print in excel the same attributes more then once ?

CodePudding user response:

There are some inconsistencies to what you've tried so far (without trying some code of your own to get xml node or attribute values); I'll answer only because the following points may help to overcome frequent starting problems before you review and re-try your own code (c.f. @TimWilliams comment above):

  • The xml structure isn't wellformed, as there are missing closing tags for <ANOTHER-NODE> and <ANOTHER-NODE2>, therefore loading would fail; that's why I inserted a TESTONLY variable to make the execution reproducable with a wellformed revised xml structure loaded as xml string (via LoadXML instead of an external file reference) - you can change that constant easily to TESTONLY=FALSE to start loading the actual file via Load referring to its filename.

  • It seems you want to use late binding of xml objects, though you use here a document declaration type Dim xmlDoc As MSXML2.DOMDocument60 which would need an explicit library reference to "Microsoft XML, v6.0" (so called early binding); in this approach, however I demonstrate late binding and declarations of objects via Dim ... As Object only (caveat: no intelliSense).

This example code assumes

  • <NODE> as so-called documentElement (~ "root" node), in any other cases it would need modifications to XPath and/or code. So I refer to DocumentElement before getting a nodelist with subordinated nodes. This is performed via a so-called XPath expression using the XMLDOM method SelectNodes.
  • Name1 and N always as first attributes to subnodes <ANOTHER-NODE> and each following sibling (here only: <ANOTHER-NODE2>). Note that XMLDOM references attributes here with zero-based indices like curNode.Attributes.Item(0).

XMLDOM (Document Object Model) is a cross-platform and language-independent interface treating the document as a tree structure and allowing programmatic access to the tree with its own methods and properties.

You can use, however the special syntax of XPath expressions like e.g. "/NODE/ANOTHER-NODE/@name1" (or even */*/@name1 applied on the assumed document element in this post) to address any logical part in the hierarchical xml document structure.

Example code

Private Sub CommandButtonImport_Click()
'...FileDialog etc.
    Dim XmlFileName
    '... further stuff      ' << insert your FileDialog code
    '...
    Dim xmlDoc As Object    ' << Late binding XML
    Set xmlDoc = CreateObject("Msxml2.DOMDocument.6.0")

    Const TESTONLY As Boolean = True             ' << Change to your needs!
    If TESTONLY Then                             ' << Only for testing xml string
        'Define wellformed xml content string(!) to make it reproducable
        Dim wellformed As String
        wellformed = getTestContent()            ' << calls help function
        If Not xmlDoc.LoadXML(wellformed) Then        ' LoadXML; Escape if Load error!
            Debug.Print "Cannot load!": Exit Sub
        End If
    Else        ' Load a file identified by FileName as in OP
        If Not xmlDoc.Load(XmlFileName) Then           ' Load; Escape if Load rror!         
            Exit Sub
        End If
    End If
    'start writing to VB Editor's immediate window
    Debug.Print "Main Node", "Subnode", "Attrib[1]", "Value" & vbNewLine & String(70, "-")
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    'Define & apply XPath expression
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Dim XPth As String
    XPth = "*/*"
    Dim SubNodes As Object
    Set SubNodes = xmlDoc.DocumentElement.SelectNodes(XPth)
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    'Loop through each Node in the referenced subnodes
    '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Dim curNode As Object
    For Each curNode In SubNodes
        Debug.Print _
        curNode.SelectSingleNode("..").nodename, _
        curNode.nodename, _
        curNode.Attributes.Item(0).Name, _
        curNode.Attributes.Item(0).Text
    Next curNode
    
End Sub

Help function `getTestContent()

Function getTestContent()
        getTestContent = _
        "<NODE>" & _
            "<ANOTHER-NODE>" & _
                "<XXX name1='valueA' name2='value2' name3='value3'/>" & _
                "<XXX name1='valueB' name2='value2' name3='value3'/>" & _
                "<XXX name1='valueC' name2='value2' name3='value3'/>" & _
            "</ANOTHER-NODE>" & _
            "<ANOTHER-NODE2>" & _
                "<File N='xyxyxyxy1' N1='xyxyxyxy1' N2='xyxyxyxy2' N3='xyxyxyxy3' N4='xyxyxyxy4'/>" & _
                "<File N='xyxyxyxy2' N1='xyxyxyxy1' N2='xyxyxyxy2' N3='xyxyxyxy3' N4='xyxyxyxy4'/>" & _
                "<File N='xyxyxyxy3' N1='xyxyxyxy1' N2='xyxyxyxy2' N3='xyxyxyxy3' N4='xyxyxyxy4'/>" & _
                "<File N='xyxyxyxy4' N1='xyxyxyxy1' N2='xyxyxyxy2' N3='xyxyxyxy3' N4='xyxyxyxy4'/>" & _
            "</ANOTHER-NODE2>" & _
        "</NODE>"
End Function

Example TESTONLY output in VB Editor's immediate window

Main Node     Subnode       Attrib[1]     Value
----------------------------------------------------------------------
ANOTHER-NODE  XXX           name1         valueA
ANOTHER-NODE  XXX           name1         valueB
ANOTHER-NODE  XXX           name1         valueC
ANOTHER-NODE2 File          N             xyxyxyxy1
ANOTHER-NODE2 File          N             xyxyxyxy2
ANOTHER-NODE2 File          N             xyxyxyxy3
ANOTHER-NODE2 File          N             xyxyxyxy4
  • Related