Home > Back-end >  Grouping and totaling a sub-element of repeating elements in XML
Grouping and totaling a sub-element of repeating elements in XML

Time:09-17

How can I group and sum the input XML so duplicate Item elements are merged?

  • All Item elements with the same ItemType,ItemID,Color combination need to be merged into 1 output element.
  • The MinQty element (integer value) needs to be the total of all input values.

Sample input XML

<Inventory>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>37494</ItemID>
    <Color>11</Color>
    <MinQty>1</MinQty>
  </Item>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>50254</ItemID>
    <Color>11</Color>
    <MinQty>4</MinQty>
  </Item>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>37494</ItemID>
    <Color>11</Color>
    <MinQty>2</MinQty>
  </Item>
</Inventory>

Expected output XML

<Inventory>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>37494</ItemID>
    <Color>11</Color>
    <MinQty>3</MinQty>
  </Item>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>50254</ItemID>
    <Color>11</Color>
    <MinQty>4</MinQty>
  </Item>
</Inventory>

Current idea is based on this answer, using Group-Object. However I don't see how to combine this with the summing of a sub-element. Remove Duplicate XML Node Groups in Powershell

CodePudding user response:

XSLT based solution via grouping.

Grouping is using the Muenchian method.

Powershell

$xslt = New-Object System.Xml.Xsl.XslCompiledTransform;
$xslt.load("e:\Temp\Powershell\Process.xslt")
$xslt.Transform("e:\Temp\Powershell\Input.xml", "e:\Temp\Powershell\Output.xml")

Input XML

<?xml version="1.0"?>
<Inventory>
    <Item>
        <ItemType>P</ItemType>
        <ItemID>37494</ItemID>
        <Color>11</Color>
        <MinQty>1</MinQty>
    </Item>
    <Item>
        <ItemType>P</ItemType>
        <ItemID>50254</ItemID>
        <Color>11</Color>
        <MinQty>4</MinQty>
    </Item>
    <Item>
        <ItemType>P</ItemType>
        <ItemID>37494</ItemID>
        <Color>11</Color>
        <MinQty>2</MinQty>
    </Item>
</Inventory>

XSLT

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <xsl:key name="ItemKey" match="Item" use="concat(ItemType, '||', ItemID, '||', Color)"/>

    <xsl:template match="Inventory">
        <xsl:copy>
            <xsl:for-each select="Item[generate-id(.) = generate-id(key('ItemKey', concat(ItemType, '||', ItemID, '||', Color))[1])]">
                <xsl:sort select="ItemID" data-type="text" order="ascending"/>
                <xsl:copy>
                    <xsl:copy-of select="ItemType"/>
                    <xsl:copy-of select="ItemID"/>
                    <xsl:copy-of select="Color"/>
                    <MinQty><xsl:value-of select="sum(key('ItemKey', concat(ItemType, '||', ItemID, '||', Color))/MinQty)"/></MinQty>
                </xsl:copy>
            </xsl:for-each>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

Output XML

<?xml version='1.0' ?>
<Inventory>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>37494</ItemID>
    <Color>11</Color>
    <MinQty>3</MinQty>
  </Item>
  <Item>
    <ItemType>P</ItemType>
    <ItemID>50254</ItemID>
    <Color>11</Color>
    <MinQty>4</MinQty>
  </Item>
</Inventory>

CodePudding user response:

A Group-Object solution that modifies the XML document in place:

# Parse the XML file into a DOM.
# Replace "sample.xml" with your XML file path.
($xmlDoc = [xml]::new()).Load((Convert-Path sample.xml))

# Loop over all <Item> elements.
$xmlDoc.Inventory.Item | 
  Group-Object ItemType, ItemId, Color | 
  Where-Object Count -gt 1 | 
  ForEach-Object {
    # Sum up all MinQty values and assign the sum to the first element
    # in the group.
    $_.Group[0].MinQty = ($_.Group.MinQty | Measure-Object -Sum).Sum
    # Now remove all other elements in the group.
    $null = $_.Group[1..($_.Count-1)].ForEach({ $_.ParentNode.RemoveChild($_) })
  }

# Pretty-print the modified XML document.
([System.Xml.Linq.XDocument] $xmlDoc.outerXml).ToString()

This yields the expected XML shown in your question.

  • Related