How can I group and sum the input XML so duplicate Item elements are merged?
- All Item elements with the same ItemType,ItemID,Color combination need to be merged into 1 output element.
- The MinQty element (integer value) needs to be the total of all input values.
Sample input XML
<Inventory>
<Item>
<ItemType>P</ItemType>
<ItemID>37494</ItemID>
<Color>11</Color>
<MinQty>1</MinQty>
</Item>
<Item>
<ItemType>P</ItemType>
<ItemID>50254</ItemID>
<Color>11</Color>
<MinQty>4</MinQty>
</Item>
<Item>
<ItemType>P</ItemType>
<ItemID>37494</ItemID>
<Color>11</Color>
<MinQty>2</MinQty>
</Item>
</Inventory>
Expected output XML
<Inventory>
<Item>
<ItemType>P</ItemType>
<ItemID>37494</ItemID>
<Color>11</Color>
<MinQty>3</MinQty>
</Item>
<Item>
<ItemType>P</ItemType>
<ItemID>50254</ItemID>
<Color>11</Color>
<MinQty>4</MinQty>
</Item>
</Inventory>
Current idea is based on this answer, using Group-Object. However I don't see how to combine this with the summing of a sub-element. Remove Duplicate XML Node Groups in Powershell
CodePudding user response:
XSLT based solution via grouping.
Grouping is using the Muenchian method.
Powershell
$xslt = New-Object System.Xml.Xsl.XslCompiledTransform;
$xslt.load("e:\Temp\Powershell\Process.xslt")
$xslt.Transform("e:\Temp\Powershell\Input.xml", "e:\Temp\Powershell\Output.xml")
Input XML
<?xml version="1.0"?>
<Inventory>
<Item>
<ItemType>P</ItemType>
<ItemID>37494</ItemID>
<Color>11</Color>
<MinQty>1</MinQty>
</Item>
<Item>
<ItemType>P</ItemType>
<ItemID>50254</ItemID>
<Color>11</Color>
<MinQty>4</MinQty>
</Item>
<Item>
<ItemType>P</ItemType>
<ItemID>37494</ItemID>
<Color>11</Color>
<MinQty>2</MinQty>
</Item>
</Inventory>
XSLT
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:key name="ItemKey" match="Item" use="concat(ItemType, '||', ItemID, '||', Color)"/>
<xsl:template match="Inventory">
<xsl:copy>
<xsl:for-each select="Item[generate-id(.) = generate-id(key('ItemKey', concat(ItemType, '||', ItemID, '||', Color))[1])]">
<xsl:sort select="ItemID" data-type="text" order="ascending"/>
<xsl:copy>
<xsl:copy-of select="ItemType"/>
<xsl:copy-of select="ItemID"/>
<xsl:copy-of select="Color"/>
<MinQty><xsl:value-of select="sum(key('ItemKey', concat(ItemType, '||', ItemID, '||', Color))/MinQty)"/></MinQty>
</xsl:copy>
</xsl:for-each>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Output XML
<?xml version='1.0' ?>
<Inventory>
<Item>
<ItemType>P</ItemType>
<ItemID>37494</ItemID>
<Color>11</Color>
<MinQty>3</MinQty>
</Item>
<Item>
<ItemType>P</ItemType>
<ItemID>50254</ItemID>
<Color>11</Color>
<MinQty>4</MinQty>
</Item>
</Inventory>
CodePudding user response:
A Group-Object
solution that modifies the XML document in place:
# Parse the XML file into a DOM.
# Replace "sample.xml" with your XML file path.
($xmlDoc = [xml]::new()).Load((Convert-Path sample.xml))
# Loop over all <Item> elements.
$xmlDoc.Inventory.Item |
Group-Object ItemType, ItemId, Color |
Where-Object Count -gt 1 |
ForEach-Object {
# Sum up all MinQty values and assign the sum to the first element
# in the group.
$_.Group[0].MinQty = ($_.Group.MinQty | Measure-Object -Sum).Sum
# Now remove all other elements in the group.
$null = $_.Group[1..($_.Count-1)].ForEach({ $_.ParentNode.RemoveChild($_) })
}
# Pretty-print the modified XML document.
([System.Xml.Linq.XDocument] $xmlDoc.outerXml).ToString()
This yields the expected XML shown in your question.