What is a better way to write this comparative PowerShell script?

Time:04-15

I'm new to PowerShell and I'm sure I'm not following best practices here. I've been working on a PowerShell script that compares two XML files. I start by looping through each XML file and loading the data into PS objects.

Here are some samples of the XML data:

XML file 1

<RESULTS>
    <ROW>
        <COLUMN NAME="ATTR1"><![CDATA[123456ABCDEF]]></COLUMN>
        <COLUMN NAME="ATTR2"><![CDATA[1.0.4.0]]></COLUMN>
        <COLUMN NAME="ATTR3"><![CDATA[Google.com]]></COLUMN>
        <COLUMN NAME="ATTR4"><![CDATA[Lorem ipsum]]></COLUMN>
        <COLUMN NAME="ATTR5"><![CDATA[This is some text]]></COLUMN>
    </ROW>
    <ROW>
        <COLUMN NAME="ATTR1"><![CDATA[123456ABCDEF]]></COLUMN>
        <COLUMN NAME="ATTR2"><![CDATA[2.0.0.1]]></COLUMN>
        <COLUMN NAME="ATTR3"><![CDATA[HelloWorld.com]]></COLUMN>
        <COLUMN NAME="ATTR4"><![CDATA[Lorem ipsum]]></COLUMN>
        <COLUMN NAME="ATTR5"><![CDATA[This is some text]]></COLUMN>
    </ROW>
    <ROW>
        <COLUMN NAME="ATTR1"><![CDATA[123456ABCDEF]]></COLUMN>
        <COLUMN NAME="ATTR2"><![CDATA[5.6.7.0]]></COLUMN>
        <COLUMN NAME="ATTR3"><![CDATA[foo_foo_6 (2).org]]></COLUMN>
        <COLUMN NAME="ATTR4"><![CDATA[Lorem ipsum]]></COLUMN>
        <COLUMN NAME="ATTR5"><![CDATA[This is some text]]></COLUMN>
    </ROW>
</RESULTS>

XML File 2

<applications xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <application>
        <name>Google.com</name>
        <version>1.2.0.0</version>
    </application>
    <application>
        <name>HelloWorld.com</name>
        <version>2.0.0.1</version>
    </application>
    <application>
        <name>FOO_FOO.org</name>
        <version>6.2.0.1</version>
    </application>
</applications>

Creating arrays/objects filled with XML data

# assign all output from `foreach` loop to `$array1` - XML file 1
$array1 = foreach($row in $xmldata1.RESULTS.ROW){
  # create new object with the pertinent details as property values
  [pscustomobject]@{
    Name = $row.COLUMN.Where{ $_.NAME -eq "ATTR3"}.'#cdata-section'
    Version = $row.COLUMN.Where{ $_.NAME -eq "ATTR2"}.'#cdata-section'
  }
}

# assign all output from `foreach` loop to `$array2` - XML file 2
$array2 = foreach($row in $xmldata2.applications.application){
  # create new object with the pertinent details as property values
  [pscustomobject]@{
    Name = $row.name
    Version = $row.version
  }
}
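
For completeness: the loops above assume `$xmldata1` and `$xmldata2` already hold parsed XML documents, which the question does not show. A minimal sketch of one way to get there, using trimmed copies of the samples in here-strings so that it runs on its own. With real files, something like `[xml](Get-Content -Raw -Path $path)` would take the place of the here-strings; `$path` is a placeholder, not a name from the original script.

```powershell
# Trimmed copies of the two samples; real data would be read from files.
[xml]$xmldata1 = @'
<RESULTS>
    <ROW>
        <COLUMN NAME="ATTR2"><![CDATA[1.0.4.0]]></COLUMN>
        <COLUMN NAME="ATTR3"><![CDATA[Google.com]]></COLUMN>
    </ROW>
    <ROW>
        <COLUMN NAME="ATTR2"><![CDATA[2.0.0.1]]></COLUMN>
        <COLUMN NAME="ATTR3"><![CDATA[HelloWorld.com]]></COLUMN>
    </ROW>
</RESULTS>
'@

[xml]$xmldata2 = @'
<applications>
    <application>
        <name>Google.com</name>
        <version>1.2.0.0</version>
    </application>
    <application>
        <name>HelloWorld.com</name>
        <version>2.0.0.1</version>
    </application>
</applications>
'@

# Child elements are reachable with dot notation once the text is cast to [xml].
$rowCount   = $xmldata1.RESULTS.ROW.Count
$firstCdata = $xmldata1.RESULTS.ROW[0].COLUMN[1].'#cdata-section'
$firstApp   = $xmldata2.applications.application[0].name
```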

This is the script I'm wondering how to write more effectively. It simply loops through $array1 and compares it with the data in $array2. If there is a match in the name, and a mismatch in the version, then it will store those values in a PS object.

Script I want to improve

#loop through array 1
for($i = 0; $i -le $array1.Length; $i++)
{
    #loop through array 2
    for($j = 0; $j -le $array2.Length; $j++)
    {
        #if file name in array 1 matches a name in array 2...
        if (($array1.name[$i] -eq $array2.name[$j]) -or ($array1.name[$i].Substring(0, [Math]::Min($array1.name[$i].Length, 7)) -eq $array2.name[$j].Substring(0, [Math]::Min($array2.name[$i].Length, 7))))
        {
            #then, if that file names version does not match the version found in array 2...
            if($array1.version[$i] -ne $array2.version[$j])
            {
                #create new object        
                [pscustomobject]@{
                    Name = $array1.name[$i]
                    Name2 = $array2.name[$j]
                    Version = $array1.version[$i]
                    Version2 = $array2.version[$j]            
                }
            }
        }
    }
}

However, there are some names that don't match perfectly. So I use the -or operator and throw this line in my first if-statement to compare the first 7 characters of the file name in each array to see if there's some kind of match (which, I know there are):

($array1.name[$i].Substring(0, [Math]::Min($array1.name[$i].Length, 7)) -eq $array2.name[$j].Substring(0, [Math]::Min($array2.name[$i].Length, 7)))

Whenever I add that line though I get the following error for only some of the data objects in the arrays. The script will return some objects, but most of the time my console pane will be filled with the following error:

Error

You cannot call a method on a null-valued expression.
At line:8 char:13
+         if (($array1.name[$i] -eq $array2.name[$j]) -or ($array1 ...
+             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvokeMethodOnNull

I don't even know what it's talking about, because when I extract that line and put actual indices in it, it works fine.

Example

if($array1.name[1020].Substring(0, [Math]::Min($array1.name[1020].Length, 7)) -eq $array2.name[2500].Substring(0, [Math]::Min($array2.name[2500].Length, 7))){

So, I'm stumped. Is there a better way to compare these two arrays and get a similar output?

Answer:

I believe this could work and might be a more direct way to do it. This method does not require constructing objects from the first XML file. Hopefully the inline comments explain the logic.

:outer foreach($i in $xml1.results.row) {
    $name = $i.Column.Where{ $_.NAME -eq 'ATTR3' }.'#cdata-section'
    $version = $i.Column.Where{ $_.NAME -eq 'ATTR2' }.'#cdata-section'
    foreach($z in $xml2.applications.application) {
        # check if they have the same version
        $sameVersion = $version -eq $z.Version
        # check if they have the same name
        $sameName = $name -eq $z.Name
        # if both conditions are `$true` we can skip this and continue with
        # next item of outer loop
        if($sameVersion -and $sameName) {
            continue outer
        }
        # if their first 7 characters are the same but they're NOT the same version
        if([string]::new($name[0..6]) -eq [string]::new($z.Name[0..6]) -and -not $sameVersion) {
            [pscustomobject]@{
                Name     = $name
                Name2    = $z.Name
                Version  = $version
                Version2 = $z.Version
            }
        }
    }
}

The result of this would be:

Name              Name2       Version Version2
----              -----       ------- --------
Google.com        Google.com  1.0.4.0 1.2.0.0
foo_foo_6 (2).org FOO_FOO.org 5.6.7.0 6.2.0.1
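
As for the error in the question, one likely cause is the loop bounds: `-le $array1.Length` lets the counter reach one past the last valid index, and indexing there returns `$null`, so the later `.Substring(...)` call fails. Hard-coded in-range indices never hit that case, which would explain why the extracted line works on its own. The `$array2.name[$i].Length` inside the second `Substring` call also looks like a typo for `$j`. A minimal sketch of the behavior, assuming strict mode is off (the default); `$names` is a stand-in array, not data from the question.

```powershell
# Assumes strict mode is off; under Set-StrictMode -Version 3.0 or later the
# out-of-range index itself raises an error instead of returning $null.
$names = 'Google.com', 'HelloWorld.com', 'FOO_FOO.org'

# With -le the counter reaches $names.Length, one past the last valid index.
$pastEnd = $names[$names.Length]
$pastEndIsNull = $null -eq $pastEnd

# Calling a method on that $null raises the error seen in the question.
$message = $null
try {
    $pastEnd.Substring(0, 7)
}
catch {
    $message = $_.Exception.Message
}

# With -lt the counter stops at the last element, so every index is valid.
$prefixes = for ($i = 0; $i -lt $names.Length; $i++) {
    $names[$i].Substring(0, [Math]::Min($names[$i].Length, 7))
}
```

Rows whose name is missing altogether would produce the same error, so a null check before calling `Substring` may still be worthwhile.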

See "Using a labeled continue in a loop" in the PowerShell about_Continue documentation, which explains the `continue outer` statement used in the nested loop of this answer.
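
If the lists are long (the indices in the question suggest thousands of entries), nested loops compare every pair. A possible variation, sketched here rather than offered as the definitive approach, indexes the second list by the first 7 characters of each name and looks up each name from the first list in that index. `Get-NamePrefix` is a hypothetical helper, and the inline arrays stand in for `$array1` and `$array2` from the question. This version drops the `continue outer` shortcut and simply reports every prefix match whose versions differ; an exact name match is also a prefix match, so it is covered by the same lookup.

```powershell
# Hypothetical helper: first 7 characters (or fewer), lower-cased; '' for null or empty names.
function Get-NamePrefix {
    param([string]$Name, [int]$Length = 7)
    if ([string]::IsNullOrEmpty($Name)) { return '' }
    $Name.Substring(0, [Math]::Min($Name.Length, $Length)).ToLowerInvariant()
}

# Stand-ins for the arrays built in the question.
$array1 = @(
    [pscustomobject]@{ Name = 'Google.com';        Version = '1.0.4.0' }
    [pscustomobject]@{ Name = 'HelloWorld.com';    Version = '2.0.0.1' }
    [pscustomobject]@{ Name = 'foo_foo_6 (2).org'; Version = '5.6.7.0' }
)
$array2 = @(
    [pscustomobject]@{ Name = 'Google.com';     Version = '1.2.0.0' }
    [pscustomobject]@{ Name = 'HelloWorld.com'; Version = '2.0.0.1' }
    [pscustomobject]@{ Name = 'FOO_FOO.org';    Version = '6.2.0.1' }
)

# Index the second list by prefix; each key maps to a list of entries.
$byPrefix = @{}
foreach ($app in $array2) {
    $key = Get-NamePrefix $app.Name
    if ($key -eq '') { continue }   # skip entries without a usable name
    if (-not $byPrefix.ContainsKey($key)) {
        $byPrefix[$key] = [System.Collections.Generic.List[object]]::new()
    }
    $byPrefix[$key].Add($app)
}

# One lookup per item in the first list instead of a full inner loop.
$mismatches = foreach ($item in $array1) {
    $key = Get-NamePrefix $item.Name
    if ($key -eq '' -or -not $byPrefix.ContainsKey($key)) { continue }
    foreach ($app in $byPrefix[$key]) {
        if ($item.Version -ne $app.Version) {
            [pscustomobject]@{
                Name     = $item.Name
                Name2    = $app.Name
                Version  = $item.Version
                Version2 = $app.Version
            }
        }
    }
}
```

On the sample data, `$mismatches` holds the same two pairs as the table above.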
