Home > Software design >  Powershell compare two datasets using arrays or hashtable
Powershell compare two datasets using arrays or hashtable

Time:10-02

enter image description hereUsing PowerShell I have two compare two datasets one is from CSV file and other is from a sql query result:

Sql query result:

AppName Metadata1 Metadata2
App1 ab3245 1235
App1 ab4533 4645
App2 ab1234 4101
App3 ab4245 4101
App2 ab5245 6534
App4 ab5245 2312
App5 ab3245 1235

CSV data:

Metadata1 Metadata2
ab4245 7890
ab3245 3412
ab3245 1235
ab4245 7777
ab4533 4645
ab4533 2345
ab5245 1929
ab4533 4645
ab9988 3344
ab1234 4102
ab9988 2233
ab1234 4103

I need to compare for each AppName how the metadata1 & Metadata2 values are different with CSV file?

I tried storing data in two separate arrays and then tried comparing as:

Compare-Object -ReferenceObject $sqlarray -DifferenceObject $csvarray

But above code just returns the differences between two arrays.

Updating expected output:

This is to identify any changes available in CSV file but missing in DB. Expected output for every App’s name metadata1 & metadata2 when compared with CSV file metadata1 & metadata2 only difference should display like below table:

AppName Metadata1 Metadata2
App1 ab3245 3412
App1 ab3245 1235
App1 ab4533 2345
App2 ab1234 4102
App2 ab1234 4103
App2 ab5245 1929
App3 ab4245 7890
App3 ab4245 7777
App4 ab4245 7890
App4 ab4245 7777
App5 ab3245 3412

CodePudding user response:

Work with them like .NET objects.
If you are using Invoke-SqlCmd for the SQL result, and Import-Csv for the CSV result, and the parameters are correct, you should end with two arrays of PSObjects or PSCustomObjects.
You can run a foreach loop and compare each property between the two results.

Something like this:

$sqlqueryresult = Invoke-Sqlcmd -ServerInstance 'YOURMSSQLINSTANCE.domain.com' -Database 'SomeDatabase' -Query "Select * From SomeTable Where SomeValue = 'SomeValue'"
$csvdata = Import-Csv -Path 'C:\Path\To\Your\file.csv' -Delimiter ';' # Use the delimiter you have in your CSV file

foreach ($appdata in $sqlqueryresult) {
    $csvapp = $csvdata | Where-Object { $PSItem.AppName -eq $appdata.AppName }
    if (($appdata.Metadata1 -eq $csvapp.Metadata1) -and ($appdata.Metadata2 -eq $csvapp.Metadata2)) {
        return $true
    }
    else {
        return $false
    }
}

CodePudding user response:

As per my comment, using the datasets you provided.

'Metadata1  Metadata2
ab4245  7890
ab3245  3412
ab3245  1235
ab4245  7777
ab4533  4645
ab4533  2345
ab5245  1929
ab4533  4645
ab9988  3344
ab1234  4102
ab9988  2233
ab1234  4103' | 
Out-File -FilePath 'D:\Temp\CsvDataList'


'AppName    Metadata1   Metadata2
App1    ab3245  1235
App1    ab4533  4645
App2    ab1234  4101
App3    ab4245  4101
App2    ab5245  6534
App4    ab5245  2312
App5    ab3245  1235'| 
Out-File -FilePath 'D:\Temp\SqlDataList'


   
'Metadata1  Metadata2
ab3245  1235
ab4533  4645
ab1234  4101
ab4245  4101
ab5245  6534
ab5245  2312
ab3245  1235'| 
Out-File -FilePath 'D:\Temp\SqlDataListNoAppName'

$CsvMetaList = (Import-Csv -Path 'D:\Temp\CsvDataList' -Delimiter "`t")
$SqlMetaList = (Import-Csv -Path 'D:\Temp\SqlDataList' -Delimiter "`t") 
$SqlMetaList = (Import-Csv -Path 'D:\Temp\SqlDataListNoAppName' -Delimiter "`t") 

Using Compare-Object of the arrays.

Compare-Object -ReferenceObject $CsvMetaList -DifferenceObject $SqlMetaList -IncludeEqual
# Results
<#
InputObject                         SideIndicator
-----------                         -------------
@{Metadata1=ab4245; Metadata2=7890} ==           
@{Metadata1=ab3245; Metadata2=3412} ==           
@{Metadata1=ab3245; Metadata2=1235} ==           
@{Metadata1=ab4245; Metadata2=7777} ==           
@{Metadata1=ab4533; Metadata2=4645} ==           
@{Metadata1=ab4533; Metadata2=2345} ==           
@{Metadata1=ab5245; Metadata2=1929} == 
     
@{Metadata1=ab4533; Metadata2=4645} <=           
@{Metadata1=ab9988; Metadata2=3344} <=           
@{Metadata1=ab1234; Metadata2=4102} <=           
@{Metadata1=ab9988; Metadata2=2233} <=           
@{Metadata1=ab1234; Metadata2=4103} <= 
#>

Compare-Object -ReferenceObject $SQlMetaList -DifferenceObject $CsvMetaList -IncludeEqual
# Results
<#
@{AppName=App1; Metadata1=ab3245; Metadata2=1235} ==           
@{AppName=App1; Metadata1=ab4533; Metadata2=4645} ==           
@{AppName=App2; Metadata1=ab1234; Metadata2=4101} ==           
@{AppName=App3; Metadata1=ab4245; Metadata2=4101} ==           
@{AppName=App2; Metadata1=ab5245; Metadata2=6534} ==           
@{AppName=App4; Metadata1=ab5245; Metadata2=2312} ==           
@{AppName=App5; Metadata1=ab3245; Metadata2=1235} ==
           
@{Metadata1=ab4533; Metadata2=4645}               =>           
@{Metadata1=ab9988; Metadata2=3344}               =>           
@{Metadata1=ab1234; Metadata2=4102}               =>           
@{Metadata1=ab9988; Metadata2=2233}               =>           
@{Metadata1=ab1234; Metadata2=4103}               =>
#>

Point of note: As expected, removing the AppName attribute via the SqlDataListNoAppName results are the same as the above, without the AppName of course.

Ignoring the AppName attribute.

ForEach ($CsvRecord in $CsvMetaList)
{
    If (($SqlMetaList.Metadata1 -notcontains $CsvRecord.Metadata1) -and ($SqlMetaList.Metadata2 -notcontains $CsvRecord.Metadata2))
    {$CsvRecord}
}
# Results
<#
Metadata1 Metadata2
--------- ---------
ab9988    3344     
ab9988    2233      
#>


ForEach ($CsvRecord in $CsvMetaList)
{
    If ($SqlMetaList.Metadata1 -notcontains $CsvRecord.Metadata1)
    {$CsvRecord}
}
# Results
<#
Metadata1 Metadata2
--------- ---------
ab9988    3344     
ab9988    2233  
#>


ForEach ($CsvRecord in $CsvMetaList)
{
    If ($SqlMetaList.Metadata2 -notcontains $CsvRecord.Metadata2)
    {$CsvRecord}
}
# Results
<#

Metadata1 Metadata2
--------- ---------
ab4245    7890     
ab3245    3412     
ab4245    7777     
ab4533    2345     
ab5245    1929     
ab9988    3344     
ab1234    4102     
ab9988    2233     
ab1234    4103 
#>

Combining the two array lists to map an Sql AppName to the Csv then evaluate attributes Metadata1 and Metadata12, then just Metadata1.

Comparing both attributes.

$CsvMetaList |
ForEach-Object{
    $CsvRow = $PSItem

    if(
        $SqlRow = $SqlMetaList | 
                Where-Object{($PSItem.Metadata1 -eq $CsvRow.Metadata1) -and ($PSItem.Metadata2 -eq $CsvRow.Metadata2)}
        )
    {
        (
            $out = [pscustomobject]@{ 
                        AppName   = $SqlRow.AppName
                        Metadata1 = $CsvRow.Metadata1 
                        Metadata2 = $CsvRow.Metadata2
                    }
        )
    }
} | 
Sort-Object -Property AppName, Metadata1
# Results
<#
AppName      Metadata1 Metadata2
-------      --------- ---------
App1         ab4533    4645     
App1         ab4533    4645     
{App1, App5} ab3245    1235 
#>

Comparing one attribute.

Clear-Host
$CsvMetaList |
ForEach-Object{
    $CsvRow = $PSItem

    if(
        $SqlRow = $SqlMetaList | 
                Where-Object{$PSItem.Metadata1 -eq $CsvRow.Metadata1}
        )
    {
        (
            $out = [pscustomobject]@{ 
                        AppName   = $SqlRow.AppName
                        Metadata1 = $CsvRow.Metadata1 
                        Metadata2 = $CsvRow.Metadata2
                    }
        )
    }
} | 
Sort-Object -Property AppName, Metadata1
# Results
<#
AppName      Metadata1 Metadata2
-------      --------- ---------
App1         ab4533    4645     
App1         ab4533    2345     
App1         ab4533    4645     
{App1, App5} ab3245    3412     
{App1, App5} ab3245    1235     
App2         ab1234    4102     
App2         ab1234    4103     
{App2, App4} ab5245    1929     
App3         ab4245    7890     
App3         ab4245    7777  
#>

Note: In this scenario, one gets the Metadata1 mapping as expected, but that Metadata2, is not completely valid.

Hopefully, this illustrates the points I an others are trying to make in the comments.

  • Related