Home > Blockchain >  PowerShell comparing two DataTables - issue with filter on two columns
PowerShell comparing two DataTables - issue with filter on two columns

Time:11-05

Issue: When I try compare two data tables against two columns (show where column A=column A, column B is not present), I can't seem to get a match.

[Sample data below]

How can I match [only the groups that both tables have in common] and show [who is missing from the matched groups]? When I try, I can only show who is missing from any of the groups (aka Charlie). I also need Bob to appear in group B.

To reproduce the issue - run this snippet in PowerShell:

## Live data table
# create "live" data:
$dtLive = New-Object System.Data.DataTable
$dtLive.Columns.Add("GroupEmailAddress") 
$dtLive.Columns.Add("MemberEmailAddress")

# add Alex to both groups, add Bob to just A-Group
$row = $dtLive.NewRow()
$row.GroupEmailAddress = "[email protected]"
$row.MemberEmailAddress = "[email protected]"
$dtLive.Rows.Add($Row)

$row = $dtLive.NewRow()
$row.GroupEmailAddress = "[email protected]"
$row.MemberEmailAddress = "[email protected]"
$dtLive.Rows.Add($Row)

$row = $dtLive.NewRow()
$row.GroupEmailAddress = "[email protected]"
$row.MemberEmailAddress = "[email protected]"
$dtLive.Rows.Add($Row)

## Updated data table
# mimic existing "live" data:
$dtUpdated = New-Object System.Data.DataTable
$dtUpdated.Columns.Add("GroupEmailAddress") 
$dtUpdated.Columns.Add("MemberEmailAddress")

# add Alex to both groups, add Bob to just A-Group
$row = $dtUpdated.NewRow()
$row.GroupEmailAddress = "[email protected]"
$row.MemberEmailAddress = "[email protected]"
$dtUpdated.Rows.Add($Row)

$row = $dtUpdated.NewRow()
$row.GroupEmailAddress = "[email protected]"
$row.MemberEmailAddress = "[email protected]"
$dtUpdated.Rows.Add($Row)

$row = $dtUpdated.NewRow()
$row.GroupEmailAddress = "[email protected]"
$row.MemberEmailAddress = "[email protected]"
$dtUpdated.Rows.Add($Row)

## NOW THE ACTUAL UPDATES
# add Bob to A-Group
$row = $dtUpdated.NewRow()
$row.GroupEmailAddress = "[email protected]"
$row.MemberEmailAddress = "[email protected]"
$dtUpdated.Rows.Add($Row)

# add new person Charlie to A-Group
$row = $dtUpdated.NewRow()
$row.GroupEmailAddress = "[email protected]"
$row.MemberEmailAddress = "[email protected]"
$dtUpdated.Rows.Add($Row)

# add new person Dan to (new) C-Group
$row = $dtUpdated.NewRow()
$row.GroupEmailAddress = "[email protected]"
$row.MemberEmailAddress = "[email protected]"
$dtUpdated.Rows.Add($Row)

This snippet shows any entries that share the same group names:

$dtUpdated | Where-Object {($_.GroupEmailAddress -in $dtLive.GroupEmailAddress)}

results:

GroupEmailAddress   MemberEmailAddress
-----------------   ------------------
[email protected] [email protected]
[email protected] [email protected]
[email protected] [email protected]
[email protected] [email protected]
[email protected] [email protected]

This snippet shows ONLY Member email addresses that are missing in the LIVE data table (irrespective of group membership):

$dtUpdated | Where-Object {($_.MemberEmailAddress -notin $dtLive.MemberEmailAddress)}

result:

GroupEmailAddress   MemberEmailAddress
-----------------   ------------------
[email protected] [email protected]
[email protected] [email protected]

Since I want to match on groups for what is missing, we don't want Dan (group C is not in the live table). We want Charlie (group A) and Bob (group B)

If I run something like this:

$dtUpdated | Where-Object {($_.GroupEmailAddress -in $dtLive.GroupEmailAddress) -and ($_.MemberEmailAddress -notin $dtLive.MemberEmailAddress)}

or

$dtUpdated | Where-Object {($_.GroupEmailAddress -in $dtLive.GroupEmailAddress)} | Where-Object {($_.MemberEmailAddress -notin $dtLive.MemberEmailAddress)}

I end up filtering to just Charlie (new to any group). How do I include that Bob in group B is missing?

Thanks!!

CodePudding user response:

If I understand correctly this should work, basically the new rows added to $dtLive using the examples tables would be:

[email protected] -> [email protected]
[email protected] -> [email protected]
[email protected] -> [email protected]

The code would be:

$map = $dtLive.Rows | Group-Object GroupEmailAddress -AsHashTable -AsString
foreach($row in $dtUpdated.Rows) {
    # if this `GroupEmailAddress` doesn't exist in `$dtLive`
    if($row.GroupEmailAddress -notin $dtLive.GroupEmailAddress) {
        $dtLive.ImportRow($row)
        continue
    }
    # if this `MemberEmailAddress` doesn't exist in the Group Of `GroupEmailAddress`
    if($row.MemberEmailAddress -notin $map[$row.GroupEmailAddress].MemberEmailAddress) {
        $dtLive.ImportRow($row)
    }
}

And the final output when inspecting $dtLive:

GroupEmailAddress   MemberEmailAddress
-----------------   ------------------
[email protected] [email protected]
[email protected] [email protected]
[email protected] [email protected]
[email protected] [email protected]
[email protected] [email protected]
[email protected] [email protected]
  • Related