Get a count of the duplicates in a GenericList

Time:08-09

I seem to have the darnedest time understanding predicates in .NET methods used in PowerShell. Given...

$list = [System.Collections.Generic.List[string]]::new()
$list.AddRange([System.Collections.Generic.List[string]]@('1', '2', '3', '2', '1', '1'))

I thought $list.FindAll({$PSItem -match '1'}) would work, but no. For what it's worth, the goal is to get a count of all the duplicates. I can find the duplicates like this

$duplicates = Compare-Object $list ($list | Select-Object -Unique) | 
              Where-Object SideIndicator -eq '<=' | 
              Select-Object -ExpandProperty InputObject

But when I try to get the count like this

$duplicates.Where({$PSItem -match '1'}).Count

I am off by one. I think that is because Compare-Object compares the full list to the unique list, which is just 1, 2, 3, so the first occurrence of each value is not counted as a duplicate? So .Count + 1 would be an option?

Anyway, two questions really. What does my predicate need to be for .FindAll()? And is .Count + 1 the best approach if using the Compare-Object approach?

To some extent I am trying to understand my options that don't involve the pipeline, but when there is no alternative, using the pipeline is OK.

EDIT: So, a little more searching led me to add a Group-Object to get counts in $duplicates, then a Sort-Object since Group-Object is default sorted on the count. So, this gets me the ultimate output I want.

$duplicates = Compare-Object $list ($list | Select-Object -Unique) | 
              Where-Object SideIndicator -eq '<=' | 
              Select-Object -ExpandProperty InputObject |
              Group-Object -noelement |
              Sort-Object  -Property Name
$duplicatesForLog = $duplicates.ForEach({"$($PSItem.Name)($($PSItem.Count + 1))"})
$duplicatesForLog -join ', '

Answer:

Abraham Zinala correctly points out that Group-Object is sufficient to find duplicates:

(
  $list |
    Group-Object |
    Where-Object Count -gt 1 |
    ForEach-Object { '{0}({1})' -f $_.Name, $_.Count }
) -join ', '
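
Since the question also asks about options that avoid the pipeline, here is a minimal sketch of the same tally done with a plain foreach loop. It is not part of the original answer; the variable names ($counts, $report, $result) are illustrative, and SortedDictionary is chosen here only so the output comes out ordered by value.

```powershell
# Sample data from the question.
[System.Collections.Generic.List[string]] $list = @('1', '2', '3', '2', '1', '1')

# Tally occurrences per value without the pipeline.
# SortedDictionary keeps its keys ordered, so no separate sort step is needed.
$counts = [System.Collections.Generic.SortedDictionary[string, int]]::new()
foreach ($item in $list) {
    if ($counts.ContainsKey($item)) {
        $counts[$item] = $counts[$item] + 1
    }
    else {
        $counts[$item] = 1
    }
}

# Keep only values that occur more than once, formatted as value(count).
$report = foreach ($key in $counts.Keys) {
    if ($counts[$key] -gt 1) { '{0}({1})' -f $key, $counts[$key] }
}

$result = $report -join ', '
$result   # 1(3), 2(2)
```

Note that this reports the total number of occurrences, as the Group-Object solution does, so no + 1 correction is needed.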

As for:

What does my predicate need to be for .FindAll()

.FindAll() - unlike the intrinsic .Where() method - is a type-native method of System.Collections.Generic.List`1, which knows nothing about pipeline-related automatic PowerShell variables such as $_ aka $PSItem.
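
For contrast, a short sketch of the intrinsic .Where() method, where $_ is bound to each element. It uses -eq rather than -match for an exact comparison; $viaWhere is an illustrative name.

```powershell
[System.Collections.Generic.List[string]] $list = @('1', '2', '3', '2', '1', '1')

# .Where() is supplied by PowerShell itself, so $_ / $PSItem refers to the current element.
$viaWhere = $list.Where({ $_ -eq '1' })

$viaWhere.Count   # 3
```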

The script block that is used as a System.Predicate[string] instance in this case receives each element of the list as a positional argument. As usual, you can refer to this argument via the automatic $args variable (an array, so the element is $args[0]) or by formally declaring a parameter to bind the argument to. Thus, the following two solutions are equivalent:

[System.Collections.Generic.List[string]] $list = @('1', '2', '3', '2', '1', '1')

# Use $args; the list element is the first (and only) argument
$list.FindAll({ $args[0] -eq '1' })

# Use a declared parameter
$list.FindAll({ param([string] $obj) $obj -eq '1' })

Both .FindAll() calls return a [System.Collections.Generic.List[string]] containing '1', '1', '1'.
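
To get the count the question asks about, the returned list's .Count property can be read directly. The sketch below also stores the predicate in a variable with an explicit [System.Predicate[string]] cast; the cast is optional, since PowerShell converts a script block automatically when it is passed to .FindAll(). The names $isOne and $ones are illustrative.

```powershell
[System.Collections.Generic.List[string]] $list = @('1', '2', '3', '2', '1', '1')

# Explicit cast shown for clarity; passing the bare script block works as well.
$isOne = [System.Predicate[string]] { param([string] $item) $item -eq '1' }

# The same predicate can be reused with other List<T> methods that take a Predicate<T>.
$ones = $list.FindAll($isOne)

$ones.Count            # 3
$list.Exists($isOne)   # True
```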
