I have written a script that extracts all the duplicate emails in a WS2012 AD.
The script collects all the users with duplicate emails in a pscustomobject.
What I want to achieve now is to drop the first entry for each email and keep all the later duplicates, so I can build dummy emails for the duplicates. The first entry keeps its original email, so every user in the AD ends up with a unique email, even if it is a dummy one.
This is an example of what the pscustomobject exported into CSV looks like:
Name Telephone Email Department
Max Smith 12345 [email protected] Billing
Max Jones 6789 [email protected] Facility
James Adams 52585 [email protected] Import
James Jones 46844 [email protected] Service
James Bones 68315 [email protected] Management
What I need to build out of the above is:
Name Telephone Email Department
Max Jones 6789 [email protected] Facility
James Jones 46844 [email protected] Service
James Bones 68315 [email protected] Management
The first email entry is gone, all the duplicates are still there.
The dummy email would be [email protected] like [email protected] for James Jones.
I keep failing to build a pscustomobject that leaves out the first entry for each email and contains only the duplicates.
I hope the Wizards of Stack Overflow can help me.
Thank You and Best Regards.
CodePudding user response:
In order to only "keep the duplicates", you need to keep track of email addresses you've already seen before.
For this, I'd recommend using a HashSet<string>: a set only contains distinct values and is very fast at determining whether a given value is already a member, which makes it ideal for this use case.
In the following, I assume that $data contains an array of pscustomobjects as described in the question:
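# Tracks every email address encountered so far
$alreadySeen = [System.Collections.Generic.HashSet[string]]::new()
# Add() returns $false for a repeat, so only second and later occurrences pass
$duplicatesOnly = $data | Where-Object { -not $alreadySeen.Add($_.Email) }
# -NoTypeInformation suppresses the #TYPE header that Windows PowerShell writes by default
$duplicatesOnly | Export-Csv -Path path\to\output.csv -NoTypeInformation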
The first time you add a unique value to the set, Add() will return $true, but subsequent attempts to add the same value will return $false, meaning our Where-Object filter will only let through objects whose Email column has already been seen at least once before.
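To see the filter in action, here is a minimal sketch run against rows shaped like the ones in the question. The email addresses are made-up placeholders, since the real ones are not visible in the question.

```powershell
# Hypothetical sample rows; the email addresses are placeholders, not real data.
$data = @(
    [pscustomobject]@{ Name = 'Max Smith';   Telephone = '12345'; Email = 'max@example.com';   Department = 'Billing' }
    [pscustomobject]@{ Name = 'Max Jones';   Telephone = '6789';  Email = 'max@example.com';   Department = 'Facility' }
    [pscustomobject]@{ Name = 'James Adams'; Telephone = '52585'; Email = 'james@example.com'; Department = 'Import' }
    [pscustomobject]@{ Name = 'James Jones'; Telephone = '46844'; Email = 'james@example.com'; Department = 'Service' }
    [pscustomobject]@{ Name = 'James Bones'; Telephone = '68315'; Email = 'james@example.com'; Department = 'Management' }
)

$alreadySeen = [System.Collections.Generic.HashSet[string]]::new()

# Add() returns $true for a new value and $false for a repeat,
# so -not keeps only the second and later occurrences of each email.
$duplicatesOnly = @($data | Where-Object { -not $alreadySeen.Add($_.Email) })

# Lists Max Jones, James Jones and James Bones; the first entry per email is dropped.
$duplicatesOnly | Format-Table Name, Email, Department
```

Because the pipeline processes rows in order, which entry counts as "the first" depends on the order of $data, so sort it beforehand if a particular user should keep the original address.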
If the emails are not uniformly cased, supply a case-insensitive string comparer when creating the hashset:
$alreadySeen = [System.Collections.Generic.HashSet[string]]::new([System.StringComparer]::InvariantCultureIgnoreCase)
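As a rough illustration of the difference the comparer makes, the sketch below adds two hypothetical addresses that differ only in case to a default set and to a case-insensitive one:

```powershell
$caseSensitive   = [System.Collections.Generic.HashSet[string]]::new()
$caseInsensitive = [System.Collections.Generic.HashSet[string]]::new([System.StringComparer]::InvariantCultureIgnoreCase)

# Hypothetical addresses that differ only in casing
$first  = 'Max@Example.com'
$second = 'max@example.com'

# Seed both sets with the first spelling; the return values are not needed here
$null = $caseSensitive.Add($first)
$null = $caseInsensitive.Add($first)

# The default comparer treats the second spelling as a new value
$sensitiveResult = $caseSensitive.Add($second)

# The case-insensitive comparer recognizes it as a repeat
$insensitiveResult = $caseInsensitive.Add($second)

"Default comparer, Add() returned:          $sensitiveResult"
"Case-insensitive comparer, Add() returned: $insensitiveResult"
```

With the default set, the second address would be treated as unique and would not show up among the duplicates.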