I've created a script to delete duplicates in large mailboxes.
These mailboxes have many duplicates because we imported mail archives into them.
For that, I want to use the Group-Object cmdlet to collect all duplicates, then keep only one item in each group.
But running this line on a folder of 50k items (I have mailboxes with folders of 120k items) produces the error "Insufficient memory to continue the execution of the program."
I don't need to post the entire script. I ran just the lines below and got the error after a few minutes.
Details:
The command:
$user = '[email protected]'
$outlook = New-Object -com Outlook.Application
$namespace = $outlook.GetNamespace("MAPI")
$mailbox = $namespace.Stores | ? {$_.displayname -like $user}
$global:mailboxRoot = $mailbox.GetRootFolder()
$bb = $mailboxRoot.Folders[1].Items | Group-Object -Property senton, subject
Machine Memory: 32GB
Outlook memory usage when the error occurs: around 520-550 MB
WSManConfig: Microsoft.WSMan.Management\WSMan::localhost\Shell

Type            Name                     SourceOfValue   Value
----            ----                     -------------   -----
System.String   AllowRemoteShellAccess   GPO             true
System.String   IdleTimeout                              7200000
System.String   MaxConcurrentUsers                       2147483647
System.String   MaxShellRunTime                          2147483647
System.String   MaxProcessesPerShell                     2147483647
System.String   MaxMemoryPerShellMB                      2147483647
System.String   MaxShellsPerUser                         2147483647
CodePudding user response:
The error you received is generated by Outlook, not PowerShell. The memory limits of Outlook depend on the installed version (2013, O365, 2007, etc.) and whether it is 32-bit or 64-bit.
One problem you've got is that you're storing the entire message item (including the body) in memory. Try selecting only the properties you need - for example:
$inbox = $mailboxRoot.Folders[2].Items | select -Property SentOn,Subject,EntryID
Then group them after the query is done, saving only groups with duplicates:
$grouped = $inbox | Group-Object SentOn,subject | Where-Object Count -gt 1
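To make the result of that grouping concrete, here is a minimal sketch that runs without Outlook. The three sample objects and their EntryID values are made up for illustration; they only stand in for the thin copies that the select above would produce. The variable $extraIds is likewise a hypothetical name.

```powershell
# Hypothetical stand-ins for the thin copies (SentOn, Subject, EntryID only).
$inbox = @(
    [pscustomobject]@{ SentOn = [datetime]'2020-01-01 10:00:00'; Subject = 'Report'; EntryID = 'A1' }
    [pscustomobject]@{ SentOn = [datetime]'2020-01-01 10:00:00'; Subject = 'Report'; EntryID = 'A2' }
    [pscustomobject]@{ SentOn = [datetime]'2020-01-02 09:30:00'; Subject = 'Hello';  EntryID = 'B1' }
)

# Keep only the groups that contain more than one item.
$grouped = $inbox | Group-Object SentOn, Subject | Where-Object Count -gt 1

# Keep the first item of every group and collect the EntryIDs of the rest.
$extraIds = @(
    $grouped |
        ForEach-Object { $_.Group | Select-Object -Skip 1 } |
        ForEach-Object { $_.EntryID }
)

$extraIds   # for the sample data above this is just 'A2'
```

Each group's Group property holds the original thin copies, so nothing besides those three properties needs to stay in memory while the duplicates are being identified.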
For some reason, getting the items and grouping them in one pipeline seemed to cause Outlook to balloon in memory usage - maybe it keeps something open too long?
And when you've completed your task for that mailbox, it's probably worth closing Outlook before starting a new search:
$outlook.Quit()
Get-Process 'outlook' | Stop-Process
This saved me an enormous amount of memory. I saw no real memory usage increase during the script run on my 10k-item inbox folder, though it did take a good 20 minutes...
I'll note that the sent time is only accurate to the second. In my 10k items, I had 70 "duplicates" that were really automated reports with the same subject and the same send time. Maybe that's fine for what you're doing, but it's worth being careful of.
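Because EntryID is among the selected properties, the thin copies can in principle be mapped back to the real items without rescanning the folder. The sketch below assumes the namespace object's GetItemFromID method (part of the Outlook object model) and calls Delete() on whatever it returns. The function name Remove-ItemsByEntryId and its parameters are hypothetical, and I have not tested it against a 100k-item folder.

```powershell
# Sketch only: look up each real item by its EntryID and delete it.
# Remove-ItemsByEntryId is a made-up helper name, not an existing cmdlet.
function Remove-ItemsByEntryId {
    param(
        $Namespace,            # e.g. $outlook.GetNamespace("MAPI")
        [string[]]$EntryIds    # EntryIDs of the items to remove
    )
    $deleted = 0
    foreach ($id in $EntryIds) {
        try {
            # Assumption: the ID still resolves in the current session.
            $item = $Namespace.GetItemFromID($id)
            $item.Delete()
            $deleted++
        } catch {
            Write-Warning "Could not delete item ${id}: $_"
        }
    }
    return $deleted
}
```

The IDs to pass in would be those of every item except the first in each group of $grouped, for example Remove-ItemsByEntryId -Namespace $namespace -EntryIds $ids. It is probably wise to try this on a test folder first, since the choice of which copy to keep is simply "the first one in the group".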
CodePudding user response:
Since my goal is to delete those messages, I'm afraid (correct me if I'm wrong) that finding duplicates in $inbox will not delete the real mail, only the thin copy with the few attributes that was placed in $inbox.
In that case, I'll have to find the real message matching the thin copy and delete it. In a folder with 100k items, it can take a few minutes to find each duplicated mail, like this:
# Find the real items matching the thin copy $x (this rescans the whole folder).
$duplicates = @($folder.Items | ?{$_.SentOn -eq $x.SentOn -and $_.Subject -eq $x.Subject})
# Prefer deleting an Enterprise Vault shortcut if the set contains one.
$shortcuts = @($duplicates | ?{$_.MessageClass -eq "IPM.Note.EnterpriseVault.Shortcut"})
if ($shortcuts.Count -gt 0) {
    $shortcuts[-1].Delete()
} else {
    $duplicates[-1].Delete()
}
This is my current code, and I found that deleting some 10-20k duplicates from each mailbox can take 2 weeks, maybe more.
Also, is there no way to increase the memory available to Outlook, if that is the program running out of memory? I can allocate 25-28 GB of memory for it.