Keep the largest subfolder by deleting subfolders having the same number of identical words except f

Time:01-09

I have a folder structure like this:

carlo
  |
   --- Stephen King
            |
             -- 22_11_63 (19989)
                22_11_63 (39129)
                22_11_63 (59966)
            |
             -- IT (589)
                IT (2893)
    
  |
   -- A. Ribaudo
           |
            -- Guidare sicuri (3859)
               Guidare sicuri (39068)

For example, 22_11_63 (19989) is 2 MB, 22_11_63 (39129) is 1 MB, and the last one is 500 KB. The same applies to A. Ribaudo: each subfolder has a different size.
I want to keep only the subfolder with the largest size.

I use this script, but it can only move missing subfolders from one parent folder, folder2 (sergio), to another parent folder, folder1 (carlo).

    # Set the path to the two folders to compare
    $folder1 = "C:\Users\Administrator\Desktop\confronto\carlo"
    $folder2 = "C:\Users\Administrator\Desktop\confronto\sergio"
    
    # Get the list of folders in each folder
    $folders1 = Get-ChildItem -Path $folder1 -Directory
    $folders2 = Get-ChildItem -Path $folder2 -Directory
    
    # Loop through the folders in $folder2
    $folders2 | ForEach-Object {
        $folder2Name = $_.Name
        $folder2Path = $_.FullName
        $folder1Path = Join-Path -Path $folder1 -ChildPath $folder2Name
    
        # Check if the folder exists in $folder1
        if (Test-Path -Path $folder1Path) {
            # The folder exists in both folders, so check if it has the same subfolders
            $subfolders2 = Get-ChildItem -Path $folder2Path -Directory
            $subfolders1 = Get-ChildItem -Path $folder1Path -Directory
            $missingIn1 = Compare-Object -ReferenceObject $subfolders1 -DifferenceObject $subfolders2 -Property Name -PassThru
    
            # Move the missing subfolders to $folder1
            $missingIn1 | ForEach-Object {
                $subfolderSource = $_.FullName
                $subfolderDest = Join-Path -Path $folder1Path -ChildPath $_.Name
                Copy-Item -Path $subfolderSource -Destination $subfolderDest -Recurse
            }
    
            
    
            # Remove duplicate subfolders from $folder1
            $subfolders1 = Get-ChildItem -Path $folder1Path -Directory
            $subfoldersToKeep = @()
            $subfolders1 | Group-Object -Property Name | ForEach-Object {
                $subfoldersGroup = $_.Group
                $largestSubfolder = $subfoldersGroup | Sort-Object -Property Length -Descending | Select-Object -First 1
                $subfoldersToKeep  = $largestSubfolder
            }
            $subfolders1 | Where-Object { $subfoldersToKeep -notcontains $_ } | Remove-Item -Recurse
        }
    }

This is the final output that I expect

carlo
  |
   --- Stephen King
            |
             -- 22_11_63 (19989)
                
            |
             --  IT (2893)

  |
   -- A. Ribaudo
           |
            -- Guidare sicuri (3859)

As you can see, I kept only the subfolder with the largest size and deleted the rest. The subfolder names in each group are identical except for the text in the brackets.

CodePudding user response:

This answer assumes the scenario meets two criteria:

  1. The depth of the folders to search from carlo is 2.
  2. Sibling folders have the same name except for the letters/digits in parentheses (..).

Here's a solution I can propose based on that:

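    # Depth-2 folders, grouped per parent folder by the name in front of " ("
    Get-ChildItem -Path 'C:\Users\Administrator\Desktop\confronto\carlo\*\*' -Directory |
        Group-Object -Property { $_.Parent.FullName }, { ($_.Name -split ' \(')[0] } |
        ForEach-Object -Process {
            # Largest total file size first; every folder after the first is removed
            $_.Group |
                Sort-Object -Property { (Get-ChildItem -LiteralPath $_.FullName -Recurse -File | Measure-Object -Property Length -Sum).Sum } -Descending |
                Select-Object -Skip 1 | Remove-Item -Recurse -WhatIf
        }

Since the subfolders share a naming convention, you can group on the text before the parenthesis: splitting on ' (' and taking the first element, ($_.Name -split ' \(')[0], turns 22_11_63 (19989) into 22_11_63, which gives a proper grouping of the objects. Name is used rather than BaseName so that a dot inside a folder name is not mistaken for a file extension. The second grouping key, $_.Parent.FullName, keeps identically named folders under different parents apart.

Sort-Object collects all of its input before emitting anything, so it can order each group by a script block that computes the folder's total size: (Get-ChildItem -LiteralPath $_.FullName -Recurse -File | Measure-Object -Property Length -Sum).Sum. The file sizes have to be summed because .Length on the array that Get-ChildItem returns is the number of items, not a size in bytes. Specifying -Descending places the largest folder at the top. Finally, Select-Object -Skip 1 drops that first object (the largest folder) and passes the rest to Remove-Item for deletion; -Recurse lets Remove-Item delete folders that still contain files without prompting.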
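
To check the grouping key in isolation before touching any folders, the same split can be run on plain strings. This is only a sketch: the names are the ones from the question, and `$names` and `$groups` are variable names introduced here for illustration.

```powershell
# Folder names copied from the question
$names = '22_11_63 (19989)', '22_11_63 (39129)', '22_11_63 (59966)', 'IT (589)', 'IT (2893)'

# Group on the text in front of " (", exactly as the key in the pipeline does
$groups = $names | Group-Object -Property { ($_ -split ' \(')[0] }

# Print one line per group with the number of names that share the key
$groups | ForEach-Object { '{0}: {1} folder(s)' -f $_.Name, $_.Count }
```

With these five names there should be two groups, 22_11_63 with three members and IT with two.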


Remove the -WhatIf safety parameter once you've confirmed that the reported deletions are the results you are after.
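
If you would rather rehearse the real deletion than rely on -WhatIf alone, one option is to rebuild the layout in a throwaway folder and run the pipeline there. The following is a minimal sketch under a few assumptions: "largest" means the sum of all file sizes inside a folder, and `$root`, `$layout`, the `book.bin` file name and the byte counts are all made up for the demonstration.

```powershell
# Throwaway root under the temp directory (hypothetical location)
$root = Join-Path ([System.IO.Path]::GetTempPath()) ('carlo-demo-' + [guid]::NewGuid())

# Layout from the question; the byte counts are invented stand-ins for the real sizes
$layout = @(
    @{ Author = 'Stephen King'; Title = '22_11_63 (19989)';      Bytes = 3000 }
    @{ Author = 'Stephen King'; Title = '22_11_63 (39129)';      Bytes = 2000 }
    @{ Author = 'Stephen King'; Title = '22_11_63 (59966)';      Bytes = 1000 }
    @{ Author = 'Stephen King'; Title = 'IT (589)';              Bytes = 1000 }
    @{ Author = 'Stephen King'; Title = 'IT (2893)';             Bytes = 2000 }
    @{ Author = 'A. Ribaudo';   Title = 'Guidare sicuri (3859)'; Bytes = 2000 }
    @{ Author = 'A. Ribaudo';   Title = 'Guidare sicuri (39068)'; Bytes = 1000 }
)
foreach ($item in $layout) {
    $dir = Join-Path (Join-Path $root $item.Author) $item.Title
    New-Item -ItemType Directory -Path $dir -Force | Out-Null
    # One zero-filled file of the requested size per folder
    [System.IO.File]::WriteAllBytes((Join-Path $dir 'book.bin'), [byte[]]::new($item.Bytes))
}

# Same idea as the pipeline above, run for real (no -WhatIf) against the sandbox only
Get-ChildItem -LiteralPath $root -Directory | Get-ChildItem -Directory |
    Group-Object -Property { $_.Parent.FullName }, { ($_.Name -split ' \(')[0] } |
    ForEach-Object {
        $_.Group |
            Sort-Object -Property { (Get-ChildItem -LiteralPath $_.FullName -Recurse -File | Measure-Object -Property Length -Sum).Sum } -Descending |
            Select-Object -Skip 1 |
            Remove-Item -Recurse
    }

# List what is left, then clean up the sandbox
$survivors = Get-ChildItem -LiteralPath $root -Directory | Get-ChildItem -Directory |
    Select-Object -ExpandProperty Name
$survivors
Remove-Item -LiteralPath $root -Recurse
```

With these invented sizes, the folders left over should be 22_11_63 (19989), IT (2893) and Guidare sicuri (3859), which matches the expected output in the question.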
