process a hash table in powershell

Time:03-17

I want a PowerShell script that tells me the age of the youngest version of each file in a particular directory. Filenames are constructed from a basename and a timestamp,

e.g. StaffTS2016_20220308010543.7z

StaffTS2016 is the basename I am interested in. The timestamp can be discarded, since I use the LastAccessTime from the filesystem. There can be multiple occurrences of files with the same basename, and for each basename in the directory I want to know the age of the newest one. This SO article got me part of the way there:

Get the age of a file using powershell

$strDays = @{l="Days";e={((Get-Date) - $_.LastAccessTime).Days}}
$path = "D:\HVBackup\VP20"
$output1 = Get-ChildItem $path -File | Select-Object Name, $strDays | Out-String
Name                          Days
StaffTS2016_20220308010543.7z    8
StaffTS2016_20220314231747.7z    1
vc2012R2DC1_20220308022625.7z    4
vc2012R2DC1_20220314230635.7z    2
vcApache_20220308023655.7z       5
vcApache_20220314235915.7z       1
vcEX1a2012_20220307230943.7z     8
vcSQL2016_20220308014925.7z      8
vcSQL2016_20220314223040.7z      1

I cannot figure out how to process the resulting hash table, either directly via extending the pipeline, or by post-processing, to derive the information I want.

As you can see, my hash table has two fields, Name and Days. From $_.Name I want to extract BaseName as $_.Name.split('_')[0]. Then I want to add (BaseName, Days) to my output table if BaseName is not yet in it, and otherwise update Days when the new value is lower. I could do this in other languages, but not in PowerShell.

The desired output is:

Name        Days
StaffTS2016    1
vc2012R2DC1    2
vcApache       1
vcEX1a2012     8
vcSQL2016      1

This would tell me at a glance that there has been a problem backing up vcEX1a2012.

CodePudding user response:

As the name implies, Out-String turns the output from Get-ChildItem ... | Select ... into a string (not a hashtable), which undoubtedly makes the data cumbersome to work with.
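A minimal sketch of the difference, using two in-memory objects as stand-ins for the Get-ChildItem output (the variable names $rows, $asText and $asObjects are made up for illustration; the values are copied from the listing in the question):

```powershell
# Hypothetical sample objects standing in for the Select-Object output
$rows = @(
    [pscustomobject]@{ Name = 'StaffTS2016_20220308010543.7z'; Days = 8 }
    [pscustomobject]@{ Name = 'StaffTS2016_20220314231747.7z'; Days = 1 }
)

# With Out-String the whole table collapses into one formatted string
$asText = $rows | Out-String

# Without it, the pipeline still carries objects with real properties
$asObjects = $rows | Select-Object Name, Days

$asText.GetType().Name   # String
$asObjects[0].Days       # 8
```

Once the data is a string, properties such as Days can only be recovered by parsing text, so it is usually easier to leave Out-String off until the very end, if it is needed at all.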

Instead, start by grouping the files by the first part of the name, then sort the files in each group by the timestamp and keep only the newest:

# prepare calculated property selectors for both the name and the days for later
$selectors = @(
  @{Label="Days"; Expression = {((Get-Date) - $_.LastAccessTime).Days}}
  @{Label="Name"; Expression = { $_.BaseName.Split('_')[0] }}
)

# discover the files
$path = "D:\HVBackup\VP20"
$files = Get-ChildItem $path -File

# group the files by the first part of their base name
$files |Group-Object { $_.BaseName.Split('_')[0] } |ForEach-Object {
  # then pick only the newest from each, based on the timestamp part of the name
  $_.Group |Sort-Object { $_.BaseName.Split('_')[-1] } -Descending |Select-Object -First 1 -Property $selectors
}
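To see what Group-Object hands to the ForEach-Object block, here is a small sketch that runs on in-memory sample objects instead of real files. The rows are copied from the listing in the question, and $rows and $groups are illustrative names only:

```powershell
# Sample rows taken from the question's listing (stand-ins for real files)
$rows = @(
    [pscustomobject]@{ Name = 'StaffTS2016_20220308010543.7z'; Days = 8 }
    [pscustomobject]@{ Name = 'StaffTS2016_20220314231747.7z'; Days = 1 }
    [pscustomobject]@{ Name = 'vcEX1a2012_20220307230943.7z';  Days = 8 }
)

# Group by the part of the name before the first underscore
$groups = $rows | Group-Object { $_.Name.Split('_')[0] }

# Each group exposes Name (the key), Count, and Group (the member objects)
$groups | ForEach-Object {
    '{0}: {1} file(s)' -f $_.Name, $_.Count
}
```

Because every group carries its member objects in the Group property, any per-group logic (sorting, picking the first item, taking a minimum) can be applied inside the ForEach-Object block.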

CodePudding user response:

You could change the last command to something like this:

$output1 = Get-ChildItem $path -File | Select-Object Name, $strDays |
  ForEach-Object { $_.Name = $_.Name.split('_')[0]; $_ } | 
  Group-Object -Property Name | 
  ForEach-Object { $_.Group | Sort-Object -Property Days | Select-Object -First 1 }

The first ForEach-Object replaces each Name with $_.Name.split('_')[0], as you specified, so that Group-Object can collect all files that share a base name. Within each group, sorting by Days and taking the first entry keeps the youngest file.
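For comparison, the add-or-update logic described in the question can also be written with an actual hashtable. This is only a sketch on in-memory sample rows copied from the question's listing; $rows, $youngest and $result are illustrative names, and in a real script the rows would come from Get-ChildItem and Select-Object:

```powershell
# Sample rows taken from the question's listing (stand-ins for real files)
$rows = @(
    [pscustomobject]@{ Name = 'StaffTS2016_20220308010543.7z'; Days = 8 }
    [pscustomobject]@{ Name = 'StaffTS2016_20220314231747.7z'; Days = 1 }
    [pscustomobject]@{ Name = 'vcEX1a2012_20220307230943.7z';  Days = 8 }
)

# Key: base name, value: lowest Days seen so far for that base name
$youngest = @{}
foreach ($row in $rows) {
    $base = $row.Name.Split('_')[0]
    # Add the entry if it is new, or overwrite it when this file is younger
    if (-not $youngest.ContainsKey($base) -or $row.Days -lt $youngest[$base]) {
        $youngest[$base] = $row.Days
    }
}

# Turn the hashtable back into objects so it displays as a Name/Days table
$result = $youngest.GetEnumerator() | Sort-Object Key | ForEach-Object {
    [pscustomobject]@{ Name = $_.Key; Days = $_.Value }
}
$result
```

The Group-Object pipeline is shorter, but the explicit loop mirrors the step-by-step approach from other languages and makes it easy to add further conditions per file.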
