Pass files with the same prefix into an array in PowerShell

Time:09-04

There are files in a directory that partly start the same way:

  • C001_200129.pdf
  • C001_29292.pdf
  • C001_ABCDF.pdf
  • C041_29292.pdf
  • C041_110101.pdf
  • P121_AAAA.pdf
  • P121_CCCC.pdf
  • P121_DDDD.pdf
  • P121_AKAKA.pdf

The files that share a prefix are to be merged. I do not know the prefixes beforehand; I only know that the first 4 characters of the file names in each set are the same. The merge command would look something like this:

pdftk.exe $a*.pdf CAT OUTPUT \Merged\$a-merged.pdf

How do I loop through the directory to find all files that share a prefix, pass that prefix into the $a variable, merge those files, and then move directly on to the next set of files to merge?

The goal is to be able to create:

  • C001_Merged.pdf
  • C041_Merged.pdf
  • P121_Merged.pdf

Answer:

Santiago Squarzon has provided the basic steps in a comment; let me flesh them out:

Get-ChildItem -Filter *.pdf |                 # Get all input files.
  Group-Object { ($_.Name -split '_')[0] } |  # Group them by shared prefix.
  ForEach-Object { # Process each group.
    pdftk.exe $_.Group.FullName CAT OUTPUT \Merged\$($_.Name)_merged.pdf
  }

  • Get-ChildItem gets all *.pdf files in the current directory (add a -LiteralPath argument to target a different directory), and Group-Object groups them by their shared prefix (the start of the name up to, but not including, the first _).

  • The .Name property of each resulting [Microsoft.PowerShell.Commands.GroupInfo] instance contains that shared prefix, and the .Group property contains all file-info objects that make up the group.

  • $_.Group.FullName uses member-access enumeration to get the full paths (.FullName) of all file-info objects in the group; passing the resulting array of file paths to an external program such as pdftk.exe results in the paths getting passed as individual arguments.
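To see what the grouping step produces without invoking pdftk.exe, here is a minimal dry-run sketch. It rests on a few assumptions that are not part of the answer above: the file names from the question are wrapped in stand-in objects rather than read from disk, C:\pdfs is a hypothetical directory, and the grouping key is the first 4 characters of the name (as the question describes) instead of the text before the first _. For the sample names both keys yield the same groups.

```powershell
# Sample file names from the question, used instead of real files so that
# the sketch runs without pdftk or any PDFs on disk (an assumption).
$names = 'C001_200129.pdf', 'C001_29292.pdf', 'C001_ABCDF.pdf',
         'C041_29292.pdf', 'C041_110101.pdf',
         'P121_AAAA.pdf', 'P121_CCCC.pdf', 'P121_DDDD.pdf', 'P121_AKAKA.pdf'

# Stand-ins for file-info objects, carrying only the two properties used below.
# 'C:\pdfs' is a hypothetical directory.
$files = $names | ForEach-Object {
  [pscustomobject]@{ Name = $_; FullName = "C:\pdfs\$_" }
}

# Group by the first 4 characters of the name (variant of the -split '_' key).
$groups = $files | Group-Object { $_.Name.Substring(0, 4) }

foreach ($g in $groups) {
  # $g.Name is the shared prefix; $g.Group holds the group's members.
  # $g.Group.FullName uses member-access enumeration to collect all paths.
  $inputs = $g.Group.FullName
  $output = "\Merged\$($g.Name)_merged.pdf"

  # Report what would be merged instead of calling pdftk.exe.
  # @(...) guards against a single-member group yielding a scalar.
  '{0}: {1} file(s) -> {2}' -f $g.Name, @($inputs).Count, $output
}
```

With the sample names this should report three groups (C001 with 3 files, C041 with 2, P121 with 4). Once the groups look right, the string-formatting line can be swapped for the actual pdftk.exe call from the answer, with Get-ChildItem supplying real file-info objects.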
