Re-assembling split file names with Powershell

Time:12-01

I'm having trouble re-assembling certain filenames (and discarding the rest) from a text file. The filenames are split up (usually on three lines) and there is always a blank line after each filename. I only want to keep filenames that begin with OPEN or FOUR. An example is:

OPEN.492820.EXTR
A.STANDARD.38383
333

FOUR.383838.282.
STAND.848484.NOR
MAL.3939

CLOSE.3480384.ST
ANDARD.39393939.
838383

The output I'd like would be:

OPEN.492820.EXTRA.STANDARD.38383333
FOUR.383838.282.STAND.848484.NORMAL.3939

Thanks for any suggestions!

CodePudding user response:

Read the file one line at a time and keep concatenating the lines until you encounter a blank line. At that point, output the concatenated string if it starts with OPEN or FOUR, reset it, and repeat until you reach the end of the file:

# this variable will keep track of the partial file names
$fileName = ''

# use a switch to read the file and process each line
switch -Regex -File ('path\to\file.txt') {
  # when we see a blank line...
  '^\s*$' {
    # ... we output the assembled name if it starts with OPEN or FOUR
    if($fileName -cmatch '^(OPEN|FOUR)'){ $fileName }
    # and then start over
    $fileName = ''
  }

  default {
    # must be a non-blank line, concatenate it to the previous ones
    $fileName += $_
  }
}

# remember to check and output the last one
if($fileName -cmatch '^(OPEN|FOUR)'){
  $fileName
}
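
To try the line-by-line approach without a file on disk, the same logic can be wrapped in a pipeline function. This is only a sketch: `Join-SplitFileName` is a hypothetical name that does not appear in the answer, and the sample lines are the ones from the question.

```powershell
# Hypothetical helper: accumulates piped lines and emits each assembled
# name at a blank line, but only if it starts with OPEN or FOUR.
function Join-SplitFileName {
    param(
        [Parameter(ValueFromPipeline)]
        [AllowEmptyString()]
        [string] $Line
    )
    begin { $fileName = '' }
    process {
        if ($Line -match '^\s*$') {
            # blank line: emit the name if it qualifies, then reset
            if ($fileName -cmatch '^(OPEN|FOUR)') { $fileName }
            $fileName = ''
        }
        else {
            # non-blank line: append it to the partial name
            $fileName += $Line
        }
    }
    end {
        # the input may not end with a blank line, so check once more
        if ($fileName -cmatch '^(OPEN|FOUR)') { $fileName }
    }
}

# Sample input from the question, one array element per line
$sample = @(
    'OPEN.492820.EXTR', 'A.STANDARD.38383', '333', '',
    'FOUR.383838.282.', 'STAND.848484.NOR', 'MAL.3939', '',
    'CLOSE.3480384.ST', 'ANDARD.39393939.', '838383'
)

$result = $sample | Join-SplitFileName
$result
# OPEN.492820.EXTRA.STANDARD.38383333
# FOUR.383838.282.STAND.848484.NORMAL.3939
```

With a real file, `Get-Content 'path\to\file.txt' | Join-SplitFileName` should behave the same way, since `Get-Content` emits one string per line.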

CodePudding user response:

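The following worked for me; you can give it a try. It reads the whole file as a single string, matches each block that starts with OPEN or FOUR up to the next blank line (or the end of the file), and then joins the lines of each match.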

$source = 'fullpath/to/inputfile.txt'
$destination = 'fullpath/to/resultfile.txt'

[regex]::Matches(
    (Get-Content $source -Raw),
    '(?msi)^(OPEN|FOUR).*?([\r\n\s]+$|\z)'
).Value.ForEach({ -join($_ -split '\r?\n') }) |
Out-File $destination

For testing:

$txt = @'
OPEN.492820.EXTR
A.STANDARD.38383
333

FOUR.383838.282.
STAND.848484.NOR
MAL.3939

CLOSE.3480384.ST
ANDARD.39393939.
838383

OPEN.492820.EXTR
A.EXAMPLE123

FOUR.383838.282.
STAND.848484.123
ZXC
'@

[regex]::Matches($txt, '(?msi)^(OPEN|FOUR).*?([\r\n\s]+$|\z)').Value.ForEach({
    -join($_ -split '\r?\n')
})

Output:

OPEN.492820.EXTRA.STANDARD.38383333
FOUR.383838.282.STAND.848484.NORMAL.3939
OPEN.492820.EXTRA.EXAMPLE123
FOUR.383838.282.STAND.848484.123ZXC
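
Another way to express the same idea, sketched here under the assumption that filenames never contain whitespace: split the text into blocks at blank lines, strip the line breaks inside each block, and keep only the blocks that start with OPEN or FOUR. This swaps the single regex for `-split`, `-replace`, and `Where-Object`. The variable names are illustrative, and `-cmatch` makes the prefix check case-sensitive, unlike the `(?i)` option used above.

```powershell
# Sample input from the question, joined with "`n" (assumed line ending)
$text = @(
    'OPEN.492820.EXTR', 'A.STANDARD.38383', '333', '',
    'FOUR.383838.282.', 'STAND.848484.NOR', 'MAL.3939', '',
    'CLOSE.3480384.ST', 'ANDARD.39393939.', '838383'
) -join "`n"

$names = $text -split '\r?\n\s*\r?\n' |          # one block per blank-line gap
    ForEach-Object { $_ -replace '\s+', '' } |   # drop line breaks inside a block
    Where-Object { $_ -cmatch '^(OPEN|FOUR)' }   # keep only the wanted prefixes

$names
# OPEN.492820.EXTRA.STANDARD.38383333
# FOUR.383838.282.STAND.848484.NORMAL.3939
```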