Say I have a filename string, something like:
test_ABC_19000101_010101.987.txt
where "test" could be any combination of whitespace, letters, digits, etc. I want to extract the 19000101_010101 part (date and time) with PowerShell. Currently I apply -split "_ABC_" to the name, assign the result to a variable, and take the second element of the array. I then split that string several more times. Is there a way to accomplish this in one go?
P.S. "_ABC_" is constant and occurs unchanged in every filename.
CodePudding user response:
A more concise - albeit perhaps more obscure - alternative to Santiago Squarzon's helpful answer:
# Construct a regex that consumes the entire file name while
# using capture groups for the parts of interest.
$re = '.+_ABC_(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2})\.(\d{3})\..+'
[datetime] (
# In the replacement string, use $1, $2, ... to refer to what the
# first, second, ... capture group captured.
'test_ABC_19000101_010101.987.txt' -replace $re, '$1-$2-$3T$4:$5:$6.$7'
)
Output:
Monday, January 1, 1900 1:01:01 AM
The -replace operation results in the string '1900-01-01T01:01:01.987', which is a (culture-invariant) format that you can use as-is with a [datetime] cast.
Note that with a Get-ChildItem call as input you could slightly simplify the regex by providing $_.BaseName rather than $_.Name as the -replace LHS, which obviates the need to also match the extension (\..+) in the regex.
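A minimal sketch of that BaseName variant, wrapped in a helper so it can be tried without any files on disk. The function name ConvertTo-Timestamp, the ^ and $ anchors, and the early return for non-matching names are assumptions added for illustration, not part of the answer above.

```powershell
# Hypothetical helper (the name is an assumption): turns a base name such as
# 'test_ABC_19000101_010101.987' into a [datetime].
function ConvertTo-Timestamp {
    param([Parameter(Mandatory)][string] $BaseName)

    # Same capture groups as above, minus the extension part; anchored so the
    # whole base name must be consumed.
    $re = '^.+_ABC_(\d{4})(\d{2})(\d{2})_(\d{2})(\d{2})(\d{2})\.(\d{3})$'

    # -replace returns the input unchanged when nothing matches, and the cast
    # would then fail, so bail out early (assumed behavior: return $null).
    if ($BaseName -notmatch $re) { return $null }

    [datetime] ($BaseName -replace $re, '$1-$2-$3T$4:$5:$6.$7')
}

ConvertTo-Timestamp 'test_ABC_19000101_010101.987'
```

With real files, something like Get-ChildItem -File | ForEach-Object { ConvertTo-Timestamp $_.BaseName } should work, assuming the names follow the pattern from the question.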
CodePudding user response:
This regex may be overkill, but it should work as long as _ABC_ is constant, a _ separates the date from the time, and a . separates the time from the milliseconds:
$re = [regex]'(?<=_ABC_)(?<date>\d*)_(?<time>\d*)\.(?<millisec>\d*)(?=\.)'
@'
test_ABC_19000101_010101.987.txt
t' az@ 0est_ABC_20000101_090101.123.txt
tes8as712t_ABC_21000101_080101.456.txt
te098d $st_ABC_22000101_070101.789.txt
[test]_ABC_23000101_060101.101.txt
t?\est_ABC_24000101_050101.112.txt
'@ -split '\r?\n' | ForEach-Object {
    $groups = $re.Match($_).Groups
    $date = $groups['date']
    $time = $groups['time']
    $msec = $groups['millisec']
    [datetime]::ParseExact(
        "$date $time $msec",
        "yyyyMMdd HHmmss fff",
        [cultureinfo]::InvariantCulture
    )
}
See https://regex101.com/r/8oSpqf/1 for details.
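If some names might not contain a timestamp, the groups come back empty and ParseExact throws. Below is a sketch of one way to guard against that, assuming such names should simply be skipped with a warning. The fixed digit counts (\d{8}, \d{6}, \d{3}) and the sample names are assumptions, not part of the answer above.

```powershell
# Stricter variant of the regex: fixed digit counts instead of \d* (assumption).
$re = [regex] '(?<=_ABC_)(?<date>\d{8})_(?<time>\d{6})\.(?<millisec>\d{3})(?=\.)'

# Sample input; the second name deliberately has no timestamp.
$names = 'test_ABC_19000101_010101.987.txt', 'no_timestamp_here.txt'

$parsed = foreach ($name in $names) {
    $m = $re.Match($name)
    if (-not $m.Success) {
        # Skip names that do not match rather than letting ParseExact throw.
        Write-Warning "Skipping '$name': no timestamp found."
        continue
    }
    $text = '{0} {1} {2}' -f $m.Groups['date'].Value,
                             $m.Groups['time'].Value,
                             $m.Groups['millisec'].Value
    [datetime]::ParseExact($text, 'yyyyMMdd HHmmss fff', [cultureinfo]::InvariantCulture)
}

$parsed
```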
CodePudding user response:
If the filename will never contain more than one sequence that looks like the timestamp (8 digits, _, 6 digits), you can match on that pattern of digits.
PS C:\> 'test_ABC_19000101_010101.987.txt' -match '^.*ABC_(\d{8}_\d{6})\..*'
True
PS C:\> $Matches
Name                           Value
----                           -----
1                              19000101_010101
0                              test_ABC_19000101_010101.987.txt
PS C:\> $Matches[1]
19000101_010101
In practice, you would use the filename in place of the literal string.
If you want to get a [System.DateTime] from it:
PS C:\> [datetime]::ParseExact($Matches[1], 'yyyyMMdd_HHmmss', $null)
Monday, January 1, 1900 01:01:01
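For the "one go" the question asks about, the same pattern can also be applied as a single expression with [regex]::Match, which avoids relying on the automatic $Matches variable (it is not reset when a -match fails). This is only a sketch: the variable names and the shortened, unanchored pattern are assumptions.

```powershell
$name = 'test_ABC_19000101_010101.987.txt'

# Groups[1].Value is an empty string when the pattern does not match.
$stamp = [regex]::Match($name, '_ABC_(\d{8}_\d{6})\.').Groups[1].Value
$stamp

# Convert only if something was actually captured.
if ($stamp) {
    $when = [datetime]::ParseExact($stamp, 'yyyyMMdd_HHmmss', [cultureinfo]::InvariantCulture)
    $when
}
```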