I need to parse a structured file (FIX protocol 4.4) in PowerShell. The structure is like this:
20220606-21:10:21.930 : 8=FIX.4.49=209 35=W34=35 49=FIXDIRECT.FT 52=20220606-21:10:21.925 56=MM_EUR_FIX_QS 55=US30 262=96 268=2 269=0 270=32921.6 271=2000000 299=16ynjsz-16ynjsz5qCaA 269=1 270=32931.4 271=2000000 299=16ynjsz-16ynjsz5qCaA 10=048
I need to pick only specific values by tag. The first value I need is the timestamp, which runs up to the colon and has no tag number; after that I need the values that follow specific tag numbers, for example tags 55, 270 and 271 (270 and 271 occur multiple times here).
I am able to parse the line with a simple position-based method, using " " and "=" as delimiters:
$contents = Get-Content FIX.log
foreach($line in $contents) {
    $s = $line.split("= ")
    write-host $s[0] $s[17] $s[25] $s[27] $s[33] $s[35]
}
However, I would prefer to pinpoint the values by tag number, as some lines in the file do not have the same content.
The result should look something like this:
20220606-21:10:21.930 US30 32921.6 2000000 32931.4 2000000
CodePudding user response:
Combine -split, -match, and -replace as follows:
# Sample line that simulates your Get-Content call.
$content = '20220606-21:10:21.930 : 8=FIX.4.49=209 35=W34=35 49=FIXDIRECT.FT 52=20220606-21:10:21.925 56=MM_EUR_FIX_QS 55=US30 262=96 268=2 269=0 270=32921.6 271=2000000 299=16ynjsz-16ynjsz5qCaA 269=1 270=32931.4 271=2000000 299=16ynjsz-16ynjsz5qCaA 10=048'
foreach ($line in $content) {
    # Split into fields, by " " or " : "
    $first, [array] $rest = $line -split ' (?:: )?'
    # Extract the tokens of interest:
    #  * Use the first one as-is
    #  * Among the remaining ones, use -match to filter in only
    #    those with the tag numbers of interest, then use -replace
    #    on the results to strip the tag number plus the separator ("=")
    #    from each.
    $tokensOfInterest =
        , $first + (($rest -match '^(?:55|270|271)=') -replace '^.+=')
    # Output the resulting array as a single-line, space-delimited
    # list, which is how Write-Host stringifies arrays.
    # Note: Do NOT use Write-Host to output *data*.
    Write-Host $tokensOfInterest
}
This yields the sample output in your question, namely:
20220606-21:10:21.930 US30 32921.6 2000000 32931.4 2000000
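Since Write-Host only writes to the display, here is a minimal sketch of the same -split / -match / -replace combination that sends the result to the pipeline instead, so it can be captured or redirected. The variable names $values and $result are illustrative, and the stricter '^\d+=' pattern assumes that every tag is purely numeric.

```powershell
# Sample line copied from the question; in practice it would come from Get-Content.
$line = '20220606-21:10:21.930 : 8=FIX.4.49=209 35=W34=35 49=FIXDIRECT.FT 52=20220606-21:10:21.925 56=MM_EUR_FIX_QS 55=US30 262=96 268=2 269=0 270=32921.6 271=2000000 299=16ynjsz-16ynjsz5qCaA 269=1 270=32931.4 271=2000000 299=16ynjsz-16ynjsz5qCaA 10=048'

# Split into the timestamp and the remaining tag=value fields.
$first, [array] $rest = $line -split ' (?:: )?'

# Keep only the fields for tags 55, 270 and 271, then strip the "tag=" prefix.
# Assumption: tags are numeric, so '^\d+=' removes exactly the tag and separator.
$values = @(($rest -match '^(?:55|270|271)=') -replace '^\d+=')

# Join everything into one space-delimited string and emit it to the pipeline
# (no Write-Host), so callers can assign it or redirect it to a file.
$result = (, $first + $values) -join ' '
$result
```

Inside a loop over Get-Content, the emitted strings could then be collected with an assignment such as $lines = foreach (...) { ... } or piped to Set-Content.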
CodePudding user response:
Here is another take on the problem, using the .NET Regex class.
$contents = Get-Content FIX.log
# Tags to search for, separated by RegEx alternation operator
$tagsPattern = '55|270|271'
foreach($line in $contents) {
    # Extract the datetime field
    $dateTime = [regex]::Match( $line, '^\d{8}-\d{2}:\d{2}:\d{2}\.\d{3}' ).Value
    # Extract the desired tag values
    $tagValues = [regex]::Matches( $line, "(?<= (?:$tagsPattern)=)[^ ]+" ).Value
    # Output everything
    Write-Host $dateTime $tagValues
}
- The [regex]::Match() method matches the first instance of the given pattern and returns a single Match object, whose Value property contains the matched value.
- The [regex]::Matches() method finds all matches of the pattern. It returns a collection of Match objects. With the aid of PowerShell's convenient member access enumeration feature, we directly create an array of all Value properties.
- Explanation and demos of the RegEx patterns at regex101.com:
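Separately from those demos, the difference between the two methods described above can be seen in a small sketch. It uses a shortened, hypothetical line in the same shape as the sample from the question; the variable names are illustrative only.

```powershell
# Shortened, hypothetical line in the same shape as the question's sample.
$line = '20220606-21:10:21.930 : 55=US30 269=0 270=32921.6 271=2000000 269=1 270=32931.4 271=2000000 10=048'

# Match() returns a single Match object describing the first hit only.
$firstPrice = [regex]::Match($line, '(?<= 270=)[^ ]+')
$firstPrice.Value

# Matches() returns a MatchCollection containing every hit.
$allPrices = [regex]::Matches($line, '(?<= 270=)[^ ]+')
$allPrices.Count

# Member access enumeration: asking the collection for .Value yields the
# Value property of each element. @() guarantees an array even for one hit.
$priceValues = @($allPrices.Value)
$priceValues -join ', '
```

The lookbehind (?<= 270=) requires a space before the tag number, so a tag such as 1270 would not be picked up by mistake.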