I'm trying to re-use a regex I'm using to parse another file. This file has some commented rows, and I need to make sure the commented rows aren't captured.
This is the string being parsed:
m_dwErrorCode = 0;
m_dwOutError = HOP_OK;
m_OutSeverity = CCC_INFORMATION;
_stprintf(m_OutDevStr, _T(""));
if (0x00000000 & value)
{
m_dwErrorCode = 0x0;
/* Ready state. */
// m_StatusStr = " Ready(eSTATUS_READY)";
}
if (0x00000001 & value)
{
m_bProceeding = true;
/* proceed */
//m_StatusStr = " Proceeding(eSTATUS)";
}
if (0x00002000 & value)
{
m_bEmpty = true;
// We only want to check this error only at certain times.
if (m_bCheckEmpty)
{
if ((m_Attributes.dwMediaID == CUBE1) ||
(m_Attributes.dwMediaID == CUBE2) ||
/*(m_Attributes.dwMediaID == SCALLOPED) ||*/ // Added
(m_Attributes.dwMediaID == FOLDED))
{
m_dwErrorCode = 0x00002000;
_stprintf(m_OutDevStr, _T("0x1000 - %s(MP Tray Empty)"), errorStr);
m_dwOutError = HOP_TRAY_EMPTY;
m_OutSeverity = CCC_INFORMATION;
}
}
//HOP_TRAY_EMPTY
///* MSI empty. */
//m_bTrayEmpty = true;
//// m_StatusStr = " MSI empty(eSTATUS_MSI_EMPTY)";
}
if (0x00004000 & value)
{
/* empty. */
m_dwErrorCode = 0x4000;
_stprintf(m_OutDevStr, _T("0x4000 - %s(Tray 1 empty)"), errorStr);
m_dwOutError = HOP_TRAY_01_EMPTY;
m_OutSeverity = CCC_INFORMATION;
}
if (0x00008000 & value)
{
/* Tray 2 empty. */
m_dwErrorCode = 0x8000;
_stprintf(m_OutDevStr, _T("0x8000 - %s(Tray 2 empty)"), errorStr);
m_dwOutError = HOP_TRAY_02_EMPTY;
m_OutSeverity = CCC_INFORMATION;
}
if (0x00010000 & value)
{
/* Tray 3 empty. */
m_dwErrorCode = 0x10000;
_stprintf(m_OutDevStr, _T("0x10000 - %s(Tray 3 empty)"), errorStr);
m_dwOutError = HOP_TRAY_03_EMPTY;
m_OutSeverity = CCC_INFORMATION;
}
This is the code that gets it mostly right, except it captures the commented rows:
Function Get-CaseContents3240 {
    [CmdletBinding()]
    Param ([string]$parsedCaseMethod)
    Process
    {
        # construct regex
        $fullregex = [regex]"_stprintf[\s\S]*?_T\D*", # Start of error message, capture until digits
            "(?<sdkErr>[x\d]+)", # Error number, digits only with x
            "\D[\s\S]*?", # match anything, non-greedy
            "(?<sdkDesc>\((.+?)\))", # Error description, anything within parentheses, non-greedy
            "([\s\S]*?OutError\s*=(?<sdkOutErr>\s[a-zA-Z_0-9]*))", # Capture OutErr string
            "[\s\S]*?", # match anything, non-greedy
            "(?<sdkSeverity>OutSeverity\s*=\s[a-zA-Z_]*)", # Capture severity string and parse out part after underscore later
            '' -join ''
        # run the regex on the method contents
        $Values = $parsedCaseMethod | Select-String -Pattern $fullregex -AllMatches
        # Convert Name-Value pairs to object properties
        $result = foreach ($match in $Values.Matches) {
            [PSCustomObject][ordered]@{
                sdkErr = $($match.Groups['sdkErr'])
                sdkDesc = $($match.Groups['sdkDesc'])
                sdkOutErr = $($match.Groups['sdkOutErr'])
                sdkSeverity = ($match.Groups['sdkSeverity'] -split '_')[-1] #take part after _
            }
        }
        #add in content that doesn't fall in pattern###################
        #Write-Host "result:" $result -ForegroundColor Green
        #$result;
        return $result
    }#End of Process
}#End of Function
This is what the results look like:
[Object[17]]
[0]:@{sdkErr=0x; sdkDesc=(tmpStr);sdkOutErr=HOP_OK;sdkSeverity=INFORMATION}
...
As you can see, the first one is picking up the commented out lines.
I tried doing this with the first regex line to fix it, but when I do that, the result set is empty:
^[\s]+_stprintf[\s\S]*?_T\D*
This is the expected results:
sdkErr=0x1000 ###missed this before
sdkDesc=MP Tray Empty
sdkOutErr=HOP_TRAY_EMPTY
sdkSeverity=INFORMATION
sdkErr=0x4000
sdkDesc=Tray 1 empty
sdkOutErr=HOP_TRAY_01_EMPTY
sdkSeverity=INFORMATION
sdkErr=0x8000
sdkDesc=Tray 2 empty
sdkOutErr=HOP_TRAY_02_EMPTY
sdkSeverity=INFORMATION
sdkErr=0x10000
sdkDesc=Tray 3 empty
sdkOutErr=HOP_TRAY_03_EMPTY
sdkSeverity=INFORMATION
...
This is with PowerShell 5.1 and VS Code.
Update:
I'd like to keep the same data structure returned, just so everything is the same after the Function as what I have for other devices.
User response:
It might be more maintainable to break it down into individual "if" blocks with one regex, and then parse each block in a second pass...
$code = Get-Content "myfile.c" -Raw;
# split into separate "if" blocks.
# (the funky "(?=...)" preserves the delimiter)
$blocks = $code -split "(?=if \(.* \& value\))";
# e.g.
# if (0x00004000 & value)
# {
# /* empty. */
# m_dwErrorCode = 0x4000;
# _stprintf(m_OutDevStr, _T("0x4000 - %s(Tray 1 empty)"), errorStr);
# m_dwOutError = HOP_TRAY_01_EMPTY;
# m_OutSeverity = CCC_INFORMATION;
# }
$pattern = `
    "_stprintf[\s\S]*?_T\D*" +
    "(?<sdkErr>[x\d]+)" +
    "\D[\s\S]*?" +
    "(?<sdkDesc>\((.+?)\))" +
    "[\s\S]*?" +
    "(OutError\s*=\s*(?<sdkOutErr>[a-zA-Z_0-9]*))" +
    "[\s\S]*?" +
    "(?<sdkSeverity>OutSeverity\s*=\s[a-zA-Z_]*)";
# note - skip first block as it's the preamble before the first "if"
$blocks `
| select-object -skip 1 `
| select-string -pattern $pattern `
| foreach-object {
$match = $_.Matches[0];
[PSCustomObject] [ordered] @{
"sdkErr" = $match.Groups['sdkErr']
"sdkDesc" = $match.Groups['sdkDesc']
"sdkOutErr" = $match.Groups['sdkOutErr']
"sdkSeverity" = ($match.Groups['sdkSeverity'] -split '_')[-1]
}
};
Output is:
sdkErr  sdkDesc         sdkOutErr         sdkSeverity
------  -------         ---------         -----------
0x1000  (MP Tray Empty) HOP_TRAY_EMPTY    INFORMATION
0x4000  (Tray 1 empty)  HOP_TRAY_01_EMPTY INFORMATION
0x8000  (Tray 2 empty)  HOP_TRAY_02_EMPTY INFORMATION
0x10000 (Tray 3 empty)  HOP_TRAY_03_EMPTY INFORMATION
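To illustrate why the "(?=...)" lookahead matters in the -split step, here is a minimal sketch on a made-up string. The names $sample, $withoutLookahead and $withLookahead are illustrative only and do not appear in the code above. A plain delimiter pattern is consumed by the split, while the zero-width lookahead splits just before each "if (...)" and leaves it at the start of its block.

```powershell
# Hypothetical sample: a preamble followed by two "if" blocks.
$sample = "preamble;`nif (0x1 & value) { a }`nif (0x2 & value) { b }"

# Plain delimiter: the matched "if (...)" text is removed by the split.
$withoutLookahead = $sample -split 'if \(.* & value\)'

# Zero-width lookahead: split *before* each "if (...)" and keep it in the block.
$withLookahead = $sample -split '(?=if \(.* & value\))'

# Print each block in brackets so the boundaries are visible.
$withLookahead | ForEach-Object { "[{0}]" -f $_.Trim() }
```

In both cases the first element is the preamble, which is why the pipeline above skips it before matching.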
User response:
This is not a robust solution: it works for the code currently posted, but I can't guarantee it will work on the actual code you test it on.
The regex expects a single string, so when testing this against your file, make sure you use the -Raw switch with Get-Content.
See https://regex101.com/r/l0RLPw/1 for details.
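As a quick illustration of the -Raw point, the sketch below writes a throwaway three-line file and reads it both ways. The file contents and the variable names ($tempFile, $lines, $raw, $spanning) are made up for the example. Without -Raw, Get-Content returns one string per line, so a pattern that spans a line break has nothing to match against.

```powershell
# Write a hypothetical three-line file to a temporary path.
$tempFile = [System.IO.Path]::GetTempFileName()
Set-Content -Path $tempFile -Value 'line one;', 'line two;', 'line three;'

$lines = Get-Content -Path $tempFile        # array of strings, one per line
$raw   = Get-Content -Path $tempFile -Raw   # one multi-line string

# A pattern that crosses a line break can only match the single string.
$spanning = [regex]'one;\s+line two'
$matchInLines = @($lines | Where-Object { $spanning.IsMatch($_) }).Count
$matchInRaw   = $spanning.IsMatch($raw)

Remove-Item -Path $tempFile
```

Here $matchInLines counts the individual lines that match and $matchInRaw tells you whether the whole-file string matches.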
$re = [regex]@'
(?xsi)
_stprintf\([\w_,\s]+\("(?<code>\dx\d+)\s*
-.*?\((?<description>[\w\s]+)\)"\).*?;\s*
m_dwOutError\s*=\s*(?<error>[\w_]+);\s*
m_OutSeverity\s*=\s*\w*?_(?<severity>\w+)
'@
$content = Get-Content path/to/content.ext -Raw
foreach($match in $re.Matches($content)) {
[pscustomobject]@{
sdkErr = $match.Groups['code']
sdkDesc = $match.Groups['description']
sdkOutErr = $match.Groups['error']
sdkSeverity = $match.Groups['severity']
}
}
Result looks like this for me:
sdkErr  sdkDesc       sdkOutErr         sdkSeverity
------  -------       ---------         -----------
0x1000  MP Tray Empty HOP_TRAY_EMPTY    INFORMATION
0x4000  Tray 1 empty  HOP_TRAY_01_EMPTY INFORMATION
0x8000  Tray 2 empty  HOP_TRAY_02_EMPTY INFORMATION
0x10000 Tray 3 empty  HOP_TRAY_03_EMPTY INFORMATION
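If the real file also contains whole lines of commented-out _stprintf calls, one possible pre-pass is to blank those lines before running either regex. This is only a sketch under assumptions: Remove-CommentedLine is a hypothetical helper name, the sample text is invented, and it only handles lines that start with // after optional indentation. It does not touch /* ... */ block comments or trailing // comments.

```powershell
function Remove-CommentedLine {
    # Hypothetical helper: blanks lines that begin with // (after optional indentation).
    param([string]$Code)
    # (?m) lets ^ and $ match at every line; [ \t]* avoids swallowing blank lines.
    return $Code -replace '(?m)^[ \t]*//.*$', ''
}

# Invented sample: one commented-out call and one live call.
$sample = @(
    'm_dwErrorCode = 0x4000;'
    '// _stprintf(m_OutDevStr, _T("0x9999 - %s(Old message)"), errorStr);'
    '_stprintf(m_OutDevStr, _T("0x4000 - %s(Tray 1 empty)"), errorStr);'
) -join "`n"

$clean = Remove-CommentedLine -Code $sample
```

The cleaned string can then be passed to the function or regex in place of the original text, so the returned data structure stays the same.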