Search for pattern in column data from csv file

Time:07-13

I have a CSV file with data like the following:

SYMM_ID         DATE                    INSTANCE            Total Response Time
297900076   01-06-2022 05:00    SG_SG_ORACLUL_L_PRDPRF  0.31
297900076   01-06-2022 05:05    SG_SG_ORACLUL_L_NPRDPRF 0.5
297900076   01-06-2022 14:50    SG_SG_ORACLUL_L_PRDPRF  0.62
297900076   01-06-2022 14:55    SG_SG_ORACLUL_L_PRDPRF  0.53
297900076   01-06-2022 15:00    SG_SG_ORACLUL_K_PRDPRF  0.61
297900076   01-06-2022 15:05    SG_SG_ORACLUL_M_PRDPRF  0.7
..............

I need to search for the patterns below and fetch only the rows that match them:

o   SG_SG_xxxxxxxx_L_NPRDPRF
o   SG_SG_xxxxxxxx_L_NPRDSTD
o   SG_SG_xxxxxxxx_L_PRDPRF
o   SG_SG_xxxxxxxx_L_PRDSTD

I tried -match, but it does not seem to work:

$GetData = Import-Csv -Path "C:\DIAG_2.csv" | Select SYMM_ID,DATE,INSTANCE,'Total Response Time'

$Local_Data = $GetData | where {($_.INSTANCE -match '_L_NPRDPRF') -and ($_.INSTANCE -match '_L_NPRDSTD')}

Please let me know how to do this

CodePudding user response:

I would use a regex like '^SG_SG_.+_L_N??PRD(?:PRF|STD)$'.

Using your example data:

$Local_Data = $GetData | Where-Object { $_.Instance -match '^SG_SG_.+_L_N??PRD(?:PRF|STD)$' }

will return

SYMM_ID   DATE             INSTANCE                Total Response Time
-------   ----             --------                -------------------
297900076 01-06-2022 05:00 SG_SG_ORACLUL_L_PRDPRF  0.31               
297900076 01-06-2022 05:05 SG_SG_ORACLUL_L_NPRDPRF 0.5                
297900076 01-06-2022 14:50 SG_SG_ORACLUL_L_PRDPRF  0.62               
297900076 01-06-2022 14:55 SG_SG_ORACLUL_L_PRDPRF  0.53 
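
To sanity-check the pattern without the CSV file, it can be run against a few INSTANCE values copied from the question. This is only a minimal sketch: the variable names $instances and $pattern are illustrative, and the pattern is written with `.+` after the prefix, matching the "one or more characters" step in the regex breakdown.

```powershell
# INSTANCE values copied from the sample data in the question
$instances = @(
    'SG_SG_ORACLUL_L_PRDPRF'
    'SG_SG_ORACLUL_L_NPRDPRF'
    'SG_SG_ORACLUL_K_PRDPRF'
    'SG_SG_ORACLUL_M_PRDPRF'
)

# Same pattern as above; '.+' stands for the variable middle part of the name
$pattern = '^SG_SG_.+_L_N??PRD(?:PRF|STD)$'

foreach ($name in $instances) {
    # With a single string on the left, -match returns $true or $false
    '{0,-25} {1}' -f $name, ($name -match $pattern)
}
```

The two _L_ names should report True, and the _K_ and _M_ names False.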

Regex details:

^                 Assert position at the beginning of the string (-match does not enable multiline mode, so line breaks inside the string are not considered)
SG_SG_            Match the characters “SG_SG_” literally
.                 Match any single character that is not a line break character
   +              Between one and unlimited times, as many times as possible, giving back as needed (greedy)
_L_               Match the characters “_L_” literally
N                 Match the character “N” literally
   ??             Between zero and one times, as few times as possible, expanding as needed (lazy)
PRD               Match the characters “PRD” literally
(?:               Match the regular expression below
                  Match either the regular expression below (attempting the next alternative only if this one fails)
      PRF         Match the characters “PRF” literally
   |              Or match regular expression number 2 below (the entire group fails if this one fails to match)
      STD         Match the characters “STD” literally
)                
$                 Assert position at the end of the string (or before a line break at the very end of the string)
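
As for why the attempt in the question returns nothing: it joins the two tests with -and, so a row would have to contain both '_L_NPRDPRF' and '_L_NPRDSTD' at the same time, which a single INSTANCE value cannot do. It also tests only the NPRD variants, not the PRD ones. The sketch below illustrates the difference on one value taken from the sample data; $name is just an illustrative variable.

```powershell
$name = 'SG_SG_ORACLUL_L_NPRDPRF'

# -and: both substrings must be present in the same value, so this is False
$both = ($name -match '_L_NPRDPRF') -and ($name -match '_L_NPRDSTD')

# -or: either substring is enough, so this is True
$either = ($name -match '_L_NPRDPRF') -or ($name -match '_L_NPRDSTD')

"with -and: $both"
"with -or : $either"
```

Chaining four -or tests (one per pattern) would also work; the single regex above simply expresses the same four alternatives in one expression.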