How to delete rows that contain the following extensions with a regex pattern?

Time:09-30

I want to delete rows based on the column "Script or expected file(s)". This column contains either the word 'technical', an empty value, or a file name with an extension. A row should be deleted only if the column is empty, or if it contains one of the following extensions:
.txt_go
.zip
.prd
.xml
.go
.csv
.txt
.xlsx
or if it contains _00* ??????????????

This is the CSV file:

Jobstream,"Jobstream Description","Op num","Job","Script or expected file(s)","Server","user","location","Job Description"
\ACTO\data\out\PACTO500\f_ref12.prd,"WAIT TRGFIC-ACTO001","9","","\ACTO\data\out\PACTO500\f_ref12.prd","","","","START"
ACTOTRGAAA,"WAIT TRGFIC-ACTO001","10","PACTOAAA","technical","","","","ADDJOBSTREAM"
\prod\FACIMPORT\*.xml,"WAIT TRGFIC-ACTO001","9","","\prod\FACIMPORT\*.xml","","","","START"
ACTOTRGAAB,"WAIT TRGFIC-ACTO001","10","PACTOAAB","technical","","","","ADDJOBSTREAM"
\prod\TREATIESDECLARATIONIMPORT\*.xml,"WAIT TRGFIC-ACTO002","9","","\prod\TREATIESDECLARATIONIMPORT\*.xml","","","","START"
ACTOTRGAAC,"WAIT TRGFIC-ACTO002","10","PACTOAAD","technical","","","","ADDJOBSTREAM"
\prod\ACTO\data\in\PACTO003\*_desc.xml,"WAIT TRGFIC-ACTO560","9","","\prod\ACTO\data\in\PACTO003\*_desc.xml","","","","START"
ACTOTRGAAD,"WAIT TRGFIC-ACTO560","10","PACTOAAE","technical","","","","ADDJOBSTREAM"
ROD\ACTO\data\archives\f_ref12.prd.xml_????????_??????,"WAIT TRGFIC-ACTO999","9","","ROD\ACTO\data\archives\f_ref12.prd.xml_????????_??????","","","","START"
\REINSURANCE_DATA\client\01-Sources\*.xlsx,"WAIT TRGFIC-SHIN001","9","","\REINSURANCE_DATA\client\01-Sources\*.xlsx","","","","START"
SHINTRGAAA,"WAIT TRGFIC-SHIN001","10","PSHINAAB","technical","","","","ADDJOBSTREAM"
\prod\SHIN\data\in\PSHIN004\*.zip,"WAIT TRGFIC-SHIN003","9","","\prod\SHIN\data\in\PSHIN004\*.zip","","","","START"
\prod\AGPC\WEBX.001\flg\trt.go,"WAIT TRGFIC-WEBX001","9","","\prod\AGPC\WEBX.001\flg\trt.go","","","","START"
WEBXTRGAAA,"WAIT TRGFIC-WEBX001","10","PWEBXAAC","technical","","","","ADDJOBSTREAM"
\prod\AGPC\WEBX.002\in\PRTCP.csv,"Run Participations ACTOR","9","","\prod\AGPC\WEBX.002\in\PRTCP.csv","","","","START"
WEBXTRGAAB,"Run Participations ACTOR","10","PWEBXAAD","technical","","","","ADDJOBSTREAM"
\prod\AGPC\COPERNIC_LIL\WEBXL_COP\in\EC_AXACES.csv,"WAIT TRGFIC-WEBXWX2","9","","\prod\AGPC\COPERNIC_LIL\WEBXL_COP\in\EC_AXACES.csv","","","","START"
WEBXTRGAAC,"WAIT TRGFIC-WEBXWX2","10","PWEBXAAE","technical","","","","ADDJOBSTREAM"
\prod\WEBX\data\in\PWEBXWX1\LIL_AH_I100_WX101_00*,"WAIT TRGFIC-WEBX224","9","","\prod\WEBX\data\in\PWEBXWX1\LIL_AH_I100_WX101_00*","","","","START"
WRPT100Q,"REPORTXL","40","PWRPTTAG","technical","","","","Envoi mail utilisateurs"
WRPT-100Q-005T,"REPORTXL","45","PWRPT0B4","PWRPT-100Q-005T.BAT","PRAXCAPP02","AXA-CESSIONS\SVC_SCHEDULING","F WRPT-007","Export (DataPump) AGPC"
WRPT-100Q-015T,"REPORTXL","55","PWRPT0B6","PWRPT-100Q-015T.KSH","PRATFUDMGTW01","svcudmu","F WRPT-004","Transfert Fichiers DUMP"
WRPT-100Q-025T,"REPORTXL","75","PWRPT0CA","PWRPT-100Q-025T.BAT","PRAXCAPP02","AXA-CESSIONS\SVC_SCHEDULING","F WRPT-007","Import (DataPump) AGPC"

CodePudding user response:

Use Where-Object to filter your data:

# import data 
$data = Import-Csv .\path\to\file.csv

# define list of extensions to filter out
$excludedExtensions = -split @'
.txt_go
.zip
.prd
.xml
.go
.csv
.txt
.xlsx
'@

# filter data
$data | Where-Object {
  $value = $_.'Script or expected file(s)'

  # filter out row if the column is empty
  if([string]::IsNullOrWhiteSpace($value)){
    return $false
  }

  foreach($extension in $excludedExtensions){
    if($value -like "*$extension"){
      # immediately return $false and filter out row if ANY extension matches
      return $false
    }
  }

  # finally check for *_00* and return $true if not found
  return $value -notlike '*_00*'
} | Export-Csv .\path\to\output.csv -NoTypeInformation
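
Since the question asks for a regex pattern, the same filter can also be sketched with a single -notmatch test. This is only a minimal sketch under assumptions: the sample rows below are a shortened, hypothetical excerpt (two columns only), the variable names $rows, $pattern and $kept are illustrative, and the pattern assumes an extension must appear at the end of the value while _00 may appear anywhere.

```powershell
# Hypothetical, shortened sample: only two columns are kept for brevity.
$rows = @'
"Job","Script or expected file(s)"
"PACTOAAA","technical"
"","\prod\FACIMPORT\*.xml"
"","\prod\AGPC\WEBX.001\flg\trt.go"
"","\prod\WEBX\data\in\PWEBXWX1\LIL_AH_I100_WX101_00*"
"PWRPT0B4","PWRPT-100Q-005T.BAT"
"PEMPTY",""
'@ | ConvertFrom-Csv

# Three alternatives in one pattern:
#   ^\s*$              empty (or whitespace-only) value
#   \.(...)$           one of the listed extensions at the end of the value
#   _00                the literal text "_00" anywhere in the value
$pattern = '^\s*$|\.(txt_go|zip|prd|xml|go|csv|txt|xlsx)$|_00'

# Keep only rows whose column does NOT match the pattern
# (-notmatch is case-insensitive by default).
$kept = @($rows | Where-Object { $_.'Script or expected file(s)' -notmatch $pattern })

$kept | Format-Table
```

With a real file, the here-string would be replaced by Import-Csv and the result piped to Export-Csv as in the answer above. A value such as f_ref12.prd.xml_????????_?????? is kept by both versions, because it neither ends with a listed extension nor contains _00.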