Extract information via PowerShell regex

Time:09-13

I am new to PowerShell. I have a file that contains lines like the one below. From this file, I want to extract only v1.0.2 using PowerShell.

2022-09-08T10:52:38.0802281Z Downloading git::ssh://[email protected]/v3/basf-terraform/Terraform_modules/azure_private_endpoint?ref=v1.0.2 for resuc_dls1_pep1...

Answer:

There's a module I built that contains a lot of regular expressions, including one that could help here.

The module is called Irregular. You can get started by installing the module (Install-Module Irregular -Scope CurrentUser).

In it, there are two regular expressions that can help you out:

  • ?<Code_BuildVersion>
  • ?<Code_SemanticVersion>

To see the definition of either, after you've installed and imported the module, run:

?<Code_BuildVersion>
?<Code_SemanticVersion>

In your string, ?<Code_BuildVersion> would match both '38.0802281' (part of the timestamp) and '1.0.2'. ?<Code_SemanticVersion> only matches a version with at least 3 parts, and thus will only find the 1.0.2.
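
To see why the number of required parts matters, here is a minimal sketch using plain .NET regexes. The two patterns are assumed approximations written for illustration; they are not Irregular's actual definitions.

```powershell
$line = '2022-09-08T10:52:38.0802281Z Downloading git::ssh://[email protected]/v3/basf-terraform/Terraform_modules/azure_private_endpoint?ref=v1.0.2 for resuc_dls1_pep1...'

# Assumed approximations for illustration only (not the module's patterns).
$twoPartPattern   = '\d+\.\d+(?:\.\d+){0,2}'    # at least 2 numeric parts
$threePartPattern = '\d+\.\d+\.\d+(?:\.\d+)?'   # at least 3 numeric parts

$twoPartHits   = [regex]::Matches($line, $twoPartPattern)   | ForEach-Object { $_.Value }
$threePartHits = [regex]::Matches($line, $threePartPattern) | ForEach-Object { $_.Value }

$twoPartHits     # 38.0802281 and 1.0.2
$threePartHits   # 1.0.2
```

The two-part pattern picks up the fractional seconds of the timestamp because '38.0802281' looks like "major.minor"; requiring a third part rules it out.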

To make this work, there are a few options:

  1. Use ?<Code_SemanticVersion> to match
$LogLine = '2022-09-08T10:52:38.0802281Z Downloading git::ssh://[email protected]/v3/basf-terraform/Terraform_modules/azure_private_endpoint?ref=v1.0.2'
$logLine | ?<Code_SemanticVersion> | Select-Object -ExpandProperty Value
  2. Create a new regex based on ?<Code_BuildVersion>, using 'v' as its start:
$findVersion = [Regex]::New('
v
(?<Code_BuildVersion>
(?<Major>\d+)
\.
(?<Minor>\d+)
(?:\.(?<Build>\d+))?
(?:\.(?<Revision>\d+))?
)
', 'IgnorePatternWhitespace')

$findVersion.Matches('2022-09-08T10:52:38.0802281Z Downloading git::ssh://[email protected]/v3/basf-terraform/Terraform_modules/azure_private_endpoint?ref=v1.0.2')
  3. Build your own quick regex to do this.

Basically, you can "distill" the regex above into a shorter form, including the v. Note that after the -split, we need to force the

$findVersion = 'v\d+\.\d+(?:\.\d+)?(?:\.\d+)?'
# -match returns $true or $false and fills the automatic $Matches variable
$matched = '2022-09-08T10:52:38.0802281Z Downloading git::ssh://[email protected]/v3/basf-terraform/Terraform_modules/azure_private_endpoint?ref=v1.0.2' -match $findVersion
$Matches[0]  # v1.0.2
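
If you also need the individual parts, the named groups from the longer regex can be read off the match. This is a minimal sketch using only the built-in [regex] and [version] types; the variable names are chosen here for illustration.

```powershell
$line = '2022-09-08T10:52:38.0802281Z Downloading git::ssh://[email protected]/v3/basf-terraform/Terraform_modules/azure_private_endpoint?ref=v1.0.2'

# Same shape as the distilled regex, with named groups for each part.
$pattern = 'v(?<Major>\d+)\.(?<Minor>\d+)(?:\.(?<Build>\d+))?(?:\.(?<Revision>\d+))?'
$m = [regex]::Match($line, $pattern)

if ($m.Success) {
    $tag     = $m.Value                           # full match, including the leading v
    $major   = [int]$m.Groups['Major'].Value      # numeric major part
    $version = [version]$m.Value.TrimStart('v')   # System.Version, handy for comparisons
    "$tag -> major $major, parsed as $version"
}
```

Casting to [version] is optional; it helps when you want to compare or sort the extracted versions rather than just print them.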

I will also adjust the ?<Code_BuildVersion> regular expression so that it does not match when it is preceded by punctuation.

After this issue is fixed, you should be able to use either regex to extract out what you'd like.
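
Since the question is about a file, here is a hedged sketch that applies the same idea line by line without the module. Get-RefVersion is a hypothetical helper name, and the pattern assumes versions are always prefixed with 'v' and have two to four numeric parts.

```powershell
# Hypothetical helper (not part of Irregular): emits every v-prefixed version in each input line.
function Get-RefVersion {
    param(
        [Parameter(ValueFromPipeline)]
        [string]$Line
    )
    begin {
        # 'v', a number, then one to three more dot-separated numbers
        $pattern = [regex]'v\d+(?:\.\d+){1,3}'
    }
    process {
        foreach ($match in $pattern.Matches($Line)) {
            $match.Value
        }
    }
}

$sample = '2022-09-08T10:52:38.0802281Z Downloading git::ssh://[email protected]/v3/basf-terraform/Terraform_modules/azure_private_endpoint?ref=v1.0.2 for resuc_dls1_pep1...'
$found = $sample | Get-RefVersion
$found   # v1.0.2
```

For a real log you would pipe the file in, for example Get-Content .\build.log | Get-RefVersion, where build.log stands in for your own file name. The 'v3' path segment is skipped because the pattern requires at least one dot after the first number.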
