Bash Regex to extract everything between the last occurrence of a string (release-) and some charact

Time:10-18

I have multiple strings from which I want to extract everything between the last occurrence of a string (release-) and some characters (--). More specifically, for a string like the following:

inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE

I want to have the following output:

PI_4.1-Sprint-3.1a

I created a regex online, which you can find here. The regex is the following:

.*release-(.*)--.*

However, when I try to use this regex in a bash script, it won't work. Here is an example:

artifactoryVersion="inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE"

 [[ "$artifactoryVersion" =~ (.*release-(.*)--.*) ]]

 echo $BASH_REMATCH[0]
 echo $BASH_REMATCH[1]

Will return:

inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE[0]
inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE[1]

Do you have any ideas about how I can accomplish my goal in bash?

CodePudding user response:

You need to use the following:

#!/bin/bash

artifactoryVersion="inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE"
if [[ "$artifactoryVersion" =~ .*release-(.*)-- ]]; then
 echo "${BASH_REMATCH[1]}"
fi

Output:

PI_4.1-Sprint-3.1a
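
Since the question mentions multiple strings, here is a minimal sketch of the same regex applied in a loop. Only the first entry comes from the question; the other two inputs, the `versions` and `results` array names, and the `<no match>` placeholder are assumptions made up for illustration.

```bash
#!/usr/bin/env bash
# First entry: the sample from the question. The other two are made-up inputs.
versions=(
    'inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE'
    'feature-release-PI_5.0-Sprint-1.2--2.0.0-RELEASE'
    'no-marker-here'
)

# Same pattern as in the answer above, stored in a variable.
regex='.*release-(.*)--'

results=()
for v in "${versions[@]}"; do
    if [[ $v =~ $regex ]]; then
        # Group 1 holds the text between the last "release-" and "--".
        results+=("${BASH_REMATCH[1]}")
    else
        # Placeholder for inputs that do not match at all.
        results+=('<no match>')
    fi
done

printf '%s\n' "${results[@]}"
```

Testing the result of `[[ ... =~ ... ]]` before reading `BASH_REMATCH` matters here, because a failed match would otherwise leave you printing stale or empty values.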

CodePudding user response:

You may use:

s='inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE'
rx='.*-release-(.*)--'
[[ $s =~ $rx ]] && echo "${BASH_REMATCH[1]}"

PI_4.1-Sprint-3.1a

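Your regex appears correct, but make sure to write "${BASH_REMATCH[1]}", with braces, to extract the first capture group. Without the braces, bash expands $BASH_REMATCH (element 0, the whole match) and then appends the literal text [1], which is exactly the output you are seeing. Note also that with the outer parentheses in your original pattern, group 1 is the whole match and the value you want ends up in group 2.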
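
To make the difference concrete, here is a small sketch that uses the pattern from the question unchanged (outer parentheses included) and compares the two spellings. The variable names `unbraced` and `braced` are only for illustration.

```bash
#!/usr/bin/env bash
# Sample string and regex from the question, outer parentheses included.
s='inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE'
rx='(.*release-(.*)--.*)'

if [[ $s =~ $rx ]]; then
    # No braces: expands element 0, then appends the literal text "[2]".
    unbraced="$BASH_REMATCH[2]"
    # With braces: the subscript is applied, giving the second capture group.
    braced="${BASH_REMATCH[2]}"

    printf 'unbraced: %s\n' "$unbraced"
    printf 'braced:   %s\n' "$braced"
fi
```

The unbraced line prints the full input followed by `[2]`, while the braced line prints `PI_4.1-Sprint-3.1a`.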

CodePudding user response:

With your shown samples, please try the following bash code with a regex. I have added a comment before each statement to explain what it does.

##Create a shell variable named var holding the sample string.
var="inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE"

##Store the regex in a variable so it can be used unquoted in the test below.
regex="(.*release-release)-(.*)--"

##Match var against the regex; if it matches, print the 2nd capturing group.
[[ $var =~ $regex ]] && echo "${BASH_REMATCH[2]}"

Explanation of regex:

  • regex="(.*release-release)-(.*)--": creates a shell variable named regex that holds the regular expression (.*release-release)-(.*)--.
  • The regex creates 2 capturing groups.
  • The first group greedily matches everything up to and including release-release. It is followed by a - that is not captured.
  • The second group is another greedy match that takes everything before --, which is exactly the value needed.
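
To see what each group actually holds, a quick sketch like the following prints every element of BASH_REMATCH for this regex. The names `group1` and `group2` are just illustrative copies of the two capture groups.

```bash
#!/usr/bin/env bash
var='inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE'
regex='(.*release-release)-(.*)--'

if [[ $var =~ $regex ]]; then
    # Index 0 is the whole matched portion; 1 and 2 are the capture groups.
    for i in "${!BASH_REMATCH[@]}"; do
        printf 'BASH_REMATCH[%d] = %s\n' "$i" "${BASH_REMATCH[i]}"
    done

    # Keep copies, since the next =~ test overwrites BASH_REMATCH.
    group1=${BASH_REMATCH[1]}
    group2=${BASH_REMATCH[2]}
fi
```

Group 1 is `inte_integration-abc-abcde-abcdefg-release-release` and group 2 is `PI_4.1-Sprint-3.1a`. Element 0 stops at the `--` because the pattern has no trailing `.*`.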

CodePudding user response:

With standard shell parameter expansions (2x slower than a bash regex):

artifactoryVersion='inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE'

result=${artifactoryVersion##*-release-}
result=${result%%--*}

printf '%s\n' "$result"

Output:

PI_4.1-Sprint-3.1a
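
If you need this for several strings, the two expansions can be wrapped in a function. This is only a sketch: `extract_release` is a hypothetical helper name, and the second input in the usage lines is a made-up string.

```bash
#!/usr/bin/env bash
# extract_release is a hypothetical helper name, not part of the answer above.
extract_release() {
    local value=$1
    # Remove the longest prefix that ends in "-release-" (up to the last one).
    value=${value##*-release-}
    # Remove the longest suffix that starts with "--" (from the first "--" on).
    value=${value%%--*}
    printf '%s\n' "$value"
}

extract_release 'inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE'
extract_release 'x-release-a--b--c'   # made-up input with two "--" separators
```

One difference from the regex answers: for a string with several `--`, `%%--*` cuts at the first `--` (giving `a` for the second call), whereas the greedy regex keeps everything up to the last `--`. Using `%--*` instead would cut at the last one. An input without `-release-` is returned with only the suffix removed.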

And another solution in bash with extended globbing:

#!/bin/bash
shopt -s extglob

artifactoryVersion='inte_integration-abc-abcde-abcdefg-release-release-PI_4.1-Sprint-3.1a--1.0.2-RELEASE'

echo "${artifactoryVersion//@(*-release-|--*)}"

Output:

PI_4.1-Sprint-3.1a