Bash regex count hyphen backwards

Time:10-13

Given the following input format:

blablabla-blebleble-bliblibli-1.2.3-xxx-1blo

I want to have 2 regex to get:

  1. blablabla-blebleble-bliblibli
  2. 1.2.3-xxx-1blo

There's no guarantee that the first part ('blablabla-blebleble-bliblibli') will always contain two hyphens, but the second part ('1.2.3-xxx-1blo') is 100% certain to contain exactly two, and the two parts are always separated by a hyphen.

I managed to do this using cut, but it's slow for massive operations, so I'm hoping that a bash regex can improve performance. I also tried [^-]+(?:-[^-]+){2}$ on regex101.com and it works there, but it doesn't in bash:

test='blablabla-blebleble-bliblibli-1.2.3-xxx-1blo'
[[ $test =~ [^-]+(?:-[^-]+){2}$ ]]; echo $BASH_REMATCH

CodePudding user response:

?: marks a non-capturing group in other regex implementations (PCRE, for example), but bash's =~ operator uses POSIX extended regular expressions, which have no non-capturing groups: https://tldp.org/LDP/abs/html/x17129.html

Try without it:

[[ $test =~ [^-]+(-[^-]+){2}$ ]] && echo matched=$BASH_REMATCH

Prints:

matched=1.2.3-xxx-1blo

By the way, I used the && operator so that the echo only runs, and prints the match, when the match operation succeeds.

Edit:

To get the first part, you can use bash's parameter expansion to strip the matched suffix: https://tldp.org/LDP/abs/html/refcards.html#AEN22828

[[ $test =~ [^-]+(-[^-]+){2}$ ]]
second=$BASH_REMATCH
first=${test%-$second}
echo $first $second

prints:

blablabla-blebleble-bliblibli 1.2.3-xxx-1blo
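If the regex is not strictly needed, the same split can probably be done with parameter expansion alone. This is a different technique from the one above, sketched here on the assumption that the second part always contains exactly two hyphens, as the question states. The variable names str and rest are only illustrative.

```bash
#!/usr/bin/env bash
str='blablabla-blebleble-bliblibli-1.2.3-xxx-1blo'

# Each ${var%-*} removes the shortest suffix that starts with a hyphen,
# i.e. the last hyphen-separated field.
rest=${str%-*}      # drops "-1blo"
rest=${rest%-*}     # drops "-xxx"
first=${rest%-*}    # drops "-1.2.3"

# Remove the first part and the separating hyphen from the front.
second=${str#"$first"-}

echo "$first"
echo "$second"
```

No external process is started here, which is what matters when the operation is repeated many times.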

CodePudding user response:

Would you please try the following:

str="blablabla-blebleble-bliblibli-1.2.3-xxx-1blo"

if [[ $str =~ ^(.*)-([^-]+-[^-]+-[^-]+)$ ]]; then
    echo "${BASH_REMATCH[1]}"
    echo "${BASH_REMATCH[2]}"
fi

Output:

blablabla-blebleble-bliblibli
1.2.3-xxx-1blo
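Since the question is about running this on many inputs, the anchored regex can be wrapped in a function and called in a loop. The following is only a sketch: split_name is a hypothetical helper name, it returns its results in the global variables first and second, and the second input string is made-up sample data.

```bash
#!/usr/bin/env bash

# Hypothetical helper: splits its argument at the third hyphen from the end.
# On success it sets the globals "first" and "second" and returns 0;
# if the argument does not fit the pattern it returns 1.
split_name() {
    # Keeping the pattern in a variable avoids quoting surprises inside [[ ]].
    local re='^(.*)-([^-]+-[^-]+-[^-]+)$'
    if [[ $1 =~ $re ]]; then
        first=${BASH_REMATCH[1]}
        second=${BASH_REMATCH[2]}
        return 0
    fi
    return 1
}

inputs=(
    "blablabla-blebleble-bliblibli-1.2.3-xxx-1blo"
    "foo-2.0-yyy-3bar"   # made-up example with a one-segment first part
)

for item in "${inputs[@]}"; do
    if split_name "$item"; then
        printf '%s | %s\n' "$first" "$second"
    else
        printf 'no match: %s\n' "$item" >&2
    fi
done
```

Returning the results through variables instead of a command substitution keeps the loop free of subshells, which is where most of the cost of the cut-based approach is likely to come from.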

BTW, test is the name of a shell builtin command, so it is better not to use it as a variable name.
