I have the following case:
regex: $'\[OK\][[:space:]]+([[:alnum:]_]+)\.([[:alnum:]_]+)([^[]*)'
text:
[OK] AAA.BBBBBB
aaabbbcccdddfffed
asdadadadadadsada
[OK] CCC.KKKKKKK
some text here
[OK] OKO.II
Now, if I run the following code:
var_test=()
while [[ $text =~ $regex ]]; do
var_test+=("${BASH_REMATCH[@]:1}")
text=${text#*"${BASH_REMATCH[0]}"}
done
declare -p var_test
I get the correct output:
declare -a var_test=([0]="AAA" [1]="BBBBBB" [2]=$'\naaabbbcccdddfffed\nasdadadadadadsada\n' [3]="CCC" [4]="KKKKKKK" [5]=$'\nsome text here\n' [6]="OKO" [7]="II" [8]="")
But once I convert it into a function like this:
function split_by_regex {
regex=$1
text=$2
groups=()
while [[ $text =~ $regex ]]; do
groups+=("${BASH_REMATCH[@]:1}")
text=${text#*"${BASH_REMATCH[0]}"}
done
echo "${groups[@]}"
}
res=($(split_by_regex "$regex" "$text"))
declare -p res
I get the wrong output:
declare -a res=([0]="AAA" [1]="BBBBBB" [2]="aaabbbcccdddfffed" [3]="asdadadadadadsada" [4]="CCC" [5]="KKKKKKK" [6]="some" [7]="text" [8]="here" [9]="OKO" [10]="II")
After some debugging, the error seems to come from the echo "${groups[@]}" line: if I inspect groups inside the function it looks as it should, but the result I get back from the function does not.
Sorry if this is an obvious question, but I am new to bash and shell scripting and I am trying to figure it out.
CodePudding user response:
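Returning arrays from functions is tricky. `echo "${groups[@]}"` joins the elements with spaces, and the unquoted `$(...)` in `res=($(...))` then splits that output on every space, tab, and newline. As you have noticed, the original element boundaries are therefore not preserved.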
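The splitting can be seen in isolation with a small sketch. The array `items` and its contents are made up for illustration and are not part of the question's code:

```shell
#!/usr/bin/env bash
# Hypothetical two-element array; one element holds a space, the other a newline.
items=("some text" $'line1\nline2')

# echo joins the elements with spaces; the unquoted $(...) is then
# split again on every space, tab, and newline (the default IFS).
flat=($(echo "${items[@]}"))

echo "${#items[@]}"   # element count before the round trip
echo "${#flat[@]}"    # element count after the round trip
declare -p flat
```

The two counts differ because nothing in the echoed text marks where one element ended and the next began.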
I suggest using a nameref (available in Bash 4.3 and later) instead.
function split_by_regex {
local -n groups=$1 # -n makes `groups` a reference to `res`
local regex=$2
local text=$3
while [[ $text =~ $regex ]]; do
groups+=("${BASH_REMATCH[@]:1}")
text=${text#*"${BASH_REMATCH[0]}"}
done
}
declare -a res # declare `res` as an array
split_by_regex res "$regex" "$text" # pass in `res` as a parameter
declare -p res # prints the expected result
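If the function has to keep returning its result through stdout, one possible variation (not what this answer uses) is to write each element NUL-terminated and read it back with `mapfile -d ''`. This is only a sketch: it assumes Bash 4.4 or later for `mapfile -d`, and the function name `split_by_regex_nul` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: same matching loop, but each captured group is written to stdout
# followed by a NUL byte instead of being stored in an array.
split_by_regex_nul() {
    local regex=$1 text=$2
    while [[ $text =~ $regex ]]; do
        # An empty capture group still produces a (zero-length) record.
        printf '%s\0' "${BASH_REMATCH[@]:1}"
        text=${text#*"${BASH_REMATCH[0]}"}
    done
}

regex=$'\[OK\][[:space:]]+([[:alnum:]_]+)\.([[:alnum:]_]+)([^[]*)'
text=$'[OK] AAA.BBBBBB\naaabbbcccdddfffed\nasdadadadadadsada\n[OK] CCC.KKKKKKK\nsome text here\n[OK] OKO.II'

# mapfile -d '' splits on NUL bytes only, so spaces and newlines
# inside an element are kept.
mapfile -d '' -t res < <(split_by_regex_nul "$regex" "$text")
declare -p res
```

Because a Bash string cannot contain a NUL byte, NUL is a safe separator here, whereas spaces and newlines are not.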
CodePudding user response:
Another approach, if your workflow and requirements allow it, is to declare the array outside of the function. Something like:
regex=$'\[OK\][[:space:]]+([[:alnum:]_]+)\.([[:alnum:]_]+)([^[]*)'
text='[OK] AAA.BBBBBB
aaabbbcccdddfffed
asdadadadadadsada
[OK] CCC.KKKKKKK
some text here
[OK] OKO.II'
#: `declare -a groups` will work as well
#: Declare it outside of the function
groups=()
function split_by_regex {
local regex=$1
local text=$2
while [[ "$text" =~ $regex ]]; do
groups+=("${BASH_REMATCH[@]:1}")
text=${text#*"${BASH_REMATCH[0]}"}
done
}
split_by_regex "$regex" "$text"
#: Now one can access/process the array `groups` outside of the function.
declare -p groups
- Without a nameref, the above code is an alternative.
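One thing to keep in mind with a global array: because the function appends to `groups`, calling it a second time would add to the results of the first call. A minimal sketch of one way to handle that, assuming each call should start from an empty array (the simplified regex and the input strings are hypothetical):

```shell
#!/usr/bin/env bash
groups=()

split_by_regex() {
    local regex=$1 text=$2
    groups=()   # assumption: every call should start with an empty array
    while [[ $text =~ $regex ]]; do
        groups+=("${BASH_REMATCH[@]:1}")
        text=${text#*"${BASH_REMATCH[0]}"}
    done
}

regex='([[:alnum:]_]+)\.([[:alnum:]_]+)'   # simplified pattern for illustration
split_by_regex "$regex" 'AAA.BBB'
split_by_regex "$regex" 'CCC.DDD EEE.FFF'

# Only the groups from the second call remain.
declare -p groups
```

If accumulating results across calls is what you want, leave the reset out.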