Replacement of "readarray -d / -t" (for splitting path) on older bash?

I need to run a script that uses readarray -d / -t to split filepaths into arrays, but the readarray on the target system doesn't support the -d option (bash version 4.2.46).

As I don't know the exact behaviour of readarray -d / -t, it is difficult for me to write a workaround for it. It doesn't seem possible to replace it with IFS=/ read -a because filenames containing control characters will break it, as shown here:

IFS=/ read -a arr < <(echo /home/fravadona/$'\n'/underneath)
declare -p arr
# OUTPUT: declare -a arr='([0]="" [1]="home" [2]="fravadona")'

So, my first question is, what's the expected result of:

readarray -d / -t arr < <(echo /home/fravadona)
readarray -d / -t arr < <(echo /home/fravadona/)
readarray -d / -t arr < <(echo /home/fravadona/$'\n'/underneath)

And lastly, does the following code emulate readarray -d / -t correctly?

filepath=/home/fravadona/$'\n'/underneath

unset arr; declare -a arr

prefix="${filepath%%/*}"
suffix="${filepath#*/}"

until [ "X$prefix" == "X$suffix" ]
do
    arr+=( "$prefix" )
    prefix="${suffix%%/*}"
    suffix="${suffix#*/}"
done

arr+=( "$prefix" )

CodePudding user response：

I hope this achieves what you expect:

filepath=/home/fravadona/$'\n'/underneath
IFS=/ read -d "" -r -a arr < <(printf "%s" "$filepath")
declare -p arr
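
A quick way to see what this one-liner yields is to wrap it in a function and count the elements for a few sample paths. This is only a sketch: split_with_read is a hypothetical helper name, and the "|| true" is an assumption added because read returns non-zero when it reaches end of input without finding its delimiter.

```shell
#!/usr/bin/env bash
# split_with_read is a hypothetical helper, not part of the answer above.
split_with_read() {
    arr=()
    # -d '' lets read consume the whole input, so embedded newlines are kept.
    # read reports failure at EOF, hence "|| true" (matters under set -e).
    IFS=/ read -d '' -r -a arr < <(printf '%s' "$1") || true
}

for p in /home/fravadona /home/fravadona/ /home/fravadona/$'\n'/underneath; do
    split_with_read "$p"
    echo "${#arr[@]} elements"
done
# prints 3, 3 and 5 elements respectively
```

Note that printf '%s' is used instead of echo, so no trailing newline ends up in the last element.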

CodePudding user response：

What is the exact behaviour of bash "readarray -d / -t"?

Read all input. Split it on /. Remove /. Assign to array.

what's the expected result of:

The following script:

readarray -d / -t arr < <(echo /home/fravadona)
declare -p arr
readarray -d / -t arr < <(echo /home/fravadona/)
declare -p arr
readarray -d / -t arr < <(echo /home/fravadona/$'\n'/underneath)
declare -p arr

outputs:

declare -a arr=([0]="" [1]="home" [2]=$'fravadona\n')
declare -a arr=([0]="" [1]="home" [2]="fravadona" [3]=$'\n')
declare -a arr=([0]="" [1]="home" [2]="fravadona" [3]=$'\n' [4]=$'underneath\n')

does the following code emulate readarray -d / -t correctly?

Your code does not read from stdin, so it does not emulate readarray. It seems you assume the input is stored in a variable.

The following code:

f() {
    IFS= read -d '' -r filepath
    arr=()

    prefix="${filepath%%/*}"
    suffix="${filepath#*/}"

    until [ "X$prefix" == "X$suffix" ]
    do
        arr+=( "$prefix" )
        prefix="${suffix%%/*}"
        suffix="${suffix#*/}"
    done

    arr+=( "$prefix" )
}
f < <(echo /home/fravadona)
declare -p arr
f < <(echo /home/fravadona/)
declare -p arr
f < <(echo /home/fravadona/$'\n'/underneath)
declare -p arr

outputs:

declare -a arr=([0]="" [1]="home" [2]=$'fravadona\n')
declare -a arr=([0]="" [1]="home" [2]="fravadona" [3]=$'\n')
declare -a arr=([0]="" [1]="home" [2]="fravadona" [3]=$'\n' [4]=$'underneath\n')

The output is the same, so with some adjustments the answer would be yes.

It's odd that you use the [ X trick nowadays; the old bugs are long gone, and anyway == is non-standard for [. In bash, just use [[ "$prefix" == "$suffix" ]].

Related: Need alternative to readarray/mapfile for script on older version of Bash. I would go with zero bytes, something like:

f() {
    arr=()
    # Read NUL-delimited elements; the || keeps a final element with no trailing NUL.
    while IFS= read -d '' -r elem || [[ -n "$elem" ]]; do
        arr+=("$elem")
    done < <(tr '/' '\0')
}
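
The zero-byte idea can be tried out on a path held in a variable. The following is a sketch under a few assumptions: split_path is a hypothetical name, and it takes the path as its first argument rather than reading stdin, purely to keep the example self-contained.

```shell
#!/usr/bin/env bash
# split_path is a hypothetical name; taking the path as "$1" is an
# assumption made for this sketch.
split_path() {
    local elem
    arr=()
    # Turn every / into a NUL byte, then read NUL-delimited chunks.
    # "|| [[ -n $elem ]]" keeps a last component not followed by a NUL.
    while IFS= read -r -d '' elem || [[ -n $elem ]]; do
        arr+=("$elem")
    done < <(printf '%s' "$1" | tr '/' '\0')
}

split_path /home/fravadona/$'\n'/underneath
echo "${#arr[@]}"             # 5
printf '%q\n' "${arr[@]}"     # one shell-quoted element per line
```

Since a pathname cannot contain a NUL byte, the translated delimiter cannot collide with anything inside a component.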

CodePudding user response：

It doesn't [seem] possible to replace it with IFS=/ read -a because filenames containing control characters will break it,

Specifically, filenames containing newlines will break it. Other control characters should not be an issue.
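
A small sketch of the difference, assuming a tab as the "other" control character (the variable names are made up for the example, and -r is added so that backslashes are left alone):

```shell
#!/usr/bin/env bash
# A tab inside a component survives, because IFS contains only "/".
IFS=/ read -r -a with_tab < <(printf '%s\n' /home/$'\t'/underneath)
# A newline ends the line that read consumes, so the rest is never seen.
IFS=/ read -r -a with_newline < <(printf '%s\n' /home/$'\n'/underneath)

echo "${#with_tab[@]}"       # 4: empty, home, a tab, underneath
echo "${#with_newline[@]}"   # 2: empty, home
```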

So, my first question is, what's the expected result of:

readarray -d / -t arr < <(echo /home/fravadona)
readarray -d / -t arr < <(echo /home/fravadona/)
readarray -d / -t arr < <(echo /home/fravadona/$'\n'/underneath)

They are equivalent to, respectively,

arr=('' home $'fravadona\n')
arr=('' home fravadona $'\n')
arr=('' home fravadona $'\n' $'underneath\n' )

Note that each array has an empty first element corresponding to the first "line", with the trailing delimiter removed.

And note in particular that the last element of every result contains a trailing newline. This is because echo ends its output with one by default, and you are instructing readarray to use a different line delimiter.

You might, then, also be interested in these ...

readarray -d / -t arr < <(echo -n /home/fravadona)
readarray -d / -t arr < <(echo -n /home/fravadona/)
readarray -d / -t arr < <(echo -n /home/fravadona/$'\n'/underneath)

... which produce the same results as ...

arr=('' home fravadona)
arr=('' home fravadona)
arr=('' home fravadona $'\n' underneath)

And lastly, does the following code emulate readarray -d / -t correctly?

unset arr; declare -a arr

prefix="${filepath%%/*}"
suffix="${filepath#*/}"

until [ "X$prefix" == "X$suffix" ]
do
    arr+=( "$prefix" )
    prefix="${suffix%%/*}"
    suffix="${suffix#*/}"
done

arr+=( "$prefix" )

No, it does not, at least because the condition [ "X$prefix" == "X$suffix" ] is wrong. For example, if $filepath is foo/foo then no iterations of the loop will be performed, and the final value of arr will contain only one element, not two. Or if $filepath is /foo/bar/bar then the end result has only three elements (counting the empty first one), not four.
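
The failure is easy to reproduce. Here is a sketch that wraps the loop from the question in a function (split_eq is a hypothetical name, and the test is written with [[ ]] instead of the X trick, which does not change the outcome):

```shell
#!/usr/bin/env bash
# split_eq is a hypothetical wrapper around the loop from the question.
split_eq() {
    local prefix="${1%%/*}" suffix="${1#*/}"
    arr=()
    # Stops as soon as prefix and suffix happen to be equal, which also
    # occurs when two adjacent components have the same name.
    until [[ "$prefix" == "$suffix" ]]; do
        arr+=("$prefix")
        prefix="${suffix%%/*}"
        suffix="${suffix#*/}"
    done
    arr+=("$prefix")
}

split_eq foo/foo
echo "${#arr[@]}"   # 1, where readarray -d / -t yields 2
split_eq /foo/bar/bar
echo "${#arr[@]}"   # 3 (including the empty first element), where readarray yields 4
```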

This would be better:

unset arr
declare -a arr

case $filepath in
*/) suffix=$filepath;;
*) suffix=${filepath}/;;
esac

while [[ -n "$suffix" ]]; do
  arr+=("${suffix%%/*}")
  suffix=${suffix#*/}
done
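
To check this loop against the cases discussed above, it can be wrapped in a function. This is a sketch, not a drop-in replacement: split_fixed is a hypothetical name, and it assumes the expansion is quoted so that empty or whitespace-only components are kept as elements.

```shell
#!/usr/bin/env bash
# split_fixed is a hypothetical name for the loop above, taking the path as "$1".
split_fixed() {
    local suffix
    arr=()
    # Make sure the string ends in exactly one terminating slash.
    case $1 in
        */) suffix=$1 ;;
        *)  suffix=$1/ ;;
    esac
    # Peel off one component per iteration until nothing is left.
    while [[ -n $suffix ]]; do
        arr+=("${suffix%%/*}")
        suffix=${suffix#*/}
    done
}

for p in foo/foo /foo/bar/bar /home/fravadona/; do
    split_fixed "$p"
    echo "$p -> ${#arr[@]}"
done
# prints: foo/foo -> 2, /foo/bar/bar -> 4, /home/fravadona/ -> 3
```

One caveat: an empty path gives a single empty element here, whereas readarray given empty input leaves the array empty, so that case may need special handling.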