Bash: what is the most efficient way to handle I/O?

I have a bash script which does a lot of string manipulation. As far as I know, reading from a file is slow, so instead of doing it every time I need its contents, I read the whole file once at the beginning of the script:

readarray -t lines < "$filename"

But every time I need to feed the lines to a program that reads standard input (e.g., awk, cut, grep), I still have to print them and build a pipeline. Here's an example that finds the first line containing a colon:

line=$(printf -- '%s\n' "${lines[@]}" | grep -n -m 1 :)

So I started wondering: didn't I just make things slower by adding an extra printf call and a pipeline? What's the best way to handle this situation?

CodePudding user response:

If you want to see which one is faster, you can use time:

time readarray -t lines < "$filename"

time line=$(printf -- '%s\n' "${lines[@]}" | grep -n -m 1 :)

That reports the real, user, and system time taken by each command, and lets you see which one is faster.
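
For a rough side-by-side comparison you could time the array-plus-pipeline approach against letting grep read the file directly. This is only an illustrative sketch; big.txt is an assumed sample file and the numbers will depend on your system:

filename=big.txt

# read the whole file into the array once
time readarray -t lines < "$filename"

# approach 1: print the preloaded array and pipe it to grep
time line=$(printf -- '%s\n' "${lines[@]}" | grep -n -m 1 :)

# approach 2: let grep read the file itself
time line=$(grep -n -m 1 : "$filename")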

CodePudding user response:

You can use the bash-specific <<< (here-string) operator to feed variables to commands on standard input without echo/printf-ing them into a pipeline:

λ printf "test\nline\n" > file
λ cat file
test
line
λ readarray -t lines < file
λ wc -c <<< "${lines[0]}"
5
λ printf "%s" "${lines[0]}"
test
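
The same operator also works with the whole array, which removes the printf pipeline from the question entirely (a sketch based on the lines array above; grep is still forked once):

# join the array elements with newlines inside the command substitution's
# subshell, then hand the result to grep on a here-string
line=$(IFS=$'\n'; grep -n -m 1 : <<< "${lines[*]}")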

Also, instead of reading the file into an array, you could consume it line by line with something like this, assuming you don't need all the contents at once:

while read -r line; do
    grep -n -m1 ':' <<< "$line" && {
        echo "Got colon"
        break
    }
done < "$filename"
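
Note that the loop above forks a grep process for every line, which usually costs far more than the read itself. A pure-bash variant (a sketch, not part of the original answer) keeps the line-by-line reading but matches the colon with the [[ ... ]] pattern test instead:

n=0
while IFS= read -r line; do
    n=$((n + 1))
    # test for a colon without forking an external process
    if [[ $line == *:* ]]; then
        printf '%d:%s\n' "$n" "$line"   # mimic grep -n output
        break
    fi
done < "$filename"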