shell: failed to save error stream code to file

I am trying to detect when the following script (random_fail.sh) fails, which happens rarely, by running it inside a while loop in a second script (catch_error.sh):

#!/usr/bin/env bash
# random_fail.sh

 n=$(( RANDOM % 100 ))

 if [[ n -eq 42 ]]; then
    echo "Something went wrong"
    >&2 echo "The error was using magic numbers"
    exit 1
 fi

 echo "Everything went according to plan"

#!/usr/bin/env bash
# catch_error.sh

count=0  # The number of times before failing
error=0  # assuming everything initially ran fine

while [ "$error" != 1 ]; do
    # running till non-zero exit

    # writing the error code from the random_fail script into /tmp/error
    bash ./random_fail.sh 1>/tmp/msg 2>/tmp/error

    # reading from the file, assuming 0 written inside most of the times
    error="$(cat /tmp/error)"

    echo "$error"

    # updating the count
    count=$((count + 1))

done

echo "random_fail.sh failed!: $(cat /tmp/msg)"
echo "Error code: $(cat /tmp/error)"
echo "Ran ${count} times, before failing"

I expected catch_error.sh to read from /tmp/error and leave the loop once a particular run of random_fail.sh exits with 1.

Instead, the catch script seems to be running forever. I think this is because the error code is not being redirected to the /tmp/error file at all.

Please help.

CodePudding user response：

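You aren't capturing the exit code in the usual way: the shell makes it available in $? immediately after the command finishes, and it is never written to stderr. Also, there is no need to prefix the invocation with "bash" when the script already has a shebang line (provided the file is executable). Lastly, I'm curious why you don't simply use #!/bin/bash instead of #!/usr/bin/env bash.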

Your second script should be modified to look like this:

#!/usr/bin/env bash
# catch_error.sh

count=0  # The number of times before failing
error=0  # assuming everything initially ran fine

while [ "$error" != 1 ]; do
    # running till non-zero exit

    # sending stdout to /tmp/msg and stderr to /tmp/error; the exit code is read from $? on the next line
    ./random_fail.sh 1>/tmp/msg 2>/tmp/error
    error=$?

    echo "$error"

    # updating the count
    count=$((count + 1))

done

echo "random_fail.sh failed!: $(cat /tmp/msg)"
echo "Error code: ${error}"
echo "Ran ${count} times, before failing"
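
To make the distinction between the exit code and the stderr text concrete, here is a minimal sketch. The function always_fail is a hypothetical stand-in for random_fail.sh that fails every time, and a mktemp directory is used instead of the fixed /tmp paths; both are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for random_fail.sh: always takes the failure path.
always_fail() {
    echo "Something went wrong"
    echo "The error was using magic numbers" >&2
    return 1
}

tmpdir=$(mktemp -d)   # scratch directory (assumption; the thread uses /tmp directly)

always_fail 1>"$tmpdir/msg" 2>"$tmpdir/error"
status=$?             # read right away: the next command would overwrite $?

stderr_text=$(cat "$tmpdir/error")

echo "exit status: $status"        # the number passed to exit/return
echo "stderr text: $stderr_text"   # whatever the script wrote to file descriptor 2

rm -rf "$tmpdir"
```

The file holds only the message text, while the number 1 shows up only in $?.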

CodePudding user response：

[ "$error" != 1 ] stays true unless random_fail.sh prints a lone digit 1 to stderr. As long as that doesn't happen, your script will keep looping. You could instead test whether anything has been written to stderr. There are several ways to do this:

printf '' >/tmp/error
while [[ ! -s /tmp/error ]]

error=
while (( ${#error} == 0 ))

error=
while [[ -z $error ]]
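
As a rough, runnable sketch of the first variant (the -s file test), the loop below uses a hypothetical function fail_on_third in place of random_fail.sh so that the outcome is deterministic. The function name, the calls counter and the mktemp directory are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for random_fail.sh: fails on the third call.
calls=0
fail_on_third() {
    calls=$((calls + 1))
    if (( calls == 3 )); then
        echo "Something went wrong"
        echo "The error was using magic numbers" >&2
        return 1
    fi
    echo "Everything went according to plan"
}

tmpdir=$(mktemp -d)
: >"$tmpdir/error"    # start with an empty stderr file

count=0
# -s is true when the file exists and is non-empty, so loop while it is empty
while [[ ! -s "$tmpdir/error" ]]; do
    fail_on_third 1>"$tmpdir/msg" 2>"$tmpdir/error"
    count=$((count + 1))
done

error=$(cat "$tmpdir/error")
echo "Stopped after ${count} runs: ${error}"
rm -rf "$tmpdir"
```

Note that this approach treats any stderr output as a failure, so a script that prints warnings to stderr while still exiting with 0 would end the loop early.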

CodePudding user response：

/tmp/error will always be either empty or will contain the line "The error was using magic numbers". It will never contain 0 or 1. If you want to know the exit value of the script, just check it directly:

if ./random_fail.sh 1>/tmp/msg 2>/tmp/error; then error=0; else error=1; fi

Or, you can do:

./random_fail.sh 1>/tmp/msg 2>/tmp/error
error=$?

But don't do either of those. Just do:

while ./random_fail.sh; do ...; done

As long as random_fail.sh (please read https://www.talisman.org/~erlkonig/documents/commandname-extensions-considered-harmful/ and stop naming your scripts with a .sh suffix) returns 0, the loop body will be entered. When it returns non-zero, the loop terminates.
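
Filled in, the loop-on-exit-status form could look something like the sketch below. fail_on_third is a hypothetical stand-in for random_fail.sh that fails on its third call, and the mktemp directory replaces the fixed /tmp paths; neither comes from the original scripts.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for random_fail.sh: fails on the third call.
calls=0
fail_on_third() {
    calls=$((calls + 1))
    if (( calls == 3 )); then
        echo "Something went wrong"
        echo "The error was using magic numbers" >&2
        return 1
    fi
    echo "Everything went according to plan"
}

tmpdir=$(mktemp -d)
count=0

# The command's exit status is the loop condition: 0 keeps looping.
while fail_on_third 1>"$tmpdir/msg" 2>"$tmpdir/error"; do
    count=$((count + 1))   # counts successful runs only
done

msg=$(cat "$tmpdir/msg")
err=$(cat "$tmpdir/error")
echo "Failed after ${count} successful runs: ${msg}"
echo "stderr: ${err}"
rm -rf "$tmpdir"
```

One caveat with this form: $? after done reflects the last command run in the loop body, not the failing condition, so if the exact exit code matters, the error=$? variant is the one to use.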