I am trying to detect whenever the following script (random_fail.sh
) fails --which happens rarely-- by running it inside a while loop in the second script (catch_error.sh
):
#!/usr/bin/env bash
# random_fail.sh
n=$(( RANDOM % 100 ))
if [[ n -eq 42 ]]; then
echo "Something went wrong"
>&2 echo "The error was using magic numbers"
exit 1
fi
echo "Everything went according to plan"
#!/usr/bin/env bash
# catch_error.sh
count=0 # The number of times before failing
error=0 # assuming everything initially ran fine
while [ "$error" != 1 ]; do
# running till non-zero exit
# writing the error code from the radom_fail script into /tmp/error
bash ./random_fail.sh 1>/tmp/msg 2>/tmp/error
# reading from the file, assuming 0 written inside most of the times
error="$(cat /tmp/error)"
echo "$error"
# updating the count
count=$((count 1))
done
echo "random_fail.sh failed!: $(cat /tmp/msg)"
echo "Error code: $(cat /tmp/error)"
echo "Ran ${count} times, before failing"
I was expecting that the catch_error.sh will read from /tmp/error and come out of the loop once a particular run of random_fail.sh exits with 1.
Instead, the catch script seems to be running forever. I think this is because the error code is not being redirected to the /tmp/error file at all.
Please help.
CodePudding user response:
You aren't catching the error code in the proper/usual manner. Also, no need to prefix the execution with the "bash" command, when it already contains the shebang. Lastly, curious why you don't simply use #!/bin/bash instead of #!/usr/bin/env bash .
Your second script should be modified to look like this:
#!/usr/bin/env bash
# catch_error.sh
count=0 # The number of times before failing
error=0 # assuming everything initially ran fine
while [ "$error" != 1 ]; do
# running till non-zero exit
# writing the error code from the radom_fail script into /tmp/error
./random_fail.sh 1>/tmp/msg 2>/tmp/error
error=$?
echo "$error"
# updating the count
count=$((count 1))
done
echo "random_fail.sh failed!: $(cat /tmp/msg)"
echo "Error code: ${error}"
echo "Ran ${count} times, before failing"
CodePudding user response:
[ "$error" != 1 ]
is true if random_fail.sh
prints a lone digit 1
to stderr. As long as this doesn't happen, your script will loop. You could instead test whether there has been written anything to stderr. There are several possibilities to achieve this:
printf '' >/tmp/error
while [[ ! -s /tmp/error ]]
or
error=
while (( $#error == 0 ))
or
error=
while [[ -z $error ]]
CodePudding user response:
/tmp/error
will always be either empty or will contain the line "The error was using magic numbers". It will never contain 0 or 1. If you want to know the exit value of the script, just check it directly:
if ./random_fail.sh 1>/tmp/msg 2>/tmp/error; then error=1; else error=0; fi
Or, you can do:
./random_fail.sh 1>/tmp/msg 2>/tmp/error
error=$?
But don't do either of those. Just do:
while ./random_fail.sh; do ...; done
As long as random_fail.sh
(please read https://www.talisman.org/~erlkonig/documents/commandname-extensions-considered-harmful/ and stop naming your scripts with a .sh
suffix) returns 0, the loop body will be entered. When it returns non-zero, the loop terminates.