How to capture error messages from a program that fails only outside the terminal?


On a Linux server, I have a script that works fine when I start it from the terminal, but fails when it is started and then detached by another process. So there is probably a difference in the script's environment that needs fixing.

The trouble is, the other process integrating that script does not provide access to its error messages when it fails. What is an easy (and ideally generic) way to see the output of such a script when it is failing?

Let's assume I have no easy way to change the code of the application calling this script. The failure happens right at the start of the script's run, so there is not enough time to manually attach to it with strace to see its output. An automated solution to attach to it, maybe using a shell script, would be great.

(The specifics should not matter, but for what it's worth: the failing script is the backup script of Discourse, a widespread open source forum software. Discourse and this script are written in Ruby.)

CodePudding user response:

You can use a small shell script that (1) busy-waits until it sees the target process, and then (2) immediately attaches to it with strace and prints its output to the terminal.

#!/bin/sh

# Adapt to a regex that matches only your target process' full command.
name_pattern="bin/ruby.*spawn_backup_restore.rb"

# Wait for a process to start, based on its name, and capture its PID.
# Inspiration and details: https://unix.stackexchange.com/a/410075
pid=
while [ -z "$pid" ] ; do
    pid="$(pgrep --full "$name_pattern" | head -n 1)"

    # Set delay for next check to 1ms to try capturing all output.
    # Remove completely if this is not enough to capture from the start.
    sleep 0.001
done

echo "target process has started, pid is $pid"

# Print all stdout and stderr output of the process we found.
# Source and explanations: https://unix.stackexchange.com/a/58601
strace -p "$pid" -s 9999 -e write
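
A possible way to use this, assuming the watcher above is saved as watch-backup.sh (the name is arbitrary):

# In one terminal, start the watcher. Attaching strace to another user's
# process typically requires root.
sudo ./watch-backup.sh

# Then trigger the backup from the Discourse admin interface as usual.
# The watcher attaches as soon as spawn_backup_restore.rb starts and
# prints every write() call it makes, including its stdout and stderr output.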

CodePudding user response:

The idea is to substitute the original script with a wrapper that calls the original script and saves its stdout and stderr to files. The wrapper may look like this:

#!/bin/bash

exec /path/to/original/script "$@" 1> >(tee /tmp/out.log) 2> >(tee /tmp/err.log >&2)

1> >(tee /tmp/out.log) redirects stdout to the input of tee /tmp/out.log, which runs in a subshell; tee passes the data through to stdout but saves a copy to /tmp/out.log.

2> >(tee /tmp/err.log >&2) redirects stderr to the input of tee /tmp/err.log, which runs in a subshell; tee /tmp/err.log >&2 passes the data through to stderr but saves a copy to /tmp/err.log.

If the script is invoked multiple times, you may want to append stdout and stderr to the files instead of overwriting them; use tee -a in that case, as in the sketch below.
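
For example, an append-mode variant of the same wrapper (same hypothetical log paths as above):

#!/bin/bash

# Same wrapper, but appending to the log files across invocations.
exec /path/to/original/script "$@" 1> >(tee -a /tmp/out.log) 2> >(tee -a /tmp/err.log >&2)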

The remaining problem is how to force the caller to execute the wrapper script instead of the original one.

If the caller invokes the script in a way that looks it up via PATH, you can put the wrapper script into a separate directory and provide a modified PATH to the caller. For example, if the script's name is script, put the wrapper at /some/dir/script and run the caller as

$ PATH="/some/dir:$PATH" caller

The /path/to/original/script in the wrapper must be an absolute path.
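
As a sketch of the whole setup, assuming the wrapper above is saved as wrapper.sh and the caller looks the script up via PATH under the name script:

# Install the wrapper under the name the caller expects, in its own directory.
mkdir -p /some/dir
cp wrapper.sh /some/dir/script
chmod +x /some/dir/script

# Start the caller with that directory first in PATH.
PATH="/some/dir:$PATH" caller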

If the caller invokes the script by a specific path, you can instead rename the original script, e.g. to original-script, and name the wrapper script. In this case the wrapper should call /path/to/original/original-script.
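
A minimal sketch of that rename, assuming the original lives at /path/to/original/script and the wrapper is saved as wrapper.sh somewhere else:

cd /path/to/original

# Move the original aside and install the wrapper under its old name.
mv script original-script
cp /somewhere/wrapper.sh script
chmod +x script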

Another problem may arise if the script behaves differently depending on the name it is called by. In that case bash's exec -a ... may be needed, as sketched below.
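
With bash's exec -a you can pass the original name as the zeroth argument; a sketch of the wrapper's exec line for the rename setup above:

#!/bin/bash

# Run the renamed original, but report "script" as its zeroth argument so it
# still sees the name it expects, while logging stdout and stderr as before.
exec -a script /path/to/original/original-script "$@" 1> >(tee /tmp/out.log) 2> >(tee /tmp/err.log >&2)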
