In my startup script for app on vm I have:
exec /usr/java/latest/bin/java $JAVA_OPTS -jar $JAR_FILE >> /logs/$APP_NAME/startup.out 2>&1 &
This causes issues with the size of the startup.out file, as it redirects all of stdout and stderr. I don't want that, because my app already creates a log file for each day and truncates it. This means my startup.out file duplicates everything from the app beyond the 'startup' stage.
I want to only redirect startup logs (let's say 3M) to a designated file in my script. I was trying something like:
exec /usr/java/latest/bin/java $JAVA_OPTS -jar $JAR_FILE 2>&1 | head -c3M >> /logs/$APP_NAME/startup.out &
But head does not write the logs to the file until it has collected the full 3 megabytes.
How can I instantly save to file the first 3 megabytes of logs from the app?
CodePudding user response:
The big problem with trying to use a pipeline where your application's output is piped to head is that when head exits after reading the requested amount of data, you'll get a (usually fatal) SIGPIPE the next time your app tries to write to said output - because there's nothing listening any more.
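This is easy to demonstrate from a shell (a quick sketch, assuming bash for the PIPESTATUS array; the 141 exit status is 128 plus SIGPIPE's signal number 13):

```shell
# `yes` writes lines forever; `head` reads 10 bytes and exits.
# The next write by `yes` hits a closed pipe, and `yes` dies from SIGPIPE.
yes | head -c 10 > /dev/null
echo "${PIPESTATUS[0]}"   # exit status of `yes`; prints 141 (128 + 13)
```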
My idea is to whip up a small program that can be used in place of head in the pipeline, reading from its standard input and writing to the real log file until the requested amount of data has been transferred. At that point it continues to read, but just discards the input instead of appending it to the real log. That way your application never gets a premature SIGPIPE.
The following Linux-specific C program will do that efficiently:
#define _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
int main(int argc, char **argv) {
    if (argc != 3) {
        fprintf(stderr, "Usage: %s SIZE-IN-MB LOGFILE\n", argv[0]);
        return EXIT_FAILURE;
    }

    // Make sure standard input is a pipe
    struct stat sb;
    if (fstat(STDIN_FILENO, &sb) < 0) {
        fprintf(stderr, "Unable to stat standard input: %s\n", strerror(errno));
        return EXIT_FAILURE;
    }
    if (!S_ISFIFO(sb.st_mode)) {
        fprintf(stderr, "Standard input must be a pipe!\n");
        return EXIT_FAILURE;
    }

    // Calculate how many bytes to log
    size_t bytes = strtoul(argv[1], NULL, 10) * 1024 * 1024;

    int null_fd = open("/dev/null", O_WRONLY);
    if (null_fd < 0) {
        fprintf(stderr, "Unable to open /dev/null: %s\n", strerror(errno));
        return EXIT_FAILURE;
    }

    // The real log file
    int log_fd = open(argv[2], O_WRONLY | O_CREAT, 0644);
    if (log_fd < 0) {
        fprintf(stderr, "Unable to open log file '%s': %s\n", argv[2], strerror(errno));
        return EXIT_FAILURE;
    }
    if (lseek(log_fd, 0, SEEK_END) < 0) {
        fprintf(stderr, "Unable to seek to end of log file: %s\n", strerror(errno));
        return EXIT_FAILURE;
    }

    // Use splice(2) to efficiently move data from the stdin pipe to the log
    while (bytes > 0) {
        ssize_t transferred = splice(STDIN_FILENO, NULL, log_fd, NULL, bytes, SPLICE_F_MORE);
        if (transferred < 0) {
            fprintf(stderr, "Unable to copy data to log: %s\n", strerror(errno));
            break;
        } else if (transferred == 0) { // End of file
            return 0;
        } else {
            bytes -= transferred;
        }
    }
    close(log_fd); // Don't need this any more

    // Now loop until the main app exits, moving data it writes to /dev/null
    // to get rid of it.
    while (1) {
        ssize_t transferred = splice(STDIN_FILENO, NULL, null_fd, NULL, 4096, SPLICE_F_MORE);
        if (transferred < 0) {
            fprintf(stderr, "Unable to dump excess log data: %s\n", strerror(errno));
            return EXIT_FAILURE;
        } else if (transferred == 0) {
            // The writer exited, so shall we.
            return 0;
        }
    }
}
Example usage:
$ gcc -o logcopy -O -Wall -Wextra logcopy.c
# real.log will grow to around 1 MB and then stop, no matter how long
# you let the command below run
$ yes | ./logcopy 1 real.log
Your usage might be something like
exec /usr/java/latest/bin/java $JAVA_OPTS -jar "$JAR_FILE" 2>&1 | ./logcopy 3 "/logs/$APP_NAME/startup.out" &
CodePudding user response:
Let's say our program is a.out. The rough idea is to pipe it to head, then pipe that to tee (using -a if appending to the log).
For example, the following will write the first 2 lines:
./a.out | head -n 2 | tee startup.out
The problem with this simple solution, as noted by @Shawn, is that when head terminates, all earlier stages in the pipeline (including a.out) will receive a SIGPIPE and terminate prematurely.
So, a workaround is to force the subshell that ran head to stay alive until the first process terminates. Let's make a FIFO named complete.fifo and write a datestamp to it when a.out terminates. The head subshell will then wait (a blocking read) until the datestamp is written.
mkfifo complete.fifo
(./a.out && date >complete.fifo) | \
(head -n 2 && exec 5<complete.fifo && read -u 5 completed_at) | \
tee startup.out
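A quick self-contained run of the same pattern, substituting a short printf for a.out (file names here are placeholders, and bash is assumed for read -u):

```shell
mkfifo complete.fifo
(printf '%s\n' one two three four five && date > complete.fifo) | \
  (head -n 2 && exec 5<complete.fifo && read -u 5 completed_at) | \
  tee startup.out
# startup.out now holds only the first two lines ("one" and "two"),
# and the writer finished normally instead of dying from SIGPIPE.
rm complete.fifo
```

Note that date only runs after the writer finishes, so the open of the FIFO for reading unblocks exactly when the first process has terminated.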
CodePudding user response:
exec stdbuf -oL /usr/java/latest/bin/java $JAVA_OPTS -jar $JAR_FILE 2>&1 |
{ head -c 3M >> /logs/$APP_NAME/startup.out; cat > /dev/null; }
There are two problems to overcome:

1. When stdout is not a terminal (e.g. redirected to a file or pipe), glibc switches from line buffering to full buffering, where output is buffered up and written out in 4 KB chunks. It's done for efficiency, but it means that if the output is generated slowly, you can't see the lines as they are printed. You can use stdbuf to change this behavior from the outside: stdbuf -oL overrides the default and forces line buffering.

2. When head is finished, your program will receive a SIGPIPE indicating that the pipeline is closed. Normally this is helpful: it tells programs to stop if nobody is reading their output. But if you want your program to keep running after head exits, you'll need to either ignore or prevent SIGPIPE. You can keep the pipeline running by calling cat after head: cat will soak up the excess output, keeping the pipeline alive after head exits.
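A minimal sketch of the capping behavior, using seq as a stand-in for the Java process (capped.out is a placeholder file name, and GNU head's -c option is assumed):

```shell
# head copies the first 1 KB to capped.out, then cat drains the rest,
# so seq runs to completion without ever receiving SIGPIPE.
seq 1 200000 | { head -c 1024 > capped.out; cat > /dev/null; }
wc -c < capped.out   # prints 1024
```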