Bash cat/grep prevent real-time update

I am running many separate experiments on a server. Each experiment automatically redirects its stdout to an .out file that I specify. I need to determine whether an experiment has failed so I can restart it. A failure is caused by a failed module installation, so I use the following command:

for file in $(find . -name '*.out'); do grep -l "No module" "$file"; done

This should just print the filenames of the failed runs. However, the scripts are Python scripts run with python -u (which ensures that output is flushed as it becomes available). Each script displays a tqdm progress bar, which effectively means these files are constantly being updated. grep sometimes hangs on such files, and I'm not sure why, but it happens often enough that it usually hangs after grepping just 8-10 files. I've tried

for file in $(find . -name '*.out'); do tail -n 2 "$file" | grep "No module" && echo "$file"; done

(which will need some adjustments to produce identical output), but it suffers from the same problem: even tail -n 2 struggles when a file is currently being updated. Is there any way to have bash take a snapshot of the file and ignore incoming updates, since I just need to look at the second-to-last line (which contains the error)?
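
The closest thing I can think of is copying each file to a scratch location first, so that grep only ever sees a static copy. An untested sketch (/tmp/snapshot.out is just a placeholder path):

for file in $(find . -name '*.out'); do
    cp "$file" /tmp/snapshot.out                          # freeze the current contents
    grep -q "No module" /tmp/snapshot.out && echo "$file" # search the frozen copy
done

but I'm hoping there is a cleaner way.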

CodePudding user response:

Avoid using tqdm if the output is not a tty:

#! /usr/bin/env python3

import os
from tqdm import tqdm

# When stdout is not a terminal, shadow tqdm with a no-op wrapper
if not os.isatty(1):
    def tqdm(iterable, *args, **kwargs):
        return iterable

# your script here ...

CodePudding user response:

Assuming your scripts are redirecting stderr (fd 2), as well as stdout (fd 1), to these log files, you could disable tqdm on non-terminal (i.e. file) output:

from sys import stderr
from tqdm import tqdm

# tqdm writes its bar to stderr by default, so check fd 2
for i in tqdm(..., disable=not stderr.isatty()):
    ...

Alternatively, just ramp up the minimum interval between progress updates:

from tqdm import tqdm

# mininterval is in seconds (default 0.1); raising it reduces redraw frequency
for i in tqdm(..., mininterval=1):
    ...

This would mean fewer lines for grep to parse.
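
Note also that tqdm redraws its bar with carriage returns (\r) rather than newlines, so a redirected log can accumulate one enormous "line", which may be why grep appears to hang in the first place. If so, translating the carriage returns before searching should sidestep it; a rough, untested sketch along the lines of your original loop:

for file in $(find . -name '*.out'); do
    tr '\r' '\n' < "$file" | grep -q "No module" && echo "$file"
done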
