Piping the same output through one or several grep commands on condition

Time:03-05

I am currently writing a bash script that filters the output of my LaTeX compilations so that only what I find relevant is printed on the console. As I would like this script to be extremely thorough, I set up different options to toggle different output filters at the same time, depending on the nature of the information given by the compilation (fatal error, warning, over/underfull h/vbox...).

For those who may not know, we often need to perform several compilations in a row to get a full LaTeX document with correct labels, page numbering, index, table of contents..., plus other commands like bibtex or makeglossaries for the bibliography and, well, glossaries. I therefore have a loop that executes everything and stops if a fatal error is encountered, but should continue if it is only a minor warning.

My main command line pipes the pdflatex output through an inverted grep that finds error lines (starting with !). This way, the script stops only if grep found a fatal error.

: | pdflatex --halt-on-error $@ | { ! grep --color=auto '^!.*' -A200; }
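To make the role of the inverted grep concrete, here is a minimal sketch that runs without LaTeX installed. The names fake_compile and check_errors are hypothetical and only stand in for pdflatex and the real pipeline; the log lines are made up.

```shell
#!/usr/bin/env bash
# fake_compile is a hypothetical stand-in for pdflatex: it prints a few
# log-like lines, plus a fatal error line when called with "fail".
fake_compile() {
    printf '%s\n' 'This is a fake log' 'Overfull \hbox (3.0pt too wide)'
    if [[ "$1" == fail ]]; then
        printf '%s\n' '! Undefined control sequence.' 'l.12 \foo'
    fi
}

# The braces group "! grep" so the negation applies to grep alone.
# The pipeline succeeds only when grep finds no line starting with "!".
check_errors() {
    fake_compile "$1" | { ! grep -A200 '^!'; }
}

check_errors ok   >/dev/null && echo "clean run: continue"
check_errors fail >/dev/null || echo "fatal error: stop"
```

The two calls at the end print "clean run: continue" and "fatal error: stop", which is the behaviour the compilation loop relies on.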

But when I activate any other filter (e.g. '*.full.*' for over/underfull lines), I need to be able to continue compiling, so that I can judge whether it really needs correcting (hey, sometimes, underfull lines are just not that ugly...).

That means my grep command cannot be inverted as in the first line, and I cannot (or don't know how to) use the same grep with a different regex. Note that if I use a different grep, it must also read from the pdflatex output, so I cannot simply pipe it after the above snippet.

To sum up, it should roughly look like this:

   pdflatex --> grep for fatal errors --> if more filters, grep for those filters
   --> pass to next step

I came up with several attempts that did not work properly:

This one works only if I want to compile WITH the warnings. Looking only for errors does not work.

latex_compilation() {
: | pdflatex --halt-on-error $@ | tee >({ ! grep --color=auto '^!.*' -A200; }) >({ grep --color=auto "$warnings_filter" -A5; }) >/dev/null
}


latex_compilation() {
: | pdflatex --halt-on-error $@ | tee >({ ! grep --color=auto '^!.*' -A200; }) >/dev/null | ({ grep --color=auto "$warnings_filter" -A5; })
}

or even desperately

latex_compilation() {
: | pdflatex --halt-on-error $@ |
if [[ "$warnings_on" = true ]]; then
    { grep --color=auto "$warnings_filter" -A5; }
fi
{ ! grep --color=auto '^!.*' -A200; }
}

This one would work, but it uses 2 compilation processes for each step (you could easily go up to 7 or 8 compilation steps for a big and complex document). It should be avoided if possible.

latex_compilation() {
if [[ "$warnings_on" = true ]]; then
    : | pdflatex --halt-on-error $@ | \
    { grep --color=auto "$warnings_filter" -A5; }
fi
: | pdflatex --halt-on-error $@ | \
{ ! grep --color=auto '^!.*' -A200; }
}

I spent hours looking for solutions online, but haven't found any yet. I really hope this is clear enough, because it is a mess to sum up, let alone to write. You can find the relevant code here if needed for clarity.

CodePudding user response:

This one would work but uses 2 compilation processes

So let's use one.

latex_compilation() {
   local tmp
   tmp=$(pdflatex ... <&-)
   if [[ "$warnings_on" = true ]]; then
       grep --color=auto "$warnings_filter" -A5 <<<"$tmp"
   fi
   ! grep --color=auto '^!.*' -A200 <<<"$tmp"
}
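As a runnable illustration of the capture-once approach, the sketch below replaces pdflatex with a hypothetical stand-in, fake_compile, and hard-codes warnings_on and warnings_filter. All three are assumptions made so the snippet runs anywhere; the loop at the end mimics the multi-pass compilation described in the question.

```shell
#!/usr/bin/env bash
# fake_compile is a hypothetical stand-in for "pdflatex ... <&-".
fake_compile() {
    printf '%s\n' 'Underfull \hbox (badness 10000)' 'some context line'
    [[ "$1" == fail ]] && printf '%s\n' '! Emergency stop.'
    return 0
}

warnings_on=true
warnings_filter='full'   # assumed filter: over/underfull box messages

latex_compilation() {
    local tmp
    tmp=$(fake_compile "$1")                  # compile once, keep the log
    if [[ "$warnings_on" = true ]]; then
        grep -A5 -- "$warnings_filter" <<<"$tmp"   # informational only
    fi
    ! grep -A200 '^!' <<<"$tmp"               # last command sets the status
}

# Stop the pass loop at the first fatal error.
for pass in ok ok fail ok; do
    latex_compilation "$pass" || { echo "stopped on pass: $pass"; break; }
done
```

Because the whole log is held in a variable, nothing is printed until the compilation has finished. That is the trade-off against the streaming variants.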

Or you can do that asynchronously, by parsing the output in your chosen programming language. For Bash see https://mywiki.wooledge.org/BashFAQ/001 :

line_is_warning() { .... }
latex_compilation() {
   local outputlines=0 failed=0 line
   while IFS= read -r line; do
       if "$warnings_on" && line_is_warning "$line"; then
           # print this line plus the 5 lines after it (like grep -A5),
           # without shortening a longer context that is already running
           ((outputlines < 6)) && outputlines=6
       fi
       if [[ "$line" =~ ^! ]]; then
           failed=1
           # print this line plus the 200 lines after it (like grep -A200)
           outputlines=201
       fi
       if ((outputlines != 0)); then
           ((outputlines--))
           printf "%s\n" "$line"
       fi
   done < <(pdflatex ... <&-)
   if ((failed)); then return 1; fi
}

But Bash will be extremely slow. Consider using AWK or Python or Perl.
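Following the AWK suggestion, one possible single-pass filter is sketched below. It is not the answerer's code: filter_log, its two parameters and the sample log lines are assumptions chosen for illustration, and the warning regex is assumed to be simple (no backslashes).

```shell
#!/usr/bin/env bash
# filter_log (hypothetical helper) reads a log on stdin in a single pass.
#   $1: "true" to also show warnings, anything else to show errors only
#   $2: regex for warning lines (assumed simple, e.g. "full")
# Error lines start with "!" and are printed with up to 200 following lines;
# warning lines are printed with up to 5 following lines.
# The exit status is 1 if an error line was seen, 0 otherwise.
filter_log() {
    awk -v warnings_on="$1" -v warn_re="$2" '
        /^!/ { failed = 1; remaining = 201 }
        warnings_on == "true" && $0 ~ warn_re && remaining < 6 { remaining = 6 }
        remaining > 0 { print; remaining-- }
        END { exit (failed ? 1 : 0) }
    '
}

# Demo on made-up log lines instead of real pdflatex output.
printf '%s\n' 'chatter' 'Overfull \hbox (2pt too wide)' 'detail' |
    filter_log true 'full' && echo "no fatal error"
```

In the real script this stage would sit directly after the compiler, for example pdflatex --halt-on-error "$@" <&- | filter_log "$warnings_on" "$warnings_filter". Note that awk -v interprets backslash escapes, so a filter containing backslashes would need extra care.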

looking for solutions online

Exactly, you have to write a solution yourself, for your specific requirements.

This one works only if I want to compile WITH the warnings. Looking only for errors does not work.

You can write whole code blocks inside >( ... ), and basically anywhere else. The exit status of a pipeline is the exit status of its rightmost command (unless set -o pipefail is enabled), so put the command that may fail last in the pipeline.

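latex_compilation() {
    pdflatex --halt-on-error "$@" <&- |
    tee >(
       if "$warnings_on"; then
         # send matches to stderr; stdout here is the pipe into the final grep
         grep --color=auto "$warnings_filter" -A5 >&2
       else
          cat >/dev/null
       fi
    ) |
    # bash does not accept "!" right after "|", so group the negated grep
    { ! grep --color=auto '^!.*' -A200; }
}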
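The exit-status rule mentioned above can be checked in isolation. The sketch below uses only the false and true builtins; pipeline_status_demo is a hypothetical name, and the output assumes pipefail is not already enabled in the calling shell.

```shell
#!/usr/bin/env bash
pipeline_status_demo() {
    # Default: only the rightmost command's status is reported.
    false | true
    echo "default: $?"

    # PIPESTATUS (a bash array) keeps the status of every stage.
    false | true
    echo "stages: ${PIPESTATUS[*]}"

    # With pipefail, a failing stage anywhere makes the pipeline fail.
    # The subshell keeps the option from leaking into the caller.
    local status=0
    ( set -o pipefail; false | true ) || status=$?
    echo "pipefail: $status"
}

pipeline_status_demo
```

This prints "default: 0", "stages: 1 0" and "pipefail: 1". It shows why the grep that decides success or failure has to be the last stage when pipefail is off.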

CodePudding user response:

I suggest using awk filtering patterns.

Read more about awk filtering patterns here.

With awk you can build complex filtering logic: ! = not, && = and, || = or.

For example, suppose you have 3 filtering RegExp patterns: Pattern_1, Pattern_2, Pattern_3.

Example 1

You can combine all 3 patterns into one filter with the following command:

awk '/Pattern_1/ && /Pattern_2/ && /Pattern_3/' scanned_file1 scanned_file2 ...

This prints only the lines that match all 3 patterns.

Example 2

You can build a combined inverse filter of all 3 patterns with the following command:

awk '!/Pattern_1/ && !/Pattern_2/ && !/Pattern_3/' scanned_file1 scanned_file2 ...

This prints the lines that match none of the 3 patterns.

Example 3

You can exclude Pattern_1 while matching Pattern_2 or Pattern_3:

awk '!/Pattern_1/ && (/Pattern_2/ || /Pattern_3/)' scanned_file1 scanned_file2 ...

This prints the lines that do not match Pattern_1 but do match Pattern_2 or Pattern_3.
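Applied to the LaTeX case from the question, the same pattern logic might look like the sketch below. The log lines are made up for illustration, and sample_log is a hypothetical helper that stands in for real pdflatex output.

```shell
#!/usr/bin/env bash
# sample_log prints made-up lines resembling a LaTeX log.
sample_log() {
    printf '%s\n' \
        'Overfull \hbox (2.5pt too wide) in paragraph at lines 10--12' \
        'Underfull \vbox (badness 10000) has occurred' \
        'LaTeX Warning: Label(s) may have changed.' \
        '! Undefined control sequence.' \
        'Output written on main.pdf (3 pages).'
}

# Box warnings (Overfull or Underfull) that are not about vboxes.
sample_log | awk '!/vbox/ && (/Overfull/ || /Underfull/)'

# Lines that match none of the three patterns.
sample_log | awk '!/full/ && !/Warning/ && !/^!/'
```

The first command prints only the Overfull \hbox line, and the second prints only the "Output written" line.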
