Linux find xargs command grep showing path and filename

Time:05-07

find /folder/202205??/ -type f | xargs head -50| grep '^Starting'

There are folders named 20220501, 20220502, 20220503 and so on. This command searches the first 50 lines of every file under '/folder/202205??/' and shows the lines beginning with the text "Starting".

The output does not include the path and filename of the files matched by the grep command. How can I get the path, the filename and the matched line with a simple command?

CodePudding user response:

The main problem here is that head doesn't tag each line with the file it came from (given several files, it only prints a "==> name <==" header before each one), so grep can pick out the matching lines but not show the file name or path. awk can do both the matching and the trimming to 50 lines, and you can control exactly what gets printed for each match. So something like this:

find /folder/202205??/ -type f -exec awk '/^Starting/ {print FILENAME ": " $0}; (FNR>=50) {nextfile}' {} +

Explanation: the first clause in the awk script prints matching lines (prefixed by the FILENAME, which'll actually include the path as well), and the second skips to the next file when it gets to line 50. Also, I used find's -exec ... feature instead of xargs, just because it's a bit cleaner (and won't run into trouble with weird filenames). Terminating the -exec command with + instead of \; makes it run the files in batches (like xargs) rather than one at a time.
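To try this without touching real data, here is a small self-contained sketch. The temporary directory, the file names a.log and b.log and their contents are made up for illustration; only the find/awk line is the command from above. It assumes an awk that supports nextfile (gawk, recent mawk and BusyBox awk do).

```shell
#!/bin/sh
# Hypothetical sandbox: two dated folders with one small file each.
tmp=$(mktemp -d)
mkdir -p "$tmp/20220501" "$tmp/20220502"
printf 'Starting job A\nother line\n' > "$tmp/20220501/a.log"
# In b.log the marker sits on line 51, just outside the 50-line window.
{ seq 1 50; echo 'Starting too late'; } > "$tmp/20220502/b.log"

# Same command as above, pointed at the sandbox instead of /folder.
out=$(find "$tmp"/202205??/ -type f -exec awk '/^Starting/ {print FILENAME ": " $0}; (FNR>=50) {nextfile}' {} +)
printf '%s\n' "$out"

rm -rf "$tmp"
```

Only the line from a.log should show up, prefixed with its full path; the match in b.log is skipped because awk has already moved on to the next file by then.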

CodePudding user response:

A relatively portable awk-based solution that provides for

  1. built-in realpath variant detection,

  2. shell-safe single-quotation (and escaping) for filenames, and

  3. grep-like output format:

    • file-full-realpath:line-number:[matched line contents..]

————————————————————————————————————————

  gfind 202…………/ -mindepth 1 \
                 -type f \
                 -not -empty \
                 -not -name ".*" -print0 |

  xargs -0 -n 20 -P 16 dash -c 'nice [mg]awk -e '\''

    # gawk profile, created Fri May  6 23:26:31 2022

    BEGIN {
        __=substr("grealpath", 2^0^system("exit \140 which "\
                      "grealpath | grep -m 1 -ce .   \140 "))
        FS="^Starting"
    }

    50 < FNR {
        nextfile
    }

    FNR == 1 {
        _ = getpath(FILENAME, __)
    }

    -NF < -sub("^",(_)":"(FNR)":",$0) {
        print
    }

    function getpath(_,____,__,___)
    {
        return "-"==_ \
            ? "/dev/stdin" \
            : substr((___=RS)*(RS="\0")*gsub(/\47/,"\47\134&\47",_),
                             \
                    ((__=(____)" -zePq \47"(_)"\47 ")|getline _)~"",
                      __*close(__)^(RS=___))(_)
    }'\'' "${@}" ' _
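For comparison, the same grep-like output format (path:line-number:line) can be sketched much more simply. This is not the script above, just a reduced illustration: instead of calling realpath per file it resolves the start directory once, so symlinks inside the tree are not resolved. The sandbox directory and a.log are hypothetical.

```shell
#!/bin/sh
# Hypothetical sandbox standing in for the real folders.
tmp=$(mktemp -d)
mkdir -p "$tmp/20220501"
printf 'header\nStarting job A\n' > "$tmp/20220501/a.log"

# Resolve the start directory once, so every FILENAME awk sees is absolute.
start=$(cd "$tmp" && pwd -P)

# Skip dot-files; print path:line-number:line for matches in the first 50 lines.
out=$(find "$start" -type f ! -name '.*' -exec awk '
    FNR > 50    { nextfile }
    /^Starting/ { print FILENAME ":" FNR ":" $0 }
' {} +)
printf '%s\n' "$out"

rm -rf "$tmp"
```

The trade-off is that this runs one awk per batch of files with no parallelism, which is usually fine unless the tree is very large.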

CodePudding user response:

I am sure this is not perfect, but it might give you some new ideas.

Be aware that filenames containing special characters such as newlines or colons are not handled correctly by this solution!

while IFS=: read -r -a a; do [[ ${a[1]} -gt 50 ]] && break; printf "%s\n" "${a[0]}"; done < <( grep -rnH '^Starting' /folder/202205??/ | sort -t":" -k2,2n )

This bash snippet is written on one line; formatted for readability, it looks like this:

while IFS=: read -r -a a; do
  [[ ${a[1]} -gt 50 ]] && break
  printf "%s\n" "${a[0]}"
done < <( grep -rnH '^Starting' /folder/202205??/ | sort -t":" -k2,2n )

grep can search directories recursively with -r, and shows the line number with -n and the filename with -H. The output is sorted by line number, so the loop can stop at the first line number greater than 50. Until then it prints the filename.

Depending on what you want, you can output the line number and/or the string found.
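As a rough sketch of that variation, the loop below prints the filename, the line number and the matched text. It is written for plain sh by piping into the loop rather than using bash process substitution, and it reads into three named variables so that colons inside the matched text end up in the last one. The sandbox files are hypothetical, and the colon caveat for filenames still applies.

```shell
#!/bin/sh
# Hypothetical sandbox: a.log matches on line 1, b.log only on line 60.
tmp=$(mktemp -d)
mkdir -p "$tmp/20220501"
printf 'Starting: phase 1\n' > "$tmp/20220501/a.log"
{ seq 1 59; echo 'Starting late'; } > "$tmp/20220501/b.log"

out=$(grep -rnH '^Starting' "$tmp"/202205??/ | sort -t: -k2,2n |
  while IFS=: read -r file line text; do
    # Input is sorted by line number, so the first value above 50 ends the loop.
    [ "$line" -gt 50 ] && break
    printf '%s:%s:%s\n' "$file" "$line" "$text"
  done)
printf '%s\n' "$out"

rm -rf "$tmp"
```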

If you feed the result into another tool that can handle the line number itself, the plain grep may be a better starting point:

grep -rnH '^Starting' /folder/202205??/

I am sure the output can be piped into something like awk, which could drop every line whose second field (the line number) is greater than 50. Unfortunately I am no awk expert.
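One possible shape of such a filter, offered as a sketch rather than a tested production command: split each grep output line on ":" and keep it only when the second field is at most 50. It assumes no path contains a colon, since that would shift the fields. The sandbox files are made up for the demo.

```shell
#!/bin/sh
# Hypothetical sandbox: a.log matches on line 2, b.log only on line 60.
tmp=$(mktemp -d)
mkdir -p "$tmp/20220501"
printf 'header\nStarting job A\n' > "$tmp/20220501/a.log"
{ seq 1 59; echo 'Starting late'; } > "$tmp/20220501/b.log"

# Field 1 is the path, field 2 the line number, the rest is the matched line.
out=$(grep -rnH '^Starting' "$tmp"/202205??/ | awk -F: '$2 <= 50')
printf '%s\n' "$out"

rm -rf "$tmp"
```

Unlike the awk-only approach, grep still reads every file to the end here; the filter only trims the output afterwards.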
