Divide output of wc-l by 4 in a for loop in bash?

Time:03-24

I'm trying to write a for loop that unzips fastq.gz files that contain R1 in the file name, determines # of lines in each file, and divides # of lines by 4. Ideally I could also write this into a txt file with two columns (file name and # of lines/4).

This loop unzips the R1 fastq files and determines the number of lines in each file, but it does not divide by 4 (or save the output to a txt file).

for i in $(ls ./R1); do gzcat ./$i | wc -l; done

Other posts on here suggest using bc to divide in bash, but I haven't been able to integrate this into a loop.

CodePudding user response:

Never use for i in $(ls anything); see Bash Pitfalls #1. Your loop will fail for filenames that contain spaces or other special characters. In most circumstances you can simply iterate over the files with for i in path/*; do ..., quoting "$i" wherever it is used. If you need find instead (for example, to recurse into subdirectories or to select only regular files), read its output as while IFS= read -r -d '' name; do ... done < <(find path -type f -name "*.gz" -print0) so that names containing the '\n' character are handled as well (note that process substitution, < <(...), and read -d '' are bash features; in a POSIX shell, pipe find into the loop instead, and keep in mind that a plain read -r splits names at newlines).
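As a minimal sketch of the find-based loop, assuming bash: the function name count_reads is hypothetical, and gzip -dc is used on the assumption that it behaves like gzcat on your system. The -print0 and read -d '' pair delimits names with NUL bytes so that spaces and newlines in filenames survive.

```shell
# count_reads DIR: hypothetical helper, prints "name<TAB>lines/4" for each .gz file under DIR
count_reads() {
  local dir=$1 name nlines
  # find emits NUL-terminated names; read -d '' consumes them one at a time
  while IFS= read -r -d '' name; do
    nlines=$(gzip -dc "$name" | wc -l)          # assumption: gzip -dc in place of gzcat
    printf '%s\t%d\n' "$name" "$((nlines / 4))" # integer division, rounds down
  done < <(find "$dir" -type f -name '*.gz' -print0)
}
```

Calling count_reads R1 > newfile would write the two columns to newfile; find does not guarantee any particular order, so pipe the output through sort if the order matters.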

Next, to write the name and the number of lines / 4 to a new file, wrap the entire loop in a command group between { ... } and redirect all of its output to the new file at once.

You should also add checks that skip any directory whose name ends in gz, as well as any empty file (zero file size).

If you put it all together, you could do something like:

{
for i in R1/*.gz; do
  [ -d "$i" ] && continue                 ## skip any directories
  [ -s "$i" ] || continue                 ## skip empty files
  nlines=$(gzcat "$i" | wc -l)            ## get number of lines
  printf "%s\t%s\n" "$i" $((nlines / 4))  ## output name, nlines / 4
done
} > newfile         ## redirect all output to newfile

(output is written with a tab character "\t" separating the name and number / 4 -- adjust as desired)

Look things over and let me know if you have any questions.

CodePudding user response:

The following works if you accept that 5 / 4 = 1 (the result is rounded down to the nearest integer). If you want to work with decimals (5 / 4 = 1.25), you'll need bc or awk.

for i in $(ls ./R1); do 
  nb_lines=$(gzcat ./$i | wc -l)
  echo $((nb_lines / 4))
done;
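For the decimal case, one possible sketch lets awk do both the counting and the division. The function name lines_div4 is hypothetical, and gzip -dc stands in for gzcat on the assumption that the two behave the same on your platform.

```shell
# lines_div4 FILE: hypothetical helper, prints (lines in gzipped FILE) / 4 with two decimals
lines_div4() {
  # in awk's END block, NR holds the total number of lines read from the pipe
  gzip -dc "$1" | awk 'END { printf "%.2f\n", NR / 4 }'
}
```

Inside a loop this could be used as printf '%s\t%s\n' "$i" "$(lines_div4 "$i")"; adjust the %.2f format if you want more or fewer decimal places.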

CodePudding user response:

The simplest way to do integer arithmetic is to use the $((...)) notation, as you can see from these simple examples:

Prompt> echo $((2*6))
12
Prompt> echo $((20/4))
5
Prompt> echo $((21/4))
5

It can also be used in combination with other commands, like wc -l:

Prompt> cat .viminfo | wc -l
287
Prompt> echo $(($(cat .viminfo | wc -l) / 4))
71
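The same combination can be written without cat by redirecting the file into wc, which also keeps wc from printing the file name after the count. A small sketch, where the function name quarter_lines is hypothetical:

```shell
# quarter_lines FILE: hypothetical helper, prints (number of lines in FILE) / 4, rounded down
quarter_lines() {
  # wc -l reads stdin here, so its output is just the count
  echo "$(( $(wc -l < "$1") / 4 ))"
}
```

For a 287-line file such as the .viminfo in the example, quarter_lines .viminfo would print 71.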