Passing Arguments to GNU parallel

I'm trying to use awk and GNU parallel to filter a set of .csv.gz files on the values in columns 1 and 2 and write the result to a single .csv.gz file. Thanks to the answer here, I managed to write myscript.sh, which does the job in parallel.

#!/bin/bash

doit() {
    # Quote "$1" so that file names containing spaces survive
    pigz -dc "$1" | awk -F, '$1>0.5 && $2<1.5'
}
export -f doit


find "$1" -name '*.csv.gz' | parallel doit | pigz > output.csv.gz

and then run the script in the terminal.

./myscript.sh /path/to/files

How can I pass 0.5 and 1.5 as arguments to myscript.sh, like this?

./myscript.sh /path/to/files 0.5 1.5

CodePudding user response：

This may be an easier, or more explicit, way of passing variables and parameters around:

#!/bin/bash

dir="$1"
# Pick up second and third parameters, defaulting to 0.5 and 1.5 if unspecified
a=${2:-0.5}
b=${3:-1.5}

doit() {
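    local file=$1 a=$2 b=$3
    # Debug output; remove it before piping the results into pigz
    echo "File: $file, a=$a, b=$b"
    # -v hands the shell values to awk as the variables a and b.
    # cat reads uncompressed test files; use pigz -dc for .csv.gz input.
    cat "$file" | awk -F, -v a="$a" -v b="$b" '$1>a && $2<b'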
}
export -f doit

find "$dir" -name '*.tst' | parallel doit {} "$a" "$b"
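The part of the script above that does the real work is awk's -v option, which assigns a shell value to an awk variable before the program runs. Here is a minimal sketch of just that step, runnable without parallel or pigz. The names filter_rows, lo and hi are made up for illustration and are not in the original script, and the inline rows stand in for a decompressed file.

```shell
#!/bin/bash

# Hypothetical helper: read CSV rows on stdin and keep those where
# column 1 > lo and column 2 < hi.
filter_rows() {
    local lo=$1 hi=$2
    # -v copies the shell values into awk variables, so the awk program
    # itself stays a fixed, single-quoted string.
    awk -F, -v lo="$lo" -v hi="$hi" '$1 > lo && $2 < hi'
}

# Inline sample data standing in for one decompressed .csv.gz file.
printf '%s\n' '0.2,1.0' '0.7,1.2' '0.9,1.8' '0.6,0.4' | filter_rows 0.5 1.5
```

With thresholds 0.5 and 1.5 this prints the rows 0.7,1.2 and 0.6,0.4. Because the values look numeric, awk compares them as numbers rather than as strings.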

CodePudding user response：

#!/bin/bash

doit() {
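    # The thresholds are spliced into the awk program text; the double
    # quotes keep the shell from word-splitting them on the way in.
    pigz -dc "$1" | awk -F, '$1>'"$2"' && $2<'"$3"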
}
export -f doit


find "$1" -name '*.csv.gz' | parallel doit {} "$2" "$3" | pigz > output.csv.gz

Test it with (paste needs -d, so that the columns are comma-separated, matching awk -F,):

paste -d, <(seq 10 | shuf) <(seq 10 | shuf) | gzip > h.csv.gz
./myscript.sh . 5 6
zcat output.csv.gz
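
Since this version pastes the second and third arguments straight into the awk program, whatever is typed on the command line ends up being run as awk code. Below is a minimal sketch of a guard that could run before the find line, assuming the thresholds are plain decimal numbers. is_number is a hypothetical helper, not part of the answer, and it deliberately rejects forms such as .5 or 1e3.

```shell
#!/bin/bash

# Hypothetical helper: succeed only if $1 is a plain decimal number
# such as 5, 0.5 or -1.5.
is_number() {
    case $1 in
        ''|-|*[!0-9.-]*) return 1 ;;  # empty, lone minus, or other characters
        ?*-*)            return 1 ;;  # minus sign anywhere but the start
        *.*.*)           return 1 ;;  # more than one decimal point
        .*|*.|-.*)       return 1 ;;  # digits required on both sides of the point
    esac
    return 0
}

# Show how a few candidate thresholds would be treated.
for arg in 0.5 6 -1.5 x '1 || 1'; do
    if is_number "$arg"; then
        echo "ok: $arg"
    else
        echo "rejected: $arg"
    fi
done
```

In myscript.sh the check could be applied to "$2" and "$3", exiting with an error message when either is rejected.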