I have many files that I would like to concatenate. Here is an example which generates those file names with some example content:
for lane in $(seq 1 4)
do
    for sub in $(seq 1 8)
    do
        echo "sublib ${sub}, lane ${lane}, R1" > "S${sub}_L00${lane}_R1.fastq.gz"
        echo "sublib ${sub}, lane ${lane}, R2" > "S${sub}_L00${lane}_R2.fastq.gz"
    done
done
To concatenate these files using GNU parallel I've created the shell function below. Because these files are very large in practice, I would like to enable a check that prompts the user for a "Yes" or "No" to verify that the correct files are being combined before each job is executed.
Given that parallel executes each job simultaneously, is there a way to "pause" each job while the user input "Yes" or "No" is passed serially?
export FQ_DIR="/volume-general/test-fastqs/concat-demo/"
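function concat_fq () {
    # $1 = sublibrary prefix (e.g. S1), $2 = read number (1 or 2)
    local files yn
    files=$(find "${FQ_DIR}" -type f -name "${1}_L*R${2}*")
    # Pass the list as an argument so it is never interpreted as a printf format
    printf 'Files to concatenate:\n%s\n\n' "$files"
    read -p "Proceed? [y/n] " yn
    case $yn in
        # $files is left unquoted on purpose so that each path becomes its own
        # argument to cat (this breaks if a path contains whitespace)
        [Yy]* ) cat $files > "${FQ_DIR}${1}_cat_R${2}.fastq.gz";;
        [Nn]* ) echo "Aborting.";;
    esac
}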
export -f concat_fq
# example run
parallel -k --lb concat_fq {} ::: S1 S2 ::: 1 2
CodePudding user response:
GNU Parallel has very limited support for interactive programs.
You may use --interactive to make GNU Parallel prompt you whether each job should run or not.
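As a minimal sketch of how --interactive might be applied to the example from the question: the yes/no prompt is taken out of the function, because GNU Parallel asks the question itself before starting each job. The names concat_fq_noprompt and run_interactive are hypothetical, the default for FQ_DIR is an assumption, and sorting the file list is an addition that is not in the original function.

```shell
#!/usr/bin/env bash
# Sketch only: fall back to the current directory if FQ_DIR is unset (assumption).
# As in the question, FQ_DIR is expected to end with a slash.
export FQ_DIR="${FQ_DIR:-./}"

# Hypothetical prompt-free variant of concat_fq: no `read` here, the
# confirmation is left to GNU Parallel's --interactive option.
concat_fq_noprompt() {
    local sample=$1 read_no=$2 out
    out="${FQ_DIR}${sample}_cat_R${read_no}.fastq.gz"
    # NUL-delimited and sorted, so paths with spaces survive and the lanes
    # are joined in a stable order (the sort is an added assumption).
    find "${FQ_DIR}" -type f -name "${sample}_L*R${read_no}*" -print0 |
        sort -z | xargs -0 cat > "$out"
}

# Hypothetical wrapper around the actual invocation; run it from a terminal,
# because --interactive reads the answers from the terminal.
run_interactive() {
    export -f concat_fq_noprompt   # bash-only: makes the function visible to parallel
    parallel --interactive -k concat_fq_noprompt {} ::: S1 S2 ::: 1 2
}
```

Calling run_interactive from a terminal should make GNU Parallel show each command line and wait for an answer; according to its documentation, a job only runs if the response starts with y or Y. Note that the prompt shows the command, not the list of matching files, so the list would have to be checked separately, for example with a plain find beforehand.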