I'm facing a syntax error while converting my bash command to a shell script

Time:12-10

#!/bin/bash

set -o errexit
set -o nounset

#VAF_and_IGV_TAG

paste <(grep -v "^#" output/"$1"/"$1"_Variant_Filtering/"$1"_GATK_filtered.vcf | cut -f-5) \
      <(grep -v "^#" output/"$1"/"$1"_Variant_Filtering/"$1"_GATK_filtered.vcf | cut -f10-| cut -d ":" -f2,3) |
sed 's/:/\t/g' |
sed '1i chr\tstart\tend\tref\talt\tNormal_DP_VCF\tTumor_DP_VCF\tDP'|
awk 'BEGIN{FS=OFS="\t"}{sub(/,/,"\t",$6);print}' \
  > output/"$1"/"$1"_Variant_Annotation/"$1"_VAF.tsv

My code above fails with a syntax error when I run it as a script. If I run the same pipeline directly in the terminal, with the variable replaced by its value, there is no syntax error.

sh Test.sh S1
Test.sh: 6: Test.sh: Syntax error: "(" unexpected

paste <(grep -v "^#" output/S1/S1_Variant_Filtering/S1_GATK_filtered.vcf | cut -f-5) \
      <(grep -v "^#" output/S1/S1_Variant_Filtering/S1_GATK_filtered.vcf | cut -f10-| cut -d ":" -f2,3) |
sed 's/:/\t/g' |
sed '1i chr\tstart\tend\tref\talt\tNormal_DP_VCF\tTumor_DP_VCF\tDP'|
awk 'BEGIN{FS=OFS="\t"}{sub(/,/,"\t",$6);print}' \
  > output/S1/S1_Variant_Annotation/S1_VAF.ts

My vcf file looks like this: https://drive.google.com/file/d/1HaGx1-3o1VLCrL8fV0swqZTviWpBTGds/view?usp=sharing

CodePudding user response:

You cannot use <(command) process substitution when the script runs under sh, and sh Test.sh runs it under sh regardless of the #!/bin/bash line. In plain sh there is no elegant way to avoid a temporary file (or something even more horrid), but your paste command, and indeed the entire pipeline, seems reasonably easy to refactor into an Awk script instead.

#!/bin/sh

set -eu

awk -F '\t' 'BEGIN { OFS=FS;
        print "chr\tstart\tend\tref\talt\tNormal_DP_VCF\tTumor_DP_VCF\tDP" }
    !/^#/ { # column 10 is the first sample, e.g. GT:AD:DP:...
           split($10, f, ":");
           p = f[2] "\t" f[3];        # keep the 2nd and 3rd subfields
           sub(/,/, "\t", p);         # split the first comma-separated pair
           s = sprintf("%s\t%s\t%s\t%s\t%s\t%s", $1, $2, $3, $4, $5, p);
           gsub(/:/, "\t", s);
           print s
    }' output/"$1"/"$1"_Variant_Filtering/"$1"_GATK_filtered.vcf \
  > output/"$1"/"$1"_Variant_Annotation/"$1"_VAF.tsv

Without access to the VCF file, I have been unable to test this, but at the very least it should suggest a general direction for how to proceed.
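
If the script does not have to be portable, another route is to keep the bash version and make sure bash is what runs it: bash Test.sh S1, or chmod +x Test.sh followed by ./Test.sh S1, instead of sh Test.sh S1. A minimal sketch of a guard that could sit near the top of the script, so that a wrong interpreter fails with a readable message; is_bash and require_bash are names made up here for illustration, and the BASH_VERSION test is only a rough check (bash started under the name sh sets it too):

```shell
#!/bin/sh
# is_bash (hypothetical helper): succeeds only when the interpreter is bash,
# which sets BASH_VERSION; other shells such as dash leave it unset.
is_bash() {
    [ -n "${BASH_VERSION:-}" ]
}

# require_bash (hypothetical helper): print a hint and fail when not under bash.
require_bash() {
    if ! is_bash; then
        echo "This script needs bash; run it as: bash $0 SAMPLE" >&2
        return 1
    fi
}
```

Calling require_bash || exit 1 before the paste line would then stop the script early under sh.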

CodePudding user response:

sh does not support bash process substitution, <(). The easiest way to port the script is to write out two temporary files and remove them via a trap when done. The better option is to use a tool that is sufficiently powerful (such as sed) to do the filtering and manipulation required:

#!/bin/sh
header="chr\tstart\tend\tref\talt\tNormal_DP_VCF\tTumor_DP_VCF\tDP"
field_1_to_5='\(\([^\t]*\t\)\{5\}\)' # \1 to \2
field_6_to_8='\([^\t]*\t\)\{4\}[^:]*:\([^,]*\),\([^:]*\):\([^:]*\).*' # \3 to \6
src="output/${1}/${1}_Variant_Filtering/${1}_GATK_filtered.vcf"
dst="output/${1}/${1}_Variant_Annotation/${1}_VAF.tsv"
sed -n \
  -e '1i '"$header" \
  -e '/^#/!s/'"${field_1_to_5}${field_6_to_8}"'/\1\4\t\5\t\6/p' \
  "$src" > "$dst"

If you are already using awk (or perl, python, etc.), just port the whole script to that language instead.
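
The temporary-file route mentioned in this answer can be sketched as follows. It is a sketch under assumptions rather than a tested port: make_vaf and tmpdir are names invented here, the header is written with printf and the colons are replaced with tr instead of relying on GNU sed's 1i and \t, and mktemp -d is assumed to be available (it is common, but POSIX does not specify it).

```shell
#!/bin/sh
set -eu

tmpdir=$(mktemp -d)                 # private scratch directory
trap 'rm -rf "$tmpdir"' EXIT        # removed on exit, including after a failure

# make_vaf SRC DST (hypothetical helper): the same steps as the original
# pipeline, with the two <(...) inputs written to files first.
make_vaf() {
    # Columns 1-5 of every non-header line
    grep -v '^#' "$1" | cut -f-5 > "$tmpdir/left"
    # 2nd and 3rd colon-separated subfields, counted from column 10 onwards
    grep -v '^#' "$1" | cut -f10- | cut -d ':' -f2,3 > "$tmpdir/right"
    {
        printf 'chr\tstart\tend\tref\talt\tNormal_DP_VCF\tTumor_DP_VCF\tDP\n'
        paste "$tmpdir/left" "$tmpdir/right" |
            tr ':' '\t' |
            awk 'BEGIN { FS = OFS = "\t" } { sub(/,/, "\t", $6); print }'
    } > "$2"
}
```

In the script itself, the call would pass the two paths built from $1, for example make_vaf "$src" "$dst" with src and dst set as in the sed version above.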

As an aside, all those repeated $1 suggest you should rework your file naming standard.
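
Short of renaming the files, the repetition can at least be confined to one place by building both paths from the sample name once. A small sketch; sample_paths, src, and dst are illustrative names, and the directory layout is copied from the question:

```shell
#!/bin/sh
# sample_paths SAMPLE (hypothetical helper): set src and dst for one sample,
# so the sample name is spelled out in a single place.
sample_paths() {
    sample=$1
    base="output/$sample"
    src="$base/${sample}_Variant_Filtering/${sample}_GATK_filtered.vcf"
    dst="$base/${sample}_Variant_Annotation/${sample}_VAF.tsv"
}
```

After sample_paths "$1", the rest of the script can refer to "$src" and "$dst" only.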
