No output when looping over numerous files using an awk script


I have 50 text files that look like this. All the file names begin with ENSG00000...

"number"    "variant_id"    "gene_id"   "tss_distance"  "ma_samples"    "ma_count"  "maf"   "pval_nominal""slope"   "slope_se"  "hg38_chr"  "hg38_pos"  "ref_allele"    "alt_allele"    "hg19_chr"  "hg19_pos"  "ID"    "new_MAF"   "CHROM" "POS"   "REF"   "ALT"   "A1"    "OBS_CT"    "BETA"  "SE"    "P" "SD"    "Variance"
"14"    6253456 "chr1_17726150_G_A_b38" "ENSG00000186715.10"    955913  68  78  0.0644628   0.895156    0.0139683   0.105945    "chr1"  17726150    "G" "A" "chr1"  18052645    "rs260514:18052645:G:A0.058155  1   18052645    "G" "A" "G" 1597    0.0147047   0.0656528   0.822804    2.62364886486368    6.88353336610048

I want to get rid of the speech marks surrounding the values in the file, so that it looks like the example below. I am using the script below, which works when I try it with one file.

number  variant_id  gene_id  tss_distance   ma_samples  ma_count    maf pval_nominal slope  slope_se    hg38_chr    hg38_pos    ref_allele  alt_allele  hg19_chr    hg19_pos    ID  new_MAF CHROM   POS REF ALT A1  OBS_CT  BETA    SE P    SD Variance
14  6253456 chr1_17726150_G_A_b38   ENSG00000186715.10  955913  68  78  0.0644628   0.895156    0.0139683   0.105945    chr1    17726150    G   A   chr1    18052645    rs260514:18052645:G:A0.058155   1   18052645    G   A   G   1597    0.0147047   0.0656528   0.822804    2.62364886486368    6.88353336610048

I want to apply the awk script to all 50 files via a loop, but when I use the script below I get no output.

#PBS -N Edit
#PBS -l walltime=01:00:00
#PBS -l nodes=1:ppn=8
#PBS -l vmem=10gb
#PBS -m bea

for i in ENSG00000*; do
    awk '{ gsub(/"/, ""); print }' > $i.out

CodePudding user response:

Perhaps this is all you need

sed -i '' 's/"//g' ENSG00000*

if you want the files to be edited in place. (The '' after -i is the BSD/macOS sed form; with GNU sed, drop it: sed -i 's/"//g' ENSG00000*.)
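
For a form that does not depend on which sed is installed, a backup suffix attached directly to -i is accepted by both GNU and BSD sed. A minimal sketch, using a hypothetical demo file in a temporary directory rather than the real ENSG00000* inputs:

```shell
# Hypothetical demo file; the real inputs would be the ENSG00000* files.
workdir=$(mktemp -d)
printf '"number"\t"variant_id"\n"14"\t6253456\n' > "$workdir/ENSG00000_demo.txt"

# -i.bak (suffix attached, no space) edits in place and keeps the
# original as ENSG00000_demo.txt.bak.
sed -i.bak 's/"//g' "$workdir"/ENSG00000*

# Capture both versions for inspection, then clean up the demo directory.
sed_result=$(cat "$workdir/ENSG00000_demo.txt")
bak_result=$(cat "$workdir/ENSG00000_demo.txt.bak")
printf '%s\n' "$sed_result"
rm -rf "$workdir"
```

Note that on a second run the ENSG00000* glob would also match the .bak files, so the backups should be moved or deleted once the result has been checked.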

CodePudding user response:

Using awk just to delete certain characters is a bit of overkill. Inside the loop, why not simply use tr:

tr -d '"' <"$i" >"$i.out"


CodePudding user response:

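You forgot to specify the input file for awk to scan. Without a file argument, awk reads standard input instead of $i. The loop is also missing its closing done. Try:

for i in ENSG00000*; do
    # pass the current file to awk and quote the name in case it contains spaces
    awk '{ gsub(/"/, ""); print }' "$i" > "$i.out"
done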
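
One thing to watch with this loop is that the .out files it writes also match ENSG00000*, so a second run would process them again. A minimal sketch of the loop with a guard against that, run on two hypothetical demo files in a temporary directory (the file names and contents are made up for illustration):

```shell
# Hypothetical demo inputs standing in for the real ENSG00000* files.
workdir=$(mktemp -d)
printf '"number"\t"gene_id"\n"14"\t"ENSG00000186715.10"\n' > "$workdir/ENSG00000_demoA"
printf '"number"\t"gene_id"\n"15"\t"ENSG00000186715.10"\n' > "$workdir/ENSG00000_demoB"

for i in "$workdir"/ENSG00000*; do
    # Skip outputs left over from an earlier run.
    case $i in *.out) continue ;; esac
    # Strip every double quote and write the result next to the input.
    awk '{ gsub(/"/, ""); print }' "$i" > "$i.out"
done

# Capture the results for inspection, then clean up the demo directory.
result_a=$(cat "$workdir/ENSG00000_demoA.out")
result_b=$(cat "$workdir/ENSG00000_demoB.out")
printf '%s\n' "$result_a" "$result_b"
rm -rf "$workdir"
```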

CodePudding user response:

An awk solution without a loop (the -i inplace option requires GNU awk 4.1 or later):

awk -i inplace '{ gsub(/"/, ""); print }' ENSG00000*

But the sed solution above is the best:

sed -i '' 's/"//g' ENSG00000*
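
If the awk on the system is not GNU awk and so lacks the inplace extension, the same in-place effect can be approximated by writing to a temporary file and moving it over the original. A minimal sketch, assuming a hypothetical demo file and a .tmp suffix chosen only for illustration:

```shell
# Hypothetical demo input standing in for the real ENSG00000* files.
workdir=$(mktemp -d)
printf '"number"\t"maf"\n"14"\t0.0644628\n' > "$workdir/ENSG00000_demo"

for f in "$workdir"/ENSG00000*; do
    tmp="$f.tmp"   # illustrative temporary name next to the input
    # Replace the original only if awk finished successfully.
    awk '{ gsub(/"/, ""); print }' "$f" > "$tmp" && mv "$tmp" "$f"
done

# Capture the edited file for inspection, then clean up the demo directory.
inplace_result=$(cat "$workdir/ENSG00000_demo")
printf '%s\n' "$inplace_result"
rm -rf "$workdir"
```

Because mv only runs when awk exits successfully, a failed run leaves the original file untouched.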