Combining loop with awk

I need help combining an awk with a loop.

I have two files, one Bedfile.bed and a Samplelist.txt that look like this:

Bedfile.bed

HiC_scaffold_2  1       50001
HiC_scaffold_2  400001  450001
HiC_scaffold_2  800001  850001

Samplelist.txt

sampleA
sampleB
sampleC

I would like to create a new BED file for each sample (from Samplelist.txt) in which the sample name is appended as a new column on every line and is also included in the output file name. For example, for the first two samples:

Bedfile_SampleA.bed

HiC_scaffold_2  1       50001 SampleA
HiC_scaffold_2  400001  450001 SampleA
HiC_scaffold_2  800001  850001 SampleA

Bedfile_SampleB.bed

HiC_scaffold_2  1       50001 SampleB
HiC_scaffold_2  400001  450001 SampleB
HiC_scaffold_2  800001  850001 SampleB

I have done this for one file but I have more than a hundred files, so I would like to do some sort of loop using a sample list.

awk ' {print $1"\t"$2"\t"$3"\t""SampleA"}' Bedfile.bed >  Bedfile_SampleA.bed

Any suggestions?

CodePudding user response：

You can do the operation and the loop all in AWK, but if you wanted to do the loop 'separately' for another reason, you could use:

while read -r sample
do
    awk -v var="$sample" 'BEGIN{OFS="\t"} {print $0, var}' Bedfile.bed > Bedfile_"$sample".bed
done < Samplelist.txt
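
To try the loop without touching real data, here is a minimal sketch that recreates the question's inputs in a scratch directory first. The directory name awk_loop_demo is only an assumption for illustration, and the check that skips empty lines is an addition that is not part of the answer above.

```shell
#!/bin/sh
# Scratch directory (hypothetical name) so no real files are overwritten.
mkdir -p awk_loop_demo

# Recreate the inputs from the question as tab-separated BED lines.
printf 'HiC_scaffold_2\t1\t50001\nHiC_scaffold_2\t400001\t450001\nHiC_scaffold_2\t800001\t850001\n' > awk_loop_demo/Bedfile.bed
printf 'sampleA\nsampleB\nsampleC\n' > awk_loop_demo/Samplelist.txt

# One awk call per sample; IFS= and -r keep the name exactly as written.
while IFS= read -r sample
do
    # Skip empty lines so that no "Bedfile_.bed" is created.
    [ -n "$sample" ] || continue
    awk -v var="$sample" 'BEGIN{OFS="\t"} {print $0, var}' \
        awk_loop_demo/Bedfile.bed > "awk_loop_demo/Bedfile_${sample}.bed"
done < awk_loop_demo/Samplelist.txt
```

Each output file should then hold the three original lines with the sample name as a fourth tab-separated column. The name is written exactly as it appears in the sample list, so sampleA in the list gives Bedfile_sampleA.bed.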

CodePudding user response：

This is very straightforward in awk. First you read the sample file into memory, and then you process the full BED file, writing each line once per sample:

awk 'BEGIN{OFS="\t"}(FNR==NR){a[$0]; next}{for(i in a){f=FILENAME"."i; print $0,i > f}}' sample.txt bed.txt
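
With more than a hundred samples, some awk implementations may run out of simultaneously open output files. The following is a hedged sketch of the same single-pass idea that names the outputs Bedfile_<sample>.bed as in the question and closes each file right after writing to it. The directory awk_only_demo and the variable names outdir, samples and out are assumptions made for illustration.

```shell
#!/bin/sh
# Scratch directory (hypothetical name) with the inputs from the question.
mkdir -p awk_only_demo
printf 'HiC_scaffold_2\t1\t50001\nHiC_scaffold_2\t400001\t450001\nHiC_scaffold_2\t800001\t850001\n' > awk_only_demo/Bedfile.bed
printf 'sampleA\nsampleB\nsampleC\n' > awk_only_demo/Samplelist.txt

# The awk program appends, so remove outputs left over from an earlier run.
rm -f awk_only_demo/Bedfile_*.bed

awk -v outdir="awk_only_demo" 'BEGIN{OFS="\t"}
    # First file: remember every non-empty sample name.
    FNR==NR { if ($0 != "") samples[$0]; next }
    # Second file: write each BED line once per sample.
    {
        for (s in samples) {
            out = outdir "/Bedfile_" s ".bed"
            print $0, s >> out   # append, since the file is reopened for every line
            close(out)           # keep at most one output file open at a time
        }
    }' awk_only_demo/Samplelist.txt awk_only_demo/Bedfile.bed
```

Opening and closing a file for every line costs some speed, so this trade-off mainly makes sense when the number of samples is large. For a handful of samples, plain redirection without close() is simpler.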