Add unique output file names in a loop in Linux/bash

Time:11-18

I have 96 BAM files. How do I write the output for each one to a txt file named after its unique sample ID? I am looping through the BAM files, but every iteration currently writes to the same file, so I need to assign a unique output file per sample. For example: SC845414.txt

#Typical Bam Files:
SC845414-CTGATCGT-GCGCATAT_Aligned.sortedByCoord.out.bam
SC845425-TGTGACTG-AGCCTATC_Aligned.sortedByCoord.out.bam

#!/bin/bash
#SBATCH --mem=110g
#SBATCH --cpus-per-task=12
#SBATCH --time=10-00:00:00

module load python

DIR=/PATH/*

for d in $DIR; do
    python -m HTSeq.scripts.count -s yes -f bam "$d" /PATH1/gencode.v35.annotation.gtf > /PATH3/HTseq/SC845414.txt
done

CodePudding user response:

It depends highly on what exactly you mean by "sample ID".

Based on your example, if you mean "the part of the filename before the first dash", then you could do this:

for d in $DIR; do
    id=$(basename "$d" | cut -f 1 -d -)
    python -m HTSeq.scripts.count -s yes -f bam "$d" /PATH1/gencode.v35.annotation.gtf > "/PATH3/HTseq/$id.txt"
done
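
A quick way to check this extraction before submitting the job is to run it on the two file names from the question. This is only a sketch: the helper name sample_id is hypothetical, and /PATH is the question's placeholder. basename works on the path string alone, so the files do not need to exist.

```shell
#!/bin/bash
# Hypothetical helper: print the part of a file name before the first dash.
sample_id() {
    basename "$1" | cut -f 1 -d -
}

# File names copied from the question; /PATH is the question's placeholder.
for f in /PATH/SC845414-CTGATCGT-GCGCATAT_Aligned.sortedByCoord.out.bam \
         /PATH/SC845425-TGTGACTG-AGCCTATC_Aligned.sortedByCoord.out.bam; do
    echo "$(sample_id "$f").txt"
done
# prints:
# SC845414.txt
# SC845425.txt
```

If the printed names are not what you expect, adjust the delimiter or field number passed to cut before running the real loop.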

CodePudding user response:

Same idea, but using bash's built-in parameter expansion instead of calling basename and cut:

for d in $DIR; do
    fname=${d##*/}
    python -m HTSeq.scripts.count -s yes -f bam "$d" /PATH1/gencode.v35.annotation.gtf > "/PATH3/HTseq/${fname%%-*}.txt"
done

(Edited to strip any leading path as well.)

Unfortunately, stripping both the leading and the trailing part of a variable in a single expansion is beyond me at the moment, hence the intermediate fname variable.

It seems it should be doable; see: https://www.thegeekstuff.com/2010/07/bash-string-manipulation/

(No affiliation or endorsement; it was just the first relevant web search result.)
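
The two expansions used in the loop can be tried in isolation on one of the question's file names. The sketch below does that and, as an alternative that is not part of the answer above, extracts the ID with a single regex match via [[ =~ ]], since bash does not allow one parameter expansion to be nested inside another. The variable names id_re and re and the pattern itself are illustrative assumptions.

```shell
#!/bin/bash
# Example path: the question's placeholder directory plus its first file name.
d=/PATH/SC845414-CTGATCGT-GCGCATAT_Aligned.sortedByCoord.out.bam

# Step 1: remove the longest prefix matching "*/" (the directory part).
fname=${d##*/}
# Step 2: remove the longest suffix matching "-*" (everything from the first dash on).
id=${fname%%-*}
echo "$fname"   # SC845414-CTGATCGT-GCGCATAT_Aligned.sortedByCoord.out.bam
echo "$id"      # SC845414

# Alternative (assumption, not from the answer): one regex match.
# The capture group holds the leading run of characters in the file name
# that contains neither a slash nor a dash.
re='([^/-]+)-[^/]*$'
if [[ $d =~ $re ]]; then
    id_re=${BASH_REMATCH[1]}
    echo "$id_re"   # SC845414
fi
```

The two-step version is arguably easier to read; the regex version mainly helps if the ID has to be validated at the same time, because the if branch is skipped for file names that contain no dash.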
