I have 100 FASTA files in one directory:
A_n.fasta
A_t.fasta
B_n.fasta
B_t.fasta
C_n.fasta
C_t.fasta
...
The FASTA files contain different numbers of sequences, and I need to concatenate the two files of each pair:
input A_n.fasta, A_t.fasta; output A_nt.fasta, and so on.
At the moment I do this for each pair by hand:
cat A_n.fasta A_t.fasta > A_nt.fasta
but I don't want to do it manually for every ID pair, so I tried something like this:
for f in *_n.fasta ; do cat $f *_t.fasta > $f*_prot_all.fasta; done
but it does not work.
I am new to for loops, so the solution is probably easy, but I cannot work it out.
I will also need this for pal2nal, which takes two files (a protein multiple alignment and the DNA sequences). It is the same task: two files that share part of their name, and one output file for each pair.
CodePudding user response:
Would you please try:
#!/bin/bash
for f in *_n.fasta; do
    t=${f/_n./_t.}
    out=${f/_n./_prot_all.}
    cat "$f" "$t" > "$out"
done
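If some _n file might be missing its _t partner, a slightly more defensive variant could look like the sketch below. It is only an illustration: the function name merge_pairs is made up here, the output is named X_nt.fasta as in the question rather than X_prot_all.fasta, and the suffix is stripped with ${n%_n.fasta} instead of the substitution used above.

```shell
#!/bin/bash
# Sketch: concatenate every X_n.fasta with its partner X_t.fasta into X_nt.fasta.
# merge_pairs is a hypothetical helper name, not part of any tool.
merge_pairs() {
    local dir=${1:-.} n t base
    for n in "$dir"/*_n.fasta; do
        [ -e "$n" ] || continue        # the glob matched nothing
        base=${n%_n.fasta}             # strip the suffix, e.g. ./A
        t=${base}_t.fasta
        if [ ! -f "$t" ]; then
            echo "skipping $n: no $t" >&2
            continue
        fi
        cat "$n" "$t" > "${base}_nt.fasta"
    done
}

# Work on the directory given as first argument, or the current one.
merge_pairs "${1:-.}"
```

Because the output ends in _nt.fasta, it does not match *_n.fasta, so running the script a second time does not pick up its own results.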
where t=${f/_n./_t.} replaces the substring _n. with _t. in the value of the variable f and assigns the result to a new variable t. The variable out is built the same way, with _prot_all. in place of _n. As for the pal2nal command, you can use similar filename replacements.
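For the pal2nal case, the same pairing idea might be sketched as follows. The file names are assumptions, since the question does not give them: X_pep.aln for the protein alignment, X_nuc.fasta for the DNA sequences and X_codon.fasta for the result. The function name run_pal2nal_pairs and the PAL2NAL variable are also made up for this example; check the -output option against your pal2nal version.

```shell
#!/bin/bash
# Sketch: run pal2nal once per pair of X_pep.aln / X_nuc.fasta (assumed names).
# PAL2NAL can be overridden if the script lives somewhere else.
PAL2NAL=${PAL2NAL:-pal2nal.pl}

run_pal2nal_pairs() {
    local dir=${1:-.} aln nuc base
    for aln in "$dir"/*_pep.aln; do
        [ -e "$aln" ] || continue      # the glob matched nothing
        base=${aln%_pep.aln}           # shared part of the name, e.g. ./A
        nuc=${base}_nuc.fasta
        if [ ! -f "$nuc" ]; then
            echo "skipping $aln: no $nuc" >&2
            continue
        fi
        # Protein alignment first, then the DNA file, output to stdout.
        "$PAL2NAL" "$aln" "$nuc" -output fasta > "${base}_codon.fasta"
    done
}

# Work on the directory given as first argument, or the current one.
run_pal2nal_pairs "${1:-.}"
```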