How to add an empty line at the end of these commands?

I have a large number of fastq files that I want to convert to fasta. Since they all belong to the same sample, I would like to merge the resulting fasta files into a single file.

I tried running these two commands:

sed -n '1~4s/^@/>/p;2~4p' INFILE.fastq > OUTFILE.fasta

cat infile.fq | awk '{if(NR%4==1) {printf(">%s\n",substr($0,2));} else if(NR%4==2) print;}' > file.fa

Both produce a correct fasta file.

However, I run into a problem in the next step, when I merge the files with this command:

cat $1 >> final.fasta

The final file looks correct at first glance, but when I run makeblastdb on it I get the following error:

FASTA-Reader: Ignoring invalid residues at position(s): On line 512: 1040-1043, 1046-1048, 1050-1051, 1053, 1055-1058, 1060-1061, 1063, 1066-1069, 1071-1076

Looking at that line, I found that the header of one file had been appended directly to the end of the last sequence of the previous file, like this:

GGCTTAAACAGCATT>e45dcf63-78cf-4769-96b7-bf645c130323

So how can I add a blank line to the end of the file within the scripts that convert fastq to fasta?

That way, when I merge the files, each header would start on its own line instead of being attached to the end of the previous file's last sequence.

CodePudding user response:

So how can I add a blank line to the end of the file within the scripts that convert fastq to fasta?

I would use GNU sed for this. Replace

cat $1 >> final.fasta

with

sed '$a\\n' $1 >> final.fasta

Explanation: the sed expression means "at the last line ($), append (a) a newline (\n)". The text is queued while the last line is being processed and is written out after that line's default printing. If you prefer AWK (GNU AWK or any other implementation), you can get the same behavior as follows:

awk '{print}END{print ""}' $1 >> final.fasta
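To see what the AWK command does to a merged file, here is a small sketch. The scratch directory and the file names (a.fasta, b.fasta, merged.fasta) are made up for illustration; a.fasta deliberately lacks a trailing newline to mimic the situation described in the question.

```shell
# Scratch directory and file names are hypothetical, for illustration only.
demo_dir=$(mktemp -d)
printf '>read1\nACGT' > "$demo_dir/a.fasta"      # last line has no newline
printf '>read2\nGGCC\n' > "$demo_dir/b.fasta"    # last line ends with a newline

: > "$demo_dir/merged.fasta"                     # start from an empty output file
for f in "$demo_dir/a.fasta" "$demo_dir/b.fasta"; do
    # print terminates every line; END adds one extra empty line per file
    awk '{print}END{print ""}' "$f" >> "$demo_dir/merged.fasta"
done

cat "$demo_dir/merged.fasta"
```

In this sketch every header ends up on its own line. Note that print already terminates the last line of each input with a newline, so the END block only contributes the empty separator line. Whether empty lines are acceptable depends on the tool that reads the merged file.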

Note: I was unable to test either solution, as you did not provide enough information to do so. I assume the line above sits inside a loop and that $1 is always the name of a file that exists in the current working directory.
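Since the actual loop is not shown in the question, the following is only a sketch of how conversion and merging could be combined in one pass. The function names fastq_to_fasta and merge_fastq, the scratch directory, and the tiny input files are all hypothetical. It reuses the AWK logic from the question (header from line 1 of each record, sequence from line 2) and relies on print ending every output line with a newline.

```shell
# fastq_to_fasta and merge_fastq are hypothetical helper names.

# Convert one fastq file to fasta on standard output.
fastq_to_fasta() {
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }   # header: replace leading @ with >
         NR % 4 == 2 { print }                     # sequence line
        ' "$1"
}

# Usage: merge_fastq OUT.fasta IN1.fastq [IN2.fastq ...]
merge_fastq() {
    out=$1
    shift
    : > "$out"                         # truncate or create the output file
    for fq in "$@"; do
        fastq_to_fasta "$fq" >> "$out"
    done
}

# Tiny made-up inputs; the second one has no trailing newline.
work_dir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n' > "$work_dir/a.fastq"
printf '@r2\nGGCC\n+\nIIII' > "$work_dir/b.fastq"

merge_fastq "$work_dir/final.fasta" "$work_dir/a.fastq" "$work_dir/b.fastq"
cat "$work_dir/final.fasta"
```

Writing the converted records straight into the merged file avoids the intermediate per-file fasta files, so there is no separate concatenation step in which a missing final newline could glue a header onto the previous sequence.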