Move lines in file using awk/sed

Hi, my files look like this:

>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA
>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA

I want to move the lines so that line 1 swaps with line 3 and line 2 swaps with line 4:

>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA
>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA

I thought about using cut to send the lines into separate files and then bringing them back together in the desired order with paste, but is there a solution using awk or sed?

EDIT: The file always has 4 lines (2 FASTA entries), no more.

CodePudding user response：

For such a simple case, as @Ed_Morton mentioned, you can just swap the two equal-sized halves with the head and tail commands:

$ tail -2 test.txt; head -2 test.txt

>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA
>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA
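If the swapped result should replace the original file, the same head/tail idea can be combined with a temporary file. This is only a sketch: the file created with mktemp is a stand-in for test.txt, the short sequences are made-up sample data, and mktemp is assumed to be available.

```shell
# Stand-in for test.txt, filled with made-up sample data.
f=$(mktemp)
printf '%s\n' '>ID.1' 'AAAA' '>ID.2' 'CCCC' > "$f"

# Write the last two lines, then the first two, to a temporary file,
# and replace the original only if that succeeded.
out=$(mktemp)
{ tail -n 2 "$f"; head -n 2 "$f"; } > "$out" && mv "$out" "$f"

result=$(cat "$f")
printf '%s\n' "$result"
rm -f "$f"
```

Writing straight back with `> "$f"` would truncate the file before tail and head get to read it, hence the temporary file.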

CodePudding user response：

Generic solution with GNU tac to reverse contents:

$ tac -bs'>' ip.txt
>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA
>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA

By default tac reverses line-wise, but you can customize the separator.

Here, I'm assuming > can be safely used as a unique separator (provided to the -s option). The -b option is used to put the separator before the content in the output.
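To illustrate why this is generic, here is a small sketch with three records instead of two. The sample data and the temporary file are made up for the demo, and GNU tac is assumed.

```shell
# Hypothetical three-record sample written to a temporary file.
tmp=$(mktemp)
printf '%s\n' '>ID.1' 'AAAA' '>ID.2' 'CCCC' '>ID.3' 'GGGG' > "$tmp"

# -s '>' sets the record separator; -b attaches it to the start of each record,
# so every header stays in front of its own sequence.
reversed=$(tac -b -s '>' "$tmp")
printf '%s\n' "$reversed"
rm -f "$tmp"
```

The records should come out in the order ID.3, ID.2, ID.1, each header still followed by its sequence.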

To move the last two lines to the top and edit the file in place:

printf -- '-1,$m0\nwq\n' | ed -s ip.txt

CodePudding user response：

Using sed:

sed '1h;2H;1,2d;4G'

Store the first line in the hold space;
Add the second line to the hold space;
Don't print the first two lines;
Before printing the fourth line, append the hold space to it (i.e. append the 1st and 2nd line).
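The four steps above can be written as a multi-line sed script with one comment per command. This is a sketch on made-up sample data in a temporary file; the one-liner itself is unchanged.

```shell
# Hypothetical 4-line sample written to a temporary file.
tmp=$(mktemp)
printf '%s\n' '>ID.1' 'AAAA' '>ID.2' 'CCCC' > "$tmp"

swapped=$(sed '
# line 1: copy it into the hold space
1h
# line 2: append it to the hold space (hold = line 1, newline, line 2)
2H
# lines 1-2: delete them from the pattern space, so nothing is printed yet
1,2d
# line 4: append the hold space to the pattern space before it is printed
4G
' "$tmp")
printf '%s\n' "$swapped"
rm -f "$tmp"
```

Line 3 matches none of the addresses, so it is printed as-is, which is why the second header ends up first.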

CodePudding user response：

The GNU AWK manual has an example of swapping two lines using getline. Since you know that

The file always has 4 lines (2 fasta entries), no more.

you only need to handle the case where the number of lines is evenly divisible by 4, and you can use getline as follows. Let the content of file.txt be

>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA
>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA

then

awk '{line1=$0;getline line2;getline line3;getline line4;printf "%s\n%s\n%s\n%s\n",line3,line4,line1,line2}' file.txt

gives output

>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA
>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA

Explanation: store the current line ($0) in the variable line1, then read the next three lines into line2, line3 and line4 with getline. Finally, use printf with 4 placeholders (%s), each followed by a newline (\n), filled in the order you require.

(tested in GNU Awk 5.0.1)
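As a sketch of the "evenly divisible by 4" case, the same program can be run on eight lines. The sample data and temporary file are made up for the demo. Each pass of the main rule consumes four lines, so every block of four is swapped independently.

```shell
# Hypothetical 8-line sample (two blocks of four) in a temporary file.
tmp=$(mktemp)
printf '%s\n' '>ID.1' 'AAAA' '>ID.2' 'CCCC' '>ID.3' 'GGGG' '>ID.4' 'TTTT' > "$tmp"

# One line arrives via $0, the next three are pulled in with getline.
swapped=$(awk '{
    line1 = $0
    getline line2
    getline line3
    getline line4
    printf "%s\n%s\n%s\n%s\n", line3, line4, line1, line2
}' "$tmp")
printf '%s\n' "$swapped"
rm -f "$tmp"
```

Within each block the two entries trade places, so the headers should appear in the order ID.2, ID.1, ID.4, ID.3. If the line count is not a multiple of 4, getline fails on the missing lines and leaves the variables at their previous values, so this form is only safe under the stated assumption.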

CodePudding user response：

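GNU sed (with -z the whole file is a single record and . also matches newlines, so the first two lines are matched with [^\n]* instead):

sed -zE 's/([^\n]*\n[^\n]*\n)(.*)/\2\1/' file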

Perl:

perl -0777 -pe 's/(.*\R.*\R)(.*\R.*\R?)/\2\1/' file

Ruby:

ruby -ne 'BEGIN{lines=[]}
lines<<$_
END{puts lines[2...4] + lines[0...2] }' file

Paste and awk:

paste -s file | awk -F'\t' '{print $3, $4, $1, $2}' OFS='\n'

A POSIX pipe:

paste -sd'\t\n' file | nl | sort -nr | cut -f 2- | tr '\t' '\n'
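For readers unfamiliar with this pipeline, here is the same command split into stages with a comment on each. This is a sketch on made-up sample data in a temporary file, and it assumes the sequences themselves contain no tab characters.

```shell
# Hypothetical 4-line sample written to a temporary file.
tmp=$(mktemp)
printf '%s\n' '>ID.1' 'AAAA' '>ID.2' 'CCCC' > "$tmp"

swapped=$(
    paste -sd '\t\n' "$tmp" |   # join each header with its sequence: header<TAB>sequence
    nl |                        # prefix every joined record with its number
    sort -nr |                  # reverse the record order using that number
    cut -f 2- |                 # drop the number column again
    tr '\t' '\n'                # split header and sequence back onto two lines
)
printf '%s\n' "$swapped"
rm -f "$tmp"
```

Because the records are numbered and sorted in reverse, this reverses any number of two-line entries, not just two.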

CodePudding user response：

This seems to work:

awk -F'\n' '{print $3, $4, $1, $2}' OFS='\n' RS= ORS='\n\n' file.txt