Move lines in file using awk/sed

Time:01-25

Hi, my files look like this:

>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA
>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA

and I want to move the lines so that line 1 swaps with 3, and line 2 swaps with 4.

>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA
>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA

I have thought about using cut to send the lines into other files and then bringing them all back in the desired order with paste, but is there a solution using awk/sed?

EDIT: The file always has 4 lines (2 FASTA entries), no more.

CodePudding user response:

For such a simple case, as @Ed_Morton mentioned, you can just swap the even-sized slices with head and tail commands:

$ tail -n 2 test.txt; head -n 2 test.txt

>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA
>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA
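
If the result should replace the original file, the same two commands can be grouped and written to a temporary file first, since redirecting straight onto the input would truncate it before it is read. A minimal sketch, assuming a four-line file as in the question; swap_halves_inplace is a hypothetical helper name chosen for illustration:

```shell
# swap_halves_inplace is a hypothetical helper, not part of any standard tool.
# It assumes the file has exactly four lines, as stated in the question.
swap_halves_inplace() {
    file=$1
    tmp=$(mktemp) || return 1
    status=1
    # Write the last two lines, then the first two, to the temporary file.
    if { tail -n 2 "$file" && head -n 2 "$file"; } > "$tmp"; then
        # Copy back over the original so its permissions are kept.
        cat "$tmp" > "$file" && status=0
    fi
    rm -f "$tmp"
    return "$status"
}
```

It would be called as swap_halves_inplace test.txt.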

CodePudding user response:

Generic solution with GNU tac to reverse contents:

$ tac -bs'>' ip.txt
>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA
>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA

By default tac reverses line-wise, but you can customize the separator.

Here, I'm assuming > can be safely used as a unique separator (provided to the -s option). The -b option is used to put the separator before the content in the output.
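
To illustrate why this is generic, here is a small sketch with three records instead of two. It assumes GNU tac is available; the sequences are made-up placeholders and reverse_records is a hypothetical wrapper name:

```shell
# reverse_records is a hypothetical wrapper used only for this illustration.
reverse_records() {
    # -s '>' : use '>' as the record separator instead of newline
    # -b     : attach the separator to the start of the record that follows it
    tac -b -s '>' "$1"
}

# Placeholder data: three short records, not real sequences.
sample=$(mktemp)
printf '%s\n' '>ID.1' 'AAAA' '>ID.2' 'CCCC' '>ID.3' 'GGGG' > "$sample"
reverse_records "$sample"
rm -f "$sample"
```

Each header stays with its sequence because records are split at > rather than at newlines, so the approach should also cope with sequences wrapped over several lines, as long as > appears only at the start of headers.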


To move the last two lines to the top and edit the file in place:

printf -- '-1,$m0\nwq\n' | ed -s ip.txt

CodePudding user response:

Using sed:

sed '1h;2H;1,2d;4G'
  • Store the first line in the hold space;
  • Add the second line to the hold space;
  • Don't print the first two lines;
  • Before printing the fourth line, append the hold space to it (i.e. append the 1st and 2nd line).
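
The steps above can be tried on a file as follows. This is only a sketch: swap_with_sed is a hypothetical wrapper name, and the script is the same one split into separate -e expressions so each step can carry a comment.

```shell
# swap_with_sed is a hypothetical wrapper; it assumes a four-line input file.
swap_with_sed() {
    # 1h   : line 1, copy it into the hold space
    # 2H   : line 2, append it to the hold space
    # 1,2d : lines 1-2, delete them so nothing is printed yet
    # 4G   : line 4, append the hold space (lines 1 and 2) after it
    sed -e '1h' -e '2H' -e '1,2d' -e '4G' "$1"
}
```

Line 3 needs no command of its own: it is printed unchanged, followed by line 4 with the saved lines attached.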

CodePudding user response:

The GNU AWK manual has an example of swapping two lines using getline. Since you know that

The file always has 4 lines (2 fasta entries), no more.

you only need to handle input whose number of lines is a multiple of 4, and can use getline in the following way. Let the content of file.txt be

>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA
>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA

then

awk '{line1=$0;getline line2;getline line3;getline line4;printf "%s\n%s\n%s\n%s\n",line3,line4,line1,line2}' file.txt

gives output

>ID.2
GGAATACCACATCCCGCAGGGTTAAAAAAGAAAAAATCAGTAACAGTACTGGA
>ID.1
GGAACACGACATCCTGCAGGGTTAAAAAAGAAAAAATCAGTAAAAGTACTGGA

Explanation: store the current line ($0) in the variable line1, then read the next line into line2, the one after that into line3, and the one after that into line4. Then use printf with 4 placeholders (%s), each followed by a newline (\n), filled in the order your requirement asks for.

(tested in GNU Awk 5.0.1)
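
As an alternative that avoids getline, the lines can be buffered in an array and printed in the new order once the input ends. This is a sketch of a different technique, not the approach from the manual; swap_with_array and the array name line are arbitrary, and it assumes exactly four lines:

```shell
# swap_with_array is a hypothetical wrapper; it assumes a four-line input file.
swap_with_array() {
    awk '
        { line[NR] = $0 }      # remember every line under its line number
        END {
            # print the second entry first, then the first entry
            print line[3]
            print line[4]
            print line[1]
            print line[2]
        }
    ' "$1"
}
```

Because nothing is printed until END, the whole file is held in memory, which is harmless for a four-line file.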

CodePudding user response:

GNU sed:

sed -zE 's/^([^\n]*\n[^\n]*\n)(.*)/\2\1/' file

Perl:

perl -0777 -pe 's/(.*\R.*\R)(.*\R.*\R?)/\2\1/' file

Ruby:

ruby -ne 'BEGIN{lines=[]}
lines<<$_
END{puts lines[2...4] + lines[0...2] }' file

Paste and awk:

paste -s file | awk -F'\t' '{print $3, $4, $1, $2}' OFS='\n'

A POSIX pipe:

paste -sd'\t\n' file | nl | sort -nr | cut -f 2- | tr '\t' '\n'

CodePudding user response:

This seems to work:

awk -F'\n' '{print $3, $4, $1, $2}' OFS='\n' RS= ORS='\n\n' file.txt