I have Table 1 which has thousands of rows that looks like this:
chr1 4399801 4400245 peak_12659 719 . 32.37675 -1 1.92924 222
1 444
chr1 2495548 2495992 peak_11970 542 . 36.95443 -1 2.58372 222
1 444
chr1 3572002 3572264 peak_901 1000 . 148.62292 -1 3.94096 145
1 262
I want to remove the empty cells that appear in every other row (under chr1 in each row above), then combine these to each preceding row so the final table appears like this:
Table 2:
chr1 4399801 4400245 peak_12659 719 . 32.37675 -1 1.92924 222 1 444
chr1 2495548 2495992 peak_11970 542 . 36.95443 -1 2.58372 222 1 444
chr1 3572002 3572264 peak_901 1000 . 148.62292 -1 3.94096 145 1 262
How can I accomplish this?
Edit- In response to @Cyrus: I found it very difficult to find answers to this question. But I stumbled upon this thread (not exactly what I'm trying to accomplish) and tried the following:
awk '{printf "%s%s",$0,(NR%2?FS:RS)}' Table1.txt > Table2.txt
This command did not merge the alternating rows correctly and in some instances, combined cells instead: Screenshot here
I also tried:
xargs -n2 < Table1.txt > Table2.txt
Each row in the output contains two merged cells: Screenshot 2
CodePudding user response:
With GNU sed. Replace every \n\t
with \t
:
sed -z 's/\n\t/\t/g' file
Output:
chr1 4399801 4400245 peak_12659 719 . 32.37675 -1 1.92924 222 1 444 chr1 2495548 2495992 peak_11970 542 . 36.95443 -1 2.58372 222 1 444 chr1 3572002 3572264 peak_901 1000 . 148.62292 -1 3.94096 145 1 262
CodePudding user response:
It's not clear (to me) what the input and output field delimiters are supposed to be so I'm just going to strip out all white space and use an output field delimiter of tab (\t
):
awk '
BEGIN { OFS="\t" }
{ $1=$1 } # remove excessive white spaces
FNR%2==1 { a=$0 } # save odd numbered line
FNR%2==0 { print a,$0 } # print odd numbered line plus current (even numbered) line
' file
This generates:
chr1 4399801 4400245 peak_12659 719 . 32.37675 -1 1.92924 222 1 444
chr1 2495548 2495992 peak_11970 542 . 36.95443 -1 2.58372 222 1 444
chr1 3572002 3572264 peak_901 1000 . 148.62292 -1 3.94096 145 1 262