Following up on my previous post (how to concatenate the content of a file with an increment of the last number of the column), I need help with a slightly different issue.
Now I would like to repeat the file 5 times, incrementing (from 1 to 5) every numeric column (2nd, 3rd, ... nth; these always start and end with "1") except the first one (which may start from 1 but can end with any number).
input file:
TCTA 3 TCTG 1 TCTA 1
TCTA 4 TCTG 1 TCTA 1
TCTA 5 TCTG 1 TCTA 1
TCTA 6 TCTG 1 TCTA 1
TCTA 7 TCTG 1 TCTA 1
TCTA 8 TCTG 1 TCTA 1
TCTA 9 TCTG 1 TCTA 1
TCTA 10 TCTG 1 TCTA 1
TCTA 11 TCTG 1 TCTA 1
TCTA 12 TCTG 1 TCTA 1
TCTA 13 TCTG 1 TCTA 1
TCTA 14 TCTG 1 TCTA 1
TCTA 15 TCTG 1 TCTA 1
output required:
TCTA 3 TCTG 1 TCTA 1
TCTA 4 TCTG 1 TCTA 1
TCTA 5 TCTG 1 TCTA 1
TCTA 6 TCTG 1 TCTA 1
TCTA 7 TCTG 1 TCTA 1
TCTA 8 TCTG 1 TCTA 1
TCTA 9 TCTG 1 TCTA 1
TCTA 10 TCTG 1 TCTA 1
TCTA 11 TCTG 1 TCTA 1
TCTA 12 TCTG 1 TCTA 1
TCTA 13 TCTG 1 TCTA 1
TCTA 14 TCTG 1 TCTA 1
TCTA 15 TCTG 1 TCTA 1
TCTA 3 TCTG 2 TCTA 2
TCTA 4 TCTG 2 TCTA 2
TCTA 5 TCTG 2 TCTA 2
TCTA 6 TCTG 2 TCTA 2
TCTA 7 TCTG 2 TCTA 2
TCTA 8 TCTG 2 TCTA 2
TCTA 9 TCTG 2 TCTA 2
TCTA 10 TCTG 2 TCTA 2
TCTA 11 TCTG 2 TCTA 2
TCTA 12 TCTG 2 TCTA 2
TCTA 13 TCTG 2 TCTA 2
TCTA 14 TCTG 2 TCTA 2
TCTA 15 TCTG 2 TCTA 2
TCTA 3 TCTG 3 TCTA 3
TCTA 4 TCTG 3 TCTA 3
TCTA 5 TCTG 3 TCTA 3
TCTA 6 TCTG 3 TCTA 3
TCTA 7 TCTG 3 TCTA 3
TCTA 8 TCTG 3 TCTA 3
TCTA 9 TCTG 3 TCTA 3
TCTA 10 TCTG 3 TCTA 3
TCTA 11 TCTG 3 TCTA 3
TCTA 12 TCTG 3 TCTA 3
TCTA 13 TCTG 3 TCTA 3
TCTA 14 TCTG 3 TCTA 3
TCTA 15 TCTG 3 TCTA 3
TCTA 3 TCTG 4 TCTA 4
TCTA 4 TCTG 4 TCTA 4
TCTA 5 TCTG 4 TCTA 4
TCTA 6 TCTG 4 TCTA 4
TCTA 7 TCTG 4 TCTA 4
TCTA 8 TCTG 4 TCTA 4
TCTA 9 TCTG 4 TCTA 4
TCTA 10 TCTG 4 TCTA 4
TCTA 11 TCTG 4 TCTA 4
TCTA 12 TCTG 4 TCTA 4
TCTA 13 TCTG 4 TCTA 4
TCTA 14 TCTG 4 TCTA 4
TCTA 15 TCTG 4 TCTA 4
TCTA 3 TCTG 5 TCTA 5
TCTA 4 TCTG 5 TCTA 5
TCTA 5 TCTG 5 TCTA 5
TCTA 6 TCTG 5 TCTA 5
TCTA 7 TCTG 5 TCTA 5
TCTA 8 TCTG 5 TCTA 5
TCTA 9 TCTG 5 TCTA 5
TCTA 10 TCTG 5 TCTA 5
TCTA 11 TCTG 5 TCTA 5
TCTA 12 TCTG 5 TCTA 5
TCTA 13 TCTG 5 TCTA 5
TCTA 14 TCTG 5 TCTA 5
TCTA 15 TCTG 5 TCTA 5
I tried to adapt the code from the previous post, but without success so far:
awk -v n=3 '
{
    rec = rec $0 RS    # collect the whole file in one string
}
1                      # print the original lines unchanged
END {
    for (i=2; i<=n; i++)
        printf "%s", gensub(/[0-9]+(\n|$)/, i "\\1", "g", rec)
}' file
The problem is that this only changes the last column, whereas I need every numeric column except the first one to change.
Please help.
Thanks
CodePudding user response:
The output might not look that great, but the easiest approach would be something like this:
awk -v n=5 '{ for(i=1;i<=n;i++) a[i,NR]=sprintf("%-8s%-4d%-8s%-4d%-8s%-4d",$1,$2,$3,i,$5,i) }
END { for(i=1;i<=n;i++) for(j=1;j<=NR;j++) print a[i,j] }' file
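The one-liner above hard-codes six fields. A minimal sketch of the same positional idea for any number of columns follows, assuming the counters to increment always sit in the even-numbered fields from field 4 onward. The helper name inc_even_cols is hypothetical, and the output is joined with single spaces rather than padded.

```shell
# inc_even_cols N [FILE...]   (hypothetical helper name)
# Prints the input N times; in copy i, fields 4, 6, 8, ... are set to i.
inc_even_cols() {
    n=$1; shift
    awk -v n="$n" '
    { line[NR] = $0 }                          # remember every input line
    END {
        for (i = 1; i <= n; i++)               # one copy of the file per value of i
            for (r = 1; r <= NR; r++) {
                nf = split(line[r], fld, " ")
                out = fld[1]
                for (f = 2; f <= nf; f++)      # assumed layout: counters in even fields >= 4
                    out = out " " ((f >= 4 && f % 2 == 0) ? i : fld[f])
                print out
            }
    }' "$@"
}
```

It would be called as inc_even_cols 5 file. It uses only split() and string concatenation, so it should not need gawk.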
CodePudding user response:
Assumptions:
- first numeric column does not include the single digit 1
- for all other columns to be incremented the source/input column's value is 1 (ie, we're only going to increment columns that contain a single 1)
- net result: replace all occurrences of the single digit 1 with incremented values
- the number of columns is not known beforehand
- the number of columns could vary from row to row
Modifying OP's current awk code to replace all standalone 1's with an incremented value:
awk -v n=5 '
{
    rec = rec $0 RS    # collect the whole file in one string
}
1                      # print the original lines unchanged
END {
    for (i=2; i<=n; i++) {
        x=gensub(/([^[:digit:]])1([^[:digit:]])/, "\\1" i "\\2", "g", rec)
        printf "%s", gensub(/([^[:digit:]])1([^[:digit:]])/, "\\1" i "\\2", "g", x)
    }
}' file
Where:
- ([^[:digit:]])1([^[:digit:]]) - matches a non-digit character (capture group #1) + a single 1 + a non-digit character (capture group #2)
- "\\1" i "\\2" - replacement is capture group #1 + the current increment value (i=2..5 in this instance) + capture group #2
- we perform 2x gensub() calls to address the issue where there are consecutive numeric columns containing a single 1: each match consumes the non-digit character that follows the 1, so the next 1 has no leading non-digit left to match in the same pass (NOTE: there may be a way to do this with a single function call but I'm drawing a blank at the moment ... open to suggestions from the awkers in the community)
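One possible way to avoid the double gensub() call is to test fields instead of matching a regex against the whole record, since fields cannot overlap. The sketch below is an alternative, not the code above: it replaces any field after the second that is exactly "1", so it does not rely on the first numeric column being free of a lone 1. The helper name inc_ones is hypothetical, and whitespace is normalised to single spaces.

```shell
# inc_ones N [FILE...]   (hypothetical helper name)
# Prints the input N times; in copy i, every field after the second
# that is exactly "1" is replaced by i.
inc_ones() {
    n=$1; shift
    awk -v n="$n" '
    { line[NR] = $0 }                          # remember every input line
    END {
        for (i = 1; i <= n; i++)               # copy 1 reproduces the input values
            for (r = 1; r <= NR; r++) {
                nf = split(line[r], fld, " ")
                out = fld[1]
                for (f = 2; f <= nf; f++)
                    # string comparison: 10, 11, 21 ... are left untouched
                    out = out " " ((f > 2 && fld[f] == "1") ? i : fld[f])
                print out
            }
    }' "$@"
}
```

It would be called as inc_ones 5 file. Because each field is checked on its own, rows such as "TCTA 15 TCTG 1 TCTA 1 1 1 1" are handled in a single pass, and no gawk-specific function is required.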
Using a modified input file to demonstrate the consecutive numeric column issue:
$ cat file
TCTA 3 TCTG 1 TCTA 1
TCTA 4 TCTG 1 TCTA 1
TCTA 5 TCTG 1 TCTA 1
TCTA 6 TCTG 1 TCTA 1
TCTA 7 TCTG 1 TCTA 1
TCTA 8 TCTG 1 TCTA 1
TCTA 9 TCTG 1 TCTA 1
TCTA 10 TCTG 1 TCTA 1
TCTA 11 TCTG 1 TCTA 1
TCTA 12 TCTG 1 TCTA 1
TCTA 13 TCTG 1 TCTA 1
TCTA 14 TCTG 1 TCTA 1 1 1
TCTA 15 TCTG 1 TCTA 1 1 1 1
This generates:
TCTA 3 TCTG 1 TCTA 1
TCTA 4 TCTG 1 TCTA 1
TCTA 5 TCTG 1 TCTA 1
TCTA 6 TCTG 1 TCTA 1
TCTA 7 TCTG 1 TCTA 1
TCTA 8 TCTG 1 TCTA 1
TCTA 9 TCTG 1 TCTA 1
TCTA 10 TCTG 1 TCTA 1
TCTA 11 TCTG 1 TCTA 1
TCTA 12 TCTG 1 TCTA 1
TCTA 13 TCTG 1 TCTA 1
TCTA 14 TCTG 1 TCTA 1 1 1
TCTA 15 TCTG 1 TCTA 1 1 1 1
TCTA 3 TCTG 2 TCTA 2
TCTA 4 TCTG 2 TCTA 2
TCTA 5 TCTG 2 TCTA 2
TCTA 6 TCTG 2 TCTA 2
TCTA 7 TCTG 2 TCTA 2
TCTA 8 TCTG 2 TCTA 2
TCTA 9 TCTG 2 TCTA 2
TCTA 10 TCTG 2 TCTA 2
TCTA 11 TCTG 2 TCTA 2
TCTA 12 TCTG 2 TCTA 2
TCTA 13 TCTG 2 TCTA 2
TCTA 14 TCTG 2 TCTA 2 2 2
TCTA 15 TCTG 2 TCTA 2 2 2 2
TCTA 3 TCTG 3 TCTA 3
TCTA 4 TCTG 3 TCTA 3
TCTA 5 TCTG 3 TCTA 3
TCTA 6 TCTG 3 TCTA 3
TCTA 7 TCTG 3 TCTA 3
TCTA 8 TCTG 3 TCTA 3
TCTA 9 TCTG 3 TCTA 3
TCTA 10 TCTG 3 TCTA 3
TCTA 11 TCTG 3 TCTA 3
TCTA 12 TCTG 3 TCTA 3
TCTA 13 TCTG 3 TCTA 3
TCTA 14 TCTG 3 TCTA 3 3 3
TCTA 15 TCTG 3 TCTA 3 3 3 3
TCTA 3 TCTG 4 TCTA 4
TCTA 4 TCTG 4 TCTA 4
TCTA 5 TCTG 4 TCTA 4
TCTA 6 TCTG 4 TCTA 4
TCTA 7 TCTG 4 TCTA 4
TCTA 8 TCTG 4 TCTA 4
TCTA 9 TCTG 4 TCTA 4
TCTA 10 TCTG 4 TCTA 4
TCTA 11 TCTG 4 TCTA 4
TCTA 12 TCTG 4 TCTA 4
TCTA 13 TCTG 4 TCTA 4
TCTA 14 TCTG 4 TCTA 4 4 4
TCTA 15 TCTG 4 TCTA 4 4 4 4
TCTA 3 TCTG 5 TCTA 5
TCTA 4 TCTG 5 TCTA 5
TCTA 5 TCTG 5 TCTA 5
TCTA 6 TCTG 5 TCTA 5
TCTA 7 TCTG 5 TCTA 5
TCTA 8 TCTG 5 TCTA 5
TCTA 9 TCTG 5 TCTA 5
TCTA 10 TCTG 5 TCTA 5
TCTA 11 TCTG 5 TCTA 5
TCTA 12 TCTG 5 TCTA 5
TCTA 13 TCTG 5 TCTA 5
TCTA 14 TCTG 5 TCTA 5 5 5
TCTA 15 TCTG 5 TCTA 5 5 5 5