I'm attempting to add 128 to the number in column 6 of every line of my file that has a number there. When I use
awk '{ $6+=128; print }1' below_zn.pdb
I am able to do this, but the formatting of my file changes. I need to keep the formatting. I have tried
awk -F'()' '{ $6+=128; print }' below_zn.pdb
but instead of adding 128 to the numbers in column 6, I get a new column at the far right that just contains 128 on every line. Is there a way I can use awk/sed/grep to add 128 to my numbers in column 6 while keeping the formatting as follows:
ATOM 1 ZN ZN2 H 1 -13.264 34.400 10.700 1.00 0.00 HETA
ATOM 2 ZN ZN2 H 2 -13.264 25.273 10.700 1.00 0.00 HETA
ATOM 3 ZN ZN2 H 3 -13.264 43.527 10.700 1.00 0.00 HETA
ATOM 4 ZN ZN2 H 4 -13.264 52.654 10.700 1.00 0.00 HETA
ATOM 5 ZN ZN2 H 5 -13.175 29.836 14.467 1.00 0.00 HETA
Thank you!
CodePudding user response:
Assumptions:
- input is using fixed-width spacing
- white space only shows up as a column delimiter (ie, no column values contain white space)
- the values in column 6 are left-justified
Adding a new row to demonstrate a wider value for column 6:
$ cat below_zn.pdb
ATOM 1 ZN ZN2 H 1 -13.264 34.400 10.700 1.00 0.00 HETA
ATOM 2 ZN ZN2 H 2 -13.264 25.273 10.700 1.00 0.00 HETA
ATOM 3 ZN ZN2 H 3 -13.264 43.527 10.700 1.00 0.00 HETA
ATOM 4 ZN ZN2 H 4 -13.264 52.654 10.700 1.00 0.00 HETA
ATOM 5 ZN ZN2 H 5 -13.175 29.836 14.467 1.00 0.00 HETA
BUBBLE 206 ZN ZN2 H 7000 -13.175 29.836 14.467 1.00 0.00 HETA-HETA
One awk idea:
awk '
BEGIN { regex1="^([^[:space:]]+[[:space:]]+){5}"        # match 1st 5 columns plus trailing white space
        regex2="[^[:space:]]+"                           # match non-white space characters (aka 6th column)
      }
      { oldline=$0
        match(oldline,regex1)                            # find 1st 5 columns
        newline=substr(oldline,1,RSTART+RLENGTH-1)       # save 1st 5 columns for new line
        oldline=substr(oldline,RSTART+RLENGTH)           # strip off 1st 5 columns
        match(oldline,regex2)                            # match 1st column of shortened line (aka 6th column of original line)
        newval=substr(oldline,1,RLENGTH)+128             # extract column and add 128
        newlen=length(newval)                            # get length of new value
        newline=newline newval substr(oldline,RSTART+newlen)  # append new value and rest of line to newline
        print newline                                    # print newline to stdout
      }
' below_zn.pdb
This generates:
ATOM 1 ZN ZN2 H 129 -13.264 34.400 10.700 1.00 0.00 HETA
ATOM 2 ZN ZN2 H 130 -13.264 25.273 10.700 1.00 0.00 HETA
ATOM 3 ZN ZN2 H 131 -13.264 43.527 10.700 1.00 0.00 HETA
ATOM 4 ZN ZN2 H 132 -13.264 52.654 10.700 1.00 0.00 HETA
ATOM 5 ZN ZN2 H 133 -13.175 29.836 14.467 1.00 0.00 HETA
BUBBLE 206 ZN ZN2 H 7128 -13.175 29.836 14.467 1.00 0.00 HETA-HETA
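If the values in column 6 turn out to be right-justified instead (the third assumption above does not hold), the new digits have to grow to the left, into the white space in front of the value. Here is a minimal sketch of that variant. It assumes default white-space field splitting, that the padding is made of spaces, and that only plain integers in field 6 should be changed. The function name add_to_col6 and its argument order are made up for illustration. It also avoids the {5} interval expression, which some older awk builds do not support.

```shell
# Hypothetical helper (name and interface are illustrative only).
# Usage: add_to_col6 N [file...]
add_to_col6() {
    n=$1; shift
    awk -v n="$n" '
    {
        # Leave lines alone unless field 6 is a plain integer.
        if (NF < 6 || $6 !~ /^-?[0-9]+$/) { print; next }

        head = ""; rest = $0
        # Copy fields 1-5, with the white space in front of each, verbatim.
        for (i = 1; i <= 5; i++) {
            match(rest, /^[ \t]*[^ \t]+/)
            head = head substr(rest, 1, RLENGTH)
            rest = substr(rest, RLENGTH + 1)
        }

        # rest now starts with the gap before field 6 followed by field 6.
        match(rest, /^[ \t]*[^ \t]+/)
        width = RLENGTH                     # gap plus old value
        tail  = substr(rest, RLENGTH + 1)   # everything after field 6, untouched

        newval = $6 + n
        # Keep at least one separating space if the new value outgrows the slot.
        if (length(newval) >= width) width = length(newval) + 1

        fmt = "%s%" width "s%s\n"           # right-justify newval in the slot
        printf fmt, head, newval, tail
    }' "$@"
}
```

Called as add_to_col6 128 below_zn.pdb, this never assigns to $6, so awk does not rebuild the record with single-space separators. The line length only changes when the new value no longer fits in the gap plus the old value.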
CodePudding user response:
You may try rq (https://github.com/fuyuncat/rquery/releases). Use a dynamic variable to compute the new value for column 6, and regreplace to substitute it into the raw line so the original format is kept.
[ rquery]$ ./rq -v "v:0:@6+128" -q "p d/ /r | s regreplace(@raw,'(^\S+[ ]+\S+[ ]+\S+[ ]+\S+[ ]+\S+[ ]+)(\S+)(.*)','\$1'+@v+'\$3')" samples/below_zn.pdb -m error
ATOM 1 ZN ZN2 H 129 -13.264 34.400 10.700 1.00 0.00 HETA
ATOM 2 ZN ZN2 H 130 -13.264 25.273 10.700 1.00 0.00 HETA
ATOM 3 ZN ZN2 H 131 -13.264 43.527 10.700 1.00 0.00 HETA
ATOM 4 ZN ZN2 H 132 -13.264 52.654 10.700 1.00 0.00 HETA
ATOM 5 ZN ZN2 H 133 -13.175 29.836 14.467 1.00 0.00 HETA