I have this array:
dihedrals=['na-2e-na-cd 4 1.200 180.000 2.000', 'Pd-2e-na-cd 4 1.200 180.000 2.000', 'Pd-2e-na-ca 4 1.200 180.000 2.000', 'Pd-4n-na-hn 4 4.800 0.000 2.000', 'na-4n-cc-cc 4 4.200 180.000 2.000', 'na-2e-na-ca 4 1.200 180.000 2.000', 'Pd-2e-na-ca 4 1.200 180.000 2.000', 'cc-4n-na-hn 4 4.800 0.000 2.000', 'Pd-4n-na-cd 4 4.800 0.000 2.000', 'Pd-2e-na-cc 4 1.200 180.000 2.000', 'X -4n-na-X 2 3.400 180.000 2.000', 'Pd-4n-cc-h4 4 4.200 180.000 2.000', 'Pd-4n-cc-cc 4 4.200 180.000 2.000', 'na-2e-na-cd 4 1.200 180.000 2.000', 'na-2e-na-cc 4 1.200 180.000 2.000', 'cc-4n-na-cd 4 4.800 0.000 2.000', 'na-2e-na-ca 4 1.200 180.000 2.000', 'Pd-2e-na-cc 4 1.200 180.000 2.000', 'na-2e-na-cc 4 1.200 180.000 2.000', 'Pd-2e-na-cd 4 1.200 180.000 2.000', 'na-4n-cc-h4 4 4.200 180.000 2.000']
and I want to write it in a file like this:
na-2e-na-cd 4 1.200 180.000 2.000
Pd-2e-na-cd 4 1.200 180.000 2.000
Pd-4n-na-hn 4 4.800 0.000 2.000
na-4n-cc-cc 4 4.200 180.000 2.000
na-2e-na-ca 4 1.200 180.000 2.000
cc-4n-na-hn 4 4.800 0.000 2.000
Pd-4n-na-cd 4 4.800 0.000 2.000
Pd-2e-na-cc 4 1.200 180.000 2.000
X -4n-na-X 2 3.400 180.000 2.000
Pd-4n-cc-h4 4 4.200 180.000 2.000
Pd-4n-cc-cc 4 4.200 180.000 2.000
na-2e-na-cc 4 1.200 180.000 2.000
cc-4n-na-cd 4 4.800 0.000 2.000
na-4n-cc-h4 4 4.200 180.000 2.000
I tried:
!awk '{print $1" "$2" "$3" "$4" "$5}' a.txt
But awk sees an extra field in this row: "X -4n-na-X " because there is a space next to X. I tried to change the field separator to two spaces with -F="[[:space:]][[:space:]] ":
import os
for x in range(len(dihedrals)):
    dihedrals[x]=os.popen('echo "{}" |awk -F="[[:space:]][[:space:]] " \'{{ printf "%0s %0s %0s %0s %0s",$1,$2,$3,$4,$5,$6}}\' '.format(dihedrals[x])).read()
    print(dihedrals[x])
But nothing changed.
I also tried printf %s:
import os
for x in range(len(dihedrals)):
    dihedrals[x]=os.popen('echo "{}"|awk \'{{printf "%0s %3s %8s s s",$1,$2,$3,$4,$5}}\' '.format(dihedrals[x])).read()
But again it didn't work. How can I write my variable to a file in the format shown above?
I also tried Python formatting, regex, etc., but I couldn't get it to work.
NOTE: I also tried column -t a.txt, but again I am in trouble with the row that has a space after X (X -4n-na-X). Here is the result:
na-2e-na-cd 4 1.200 180.000 2.000
Pd-2e-na-cd 4 1.200 180.000 2.000
Pd-2e-na-ca 4 1.200 180.000 2.000
Pd-4n-na-hn 4 4.800 0.000 2.000
na-4n-cc-cc 4 4.200 180.000 2.000
na-2e-na-ca 4 1.200 180.000 2.000
Pd-2e-na-ca 4 1.200 180.000 2.000
cc-4n-na-hn 4 4.800 0.000 2.000
Pd-4n-na-cd 4 4.800 0.000 2.000
Pd-2e-na-cc 4 1.200 180.000 2.000
X -4n-na-X 2 3.400 180.000 2.000
Pd-4n-cc-h4 4 4.200 180.000 2.000
Pd-4n-cc-cc 4 4.200 180.000 2.000
na-2e-na-cd 4 1.200 180.000 2.000
na-2e-na-cc 4 1.200 180.000 2.000
cc-4n-na-cd 4 4.800 0.000 2.000
na-2e-na-ca 4 1.200 180.000 2.000
Pd-2e-na-cc 4 1.200 180.000 2.000
na-2e-na-cc 4 1.200 180.000 2.000
Pd-2e-na-cd 4 1.200 180.000 2.000
CodePudding user response:
You can use formatted output in Python for this array. We just need to split each line on runs of two or more spaces to get the individual fields.
import re
# fields are separated by runs of two or more spaces; the first field may itself contain a single space
dihedrals=['na-2e-na-cd  4  1.200  180.000  2.000', 'Pd-2e-na-cd  4  1.200  180.000  2.000', 'Pd-2e-na-ca  4  1.200  180.000  2.000', 'Pd-4n-na-hn  4  4.800  0.000  2.000', 'na-4n-cc-cc  4  4.200  180.000  2.000', 'na-2e-na-ca  4  1.200  180.000  2.000', 'Pd-2e-na-ca  4  1.200  180.000  2.000', 'cc-4n-na-hn  4  4.800  0.000  2.000', 'Pd-4n-na-cd  4  4.800  0.000  2.000', 'Pd-2e-na-cc  4  1.200  180.000  2.000', 'X -4n-na-X   2  3.400  180.000  2.000', 'Pd-4n-cc-h4  4  4.200  180.000  2.000', 'Pd-4n-cc-cc  4  4.200  180.000  2.000', 'na-2e-na-cd  4  1.200  180.000  2.000', 'na-2e-na-cc  4  1.200  180.000  2.000', 'cc-4n-na-cd  4  4.800  0.000  2.000', 'na-2e-na-ca  4  1.200  180.000  2.000', 'Pd-2e-na-cc  4  1.200  180.000  2.000', 'na-2e-na-cc  4  1.200  180.000  2.000', 'Pd-2e-na-cd  4  1.200  180.000  2.000', 'na-4n-cc-h4  4  4.200  180.000  2.000']
for i in dihedrals:
    a = re.split(' {2,}', i)
    print( "%-11s %2s %8s %8s %8s" % (a[0], a[1], a[2], a[3], a[4]) )
Output:
na-2e-na-cd 4 1.200 180.000 2.000
Pd-2e-na-cd 4 1.200 180.000 2.000
Pd-2e-na-ca 4 1.200 180.000 2.000
Pd-4n-na-hn 4 4.800 0.000 2.000
na-4n-cc-cc 4 4.200 180.000 2.000
na-2e-na-ca 4 1.200 180.000 2.000
Pd-2e-na-ca 4 1.200 180.000 2.000
cc-4n-na-hn 4 4.800 0.000 2.000
Pd-4n-na-cd 4 4.800 0.000 2.000
Pd-2e-na-cc 4 1.200 180.000 2.000
X -4n-na-X 2 3.400 180.000 2.000
Pd-4n-cc-h4 4 4.200 180.000 2.000
Pd-4n-cc-cc 4 4.200 180.000 2.000
na-2e-na-cd 4 1.200 180.000 2.000
na-2e-na-cc 4 1.200 180.000 2.000
cc-4n-na-cd 4 4.800 0.000 2.000
na-2e-na-ca 4 1.200 180.000 2.000
Pd-2e-na-cc 4 1.200 180.000 2.000
na-2e-na-cc 4 1.200 180.000 2.000
Pd-2e-na-cd 4 1.200 180.000 2.000
na-4n-cc-h4 4 4.200 180.000 2.000
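Since the goal in the question was to write the rows to a file rather than print them, the same split-and-format idea can be sketched as follows. This is only a sketch: the helper name format_row, the column widths, and the output file name dihedrals.txt are assumptions chosen for illustration, and the sample rows assume the fields are separated by two or more spaces.

```python
import re

def format_row(line):
    """Split on runs of 2+ spaces and pad each field to a fixed width."""
    name, count, *values = re.split(r" {2,}", line.strip())
    # Left-justify the name (it may contain a single space), right-justify the rest.
    return f"{name:<11} {count:>2} " + " ".join(f"{v:>8}" for v in values)

# Sample rows; the double-space separators are an assumption about the input.
rows = [
    "na-2e-na-cd  4  1.200  180.000  2.000",
    "X -4n-na-X   2  3.400  180.000  2.000",
]

# Hypothetical output file name.
with open("dihedrals.txt", "w") as fp:
    for row in rows:
        fp.write(format_row(row) + "\n")
```

Splitting on two or more spaces keeps "X -4n-na-X" together as one field, which is what plain whitespace splitting (awk's default, or column -t) cannot do.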
A GNU awk solution would be:
... |
awk -F ' {2,}' -v RS=', *|\\]' '
gsub(/dihedrals=\[|\047/, "") {
    printf("%-11s %2s %8s %8s %8s\n", $1, $2, $3, $4, $5)
}'
CodePudding user response:
Assumptions:
- 1st column always consists of 11 characters
I don't use python so I'll simulate the behavior (python making repeated calls out to awk) with a bash array and a bash/for loop that calls awk:
Setup:
declare -a dihedrals=([0]="na-2e-na-cd 4 1.200 180.000 2.000" [1]="Pd-2e-na-cd 4 1.200 180.000 2.000" [2]="Pd-2e-na-ca 4 1.200 180.000 2.000" [3]="Pd-4n-na-hn 4 4.800 0.000 2.000" [4]="na-4n-cc-cc 4 4.200 180.000 2.000" [5]="na-2e-na-ca 4 1.200 180.000 2.000" [6]="Pd-2e-na-ca 4 1.200 180.000 2.000" [7]="cc-4n-na-hn 4 4.800 0.000 2.000" [8]="Pd-4n-na-cd 4 4.800 0.000 2.000" [9]="Pd-2e-na-cc 4 1.200 180.000 2.000" [10]="X -4n-na-X 2 3.400 180.000 2.000" [11]="Pd-4n-cc-h4 4 4.200 180.000 2.000" [12]="Pd-4n-cc-cc 4 4.200 180.000 2.000" [13]="na-2e-na-cd 4 1.200 180.000 2.000" [14]="na-2e-na-cc 4 1.200 180.000 2.000" [15]="cc-4n-na-cd 4 4.800 0.000 2.000" [16]="na-2e-na-ca 4 1.200 180.000 2.000" [17]="Pd-2e-na-cc 4 1.200 180.000 2.000" [18]="na-2e-na-cc 4 1.200 180.000 2.000" [19]="Pd-2e-na-cd 4 1.200 180.000 2.000" [20]="na-4n-cc-h4 4 4.200 180.000 2.000")
Proposed code:
for x in "${dihedrals[@]}"
do
    awk '{ f1=substr($0,1,11)
           split(substr($0,12),a)
           printf "%s %2s %7s %7s %7s\n",f1,a[1],a[2],a[3],a[4]}' <<< "${x}"
done
This generates:
na-2e-na-cd 4 1.200 180.000 2.000
Pd-2e-na-cd 4 1.200 180.000 2.000
Pd-2e-na-ca 4 1.200 180.000 2.000
Pd-4n-na-hn 4 4.800 0.000 2.000
na-4n-cc-cc 4 4.200 180.000 2.000
na-2e-na-ca 4 1.200 180.000 2.000
Pd-2e-na-ca 4 1.200 180.000 2.000
cc-4n-na-hn 4 4.800 0.000 2.000
Pd-4n-na-cd 4 4.800 0.000 2.000
Pd-2e-na-cc 4 1.200 180.000 2.000
X -4n-na-X 2 3.400 180.000 2.000
Pd-4n-cc-h4 4 4.200 180.000 2.000
Pd-4n-cc-cc 4 4.200 180.000 2.000
na-2e-na-cd 4 1.200 180.000 2.000
na-2e-na-cc 4 1.200 180.000 2.000
cc-4n-na-cd 4 4.800 0.000 2.000
na-2e-na-ca 4 1.200 180.000 2.000
Pd-2e-na-cc 4 1.200 180.000 2.000
na-2e-na-cc 4 1.200 180.000 2.000
Pd-2e-na-cd 4 1.200 180.000 2.000
na-4n-cc-h4 4 4.200 180.000 2.000
From a performance perspective I'd think the same (awk) logic should be doable within python, thus eliminating the need for the repeated calls out to awk ... ???
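For what it's worth, the fixed-width logic above might translate to Python roughly like this. It is a sketch under the same assumption (the first column always occupies 11 characters); the function name format_fixed and the column widths are made up for illustration.

```python
def format_fixed(line, name_width=11):
    """Mimic the awk logic: fixed-width first column, whitespace-split remainder."""
    name = line[:name_width]            # like substr($0, 1, 11)
    fields = line[name_width:].split()  # like split(substr($0, 12), a)
    head = f"{name:<{name_width}} {fields[0]:>2} "
    return head + " ".join(f"{v:>7}" for v in fields[1:])

# Sample rows; the first column is assumed to be exactly 11 characters wide.
dihedrals = [
    "na-2e-na-cd 4 1.200 180.000 2.000",
    "X -4n-na-X  2 3.400 180.000 2.000",
]
for line in dihedrals:
    print(format_fixed(line))
```

Because the first column is taken by position rather than by splitting, the space inside "X -4n-na-X" never gets treated as a field separator, and no external process is started per row.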
CodePudding user response:
Do you have to use awk? It seems like something along these lines would accomplish the same goal in plain Python:
with open('a.txt', 'w') as fp:
    fp.write('\n'.join(dihedrals))
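The desired output in the question lists each row only once, while the array contains repeats. If dropping the duplicates is intended (that is an assumption on my part), something like the following sketch keeps the first occurrence of each row and also ends the file with a newline. The three sample rows are just a shortened stand-in for the real array.

```python
# Shortened sample of the array; the third row repeats the first.
dihedrals = [
    "na-2e-na-cd 4 1.200 180.000 2.000",
    "Pd-2e-na-cd 4 1.200 180.000 2.000",
    "na-2e-na-cd 4 1.200 180.000 2.000",
]

# dict.fromkeys keeps the first occurrence of each row, in order (Python 3.7+).
unique_rows = list(dict.fromkeys(dihedrals))

with open("a.txt", "w") as fp:
    fp.writelines(row + "\n" for row in unique_rows)
```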