How can I use Bash to pull a series of strings from one file and replace a separate series of string

Time:04-21

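I have two files, file1.txt and file2.txt, which both contain a set of coordinates among other information. I want to pull the coordinates out of file1.txt and use them to replace the coordinates in file2.txt. This is what I have so far: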

new_coords=$(sed -n '/Begin/,/End/{//b;p}' file1.txt)
new_coords=$(echo "${new_coords//ATOMIC_POSITIONS (angstrom)}")
old_coords=$(sed -n '/ATOMIC_POSITIONS/,/K_POINTS/{//b;p}' file2.txt)

sed -i 's|$old_coords|$new_coords|g' file2.txt

I figured the fastest way to get at the coordinates would be to use the nearest lines of text above and below them as delimiters. The only issue with the "new_coords" variable is that it picks up an "ATOMIC_POSITIONS" label, which is repeated throughout the file until the program reaches the final atomic coordinates at the very end, so I remove it with the echo statement in line 2.

When I echo the variables new_coords and old_coords, I seem to get the correct output for each, but neither sed nor perl seems to work for the final line of code.

How would one do this sort of multi-line string replacement in Bash? Is there some small piece of code or formatting I'm missing?

Example of file1.txt:

...
     bfgs converged in   9 scf cycles and   8 bfgs steps
     (criteria: energy <  6.0E-05 Ry, force <  1.0E-04 Ry/Bohr)

     End of BFGS Geometry Optimization

     Final energy   =   -1343.8825757257 Ry
Begin final coordinates

ATOMIC_POSITIONS (angstrom)
Fe            1.0730540812        3.7648438571        1.4484500000
Fe            3.2976459188        0.6816561429        1.4484500000
Fe            3.2584040812        2.9049061429        0.0000000000
Fe            1.1122959188        1.5415938571        0.0000000000
C             2.1853500000        2.2232500000        1.4484500000
C             0.0000000000        0.0000000000       -0.0000000000
End final coordinates



     Writing output data file ./out/100Co2C.save/

     init_run     :    486.03s CPU    493.55s WALL (       1 calls)
...

Example of file2.txt:

...
ATOMIC_SPECIES
C      12.0107 C.pbesol-n-kjpaw_psl.1.0.0.UPF
Co     58.933195 co_pbesol_v1.2.uspp.F.UPF
Fe     55.845 Fe.pbesol-spn-kjpaw_psl.0.2.1.UPF
ATOMIC_POSITIONS angstrom
Co            1.0085465598        3.7287218832        1.4484500000
Co            3.3775861828        0.7084455291        1.4484500000
Fe            3.2243420022        2.9272726906        0.0000000000
Co            1.1549449803        1.5394425244        0.0000000000
C             2.1517221305        2.1838545768        1.4484500000
C             0.0096081444        0.0285127960        0.0000000000
K_POINTS crystal
388
    0.0000000000     0.0000000000     0.0000000000 1
...

CodePudding user response:

You can use the following, which prints the updated file2 to standard output:

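# Lines between "Begin" and "End", without the marker lines themselves
new_coords=$(sed -n '/Begin/,/End/{//b;p}' file1)
# Drop the ATOMIC_POSITIONS header and the blank lines left behind
new_coords=$(echo "${new_coords//ATOMIC_POSITIONS (angstrom)}" | sed '/^$/d')
# Current coordinate block in file2, again without the marker lines
old_coords=$(sed -n '/ATOMIC_POSITIONS/,/K_POINTS/{//b;p}' file2)

# Turn arbitrary text into a regex that matches it literally (newlines become \n)
quoteRe() { sed -e 's/[^^]/[&]/g; s/\^/\\^/g; $!a\'$'\n''\\n' <<< "$1" | tr -d '\n'; }
# Escape arbitrary text for use as the replacement part of an s command
quoteSubst() {
  IFS= read -d '' -r < <(sed -e ':a' -e '$!{N;ba' -e '}' -e 's/[&/\]/\\&/g; s/\n/\\&/g' <<<"$1")
  printf %s "${REPLY%$'\n'}"
}

# Read all of file2 into the pattern space, then swap the old block for the new one
sed -e ':a' -e '$!{N;ba' -e '}' -e "s/$(quoteRe "$old_coords")/$(quoteSubst "$new_coords")/" file2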
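
For context, the final line in the question appears to fail for two separate reasons: the single quotes stop the shell from expanding $old_coords and $new_coords, and sed normally works on one line at a time, so a pattern that spans several lines cannot match. The sketch below tries to isolate both effects. The three-line sample string is made up for illustration and only stands in for the coordinate block.

```shell
#!/usr/bin/env bash
# Hypothetical three-line sample standing in for the coordinate block.
sample=$'alpha\nbeta\ngamma'
old='beta'

# Single quotes: sed receives the literal text $old, so nothing is replaced.
single=$(sed 's|$old|X|' <<< "$sample")

# Double quotes: the shell expands $old before sed sees the script.
double=$(sed "s|$old|X|" <<< "$sample")

# Line-by-line processing: a pattern spanning two lines never matches ...
per_line=$(sed 's|alpha\nbeta|X|' <<< "$sample")

# ... unless the whole input is first joined into the pattern space.
slurped=$(sed -e ':a' -e '$!{N;ba' -e '}' -e 's|alpha\nbeta|X|' <<< "$sample")

printf '%s\n' "$single" '---' "$double" '---' "$per_line" '---' "$slurped"
```

This is why the command above both uses double quotes around the s command and starts with the ':a' / '$!{N;ba' / '}' loop, which appends every remaining line to the pattern space before the substitution runs.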

See the online demo, where the contents of the two files are held in the variables file1 and file2:

new_coords=$(sed -n '/Begin/,/End/{//b;p}' <<< "$file1")
new_coords=$(echo "${new_coords//ATOMIC_POSITIONS (angstrom)}" | sed '/^$/d')
old_coords=$(sed -n '/ATOMIC_POSITIONS/,/K_POINTS/{//b;p}' <<< "$file2")

quoteRe() { sed -e 's/[^^]/[&]/g; s/\^/\\^/g; $!a\'$'\n''\\n' <<<"$1" | tr -d '\n'; }
quoteSubst() {
  IFS= read -d '' -r < <(sed -e ':a' -e '$!{N;ba' -e '}' -e 's/[&/\]/\\&/g; s/\n/\\&/g' <<<"$1")
  printf %s "${REPLY%$'\n'}"
}

sed -e ':a' -e '$!{N;ba' -e '}' -e "s/$(quoteRe "$old_coords")/$(quoteSubst "$new_coords")/" <<< "$file2"

Output:

...
ATOMIC_SPECIES
C      12.0107 C.pbesol-n-kjpaw_psl.1.0.0.UPF
Co     58.933195 co_pbesol_v1.2.uspp.F.UPF
Fe     55.845 Fe.pbesol-spn-kjpaw_psl.0.2.1.UPF
ATOMIC_POSITIONS angstrom
Fe            1.0730540812        3.7648438571        1.4484500000
Fe            3.2976459188        0.6816561429        1.4484500000
Fe            3.2584040812        2.9049061429        0.0000000000
Fe            1.1122959188        1.5415938571        0.0000000000
C             2.1853500000        2.2232500000        1.4484500000
C             0.0000000000        0.0000000000       -0.0000000000
K_POINTS crystal
388
    0.0000000000     0.0000000000     0.0000000000 1
...
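
Because the block in file2 is already delimited by the ATOMIC_POSITIONS and K_POINTS lines, another option is to replace it by position rather than by content, which avoids regex escaping altogether. The following is only a sketch of that alternative and is not part of the answer above. The helper names extract_final_coords and replace_block are hypothetical, and the sketch assumes the marker lines start in the first column, as they do in the examples.

```shell
#!/usr/bin/env bash

# extract_final_coords FILE   (hypothetical helper)
# Prints the lines between "Begin final coordinates" and
# "End final coordinates", skipping blank lines and the
# ATOMIC_POSITIONS header.
extract_final_coords() {
  awk '
    /^End final coordinates/             { inside = 0 }
    inside && NF && !/^ATOMIC_POSITIONS/ { print }
    /^Begin final coordinates/           { inside = 1 }
  ' "$1"
}

# replace_block NEW_COORDS FILE   (hypothetical helper)
# Prints FILE with everything between the ATOMIC_POSITIONS and
# K_POINTS lines replaced by NEW_COORDS. Both marker lines are kept.
replace_block() {
  NEW_COORDS=$1 awk '
    /^K_POINTS/         { skip = 0 }   # end of the old block: resume copying
    !skip               { print }      # copy every line outside the old block
    /^ATOMIC_POSITIONS/ { print ENVIRON["NEW_COORDS"]; skip = 1 }
  ' "$2"
}

# Usage with the file names from the question (writes a new file
# instead of editing file2.txt in place):
#   replace_block "$(extract_final_coords file1.txt)" file2.txt > file2.new
```

Passing the new coordinates through the environment rather than splicing them into the awk program keeps them as plain data, so characters such as dots or slashes need no special treatment.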

See also the related question "Is it possible to escape regex metacharacters reliably with sed".
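
To illustrate why the escaping matters for data like these coordinates, here is a small sketch in which an unescaped dot matches more than intended. The quoteRe function is copied unchanged from the answer; the two-line input and the needle value are made up for illustration.

```shell
#!/usr/bin/env bash
# quoteRe is copied unchanged from the answer above.
quoteRe() { sed -e 's/[^^]/[&]/g; s/\^/\\^/g; $!a\'$'\n''\\n' <<< "$1" | tr -d '\n'; }

# Hypothetical data: only the first line is the value we are looking for.
lines=$'1.5\n1x5'
needle='1.5'

# Unescaped, the dot matches any character, so both lines count as hits.
raw_hits=$(grep -c "$needle" <<< "$lines")

# Escaped, each character sits in its own bracket expression and is literal.
quoted=$(quoteRe "$needle")
safe_hits=$(grep -c "$quoted" <<< "$lines")

printf 'pattern: %s  raw hits: %s  escaped hits: %s\n' "$quoted" "$raw_hits" "$safe_hits"
```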
