I have 10,000 files (molecule1.pdbqt ... molecule10000.pdbqt). Only some of them contain a second occurrence of the keyword TORSDOF. For a given file, I want to remove the line containing the second occurrence of TORSDOF, if there is one, together with all lines following it, while preserving the file names. Can somebody please provide a sample snippet, if possible without loop(s)? Thank you.
$ cat inputExample.txt
ashu
vishu
jyoti
TORSDOF
Jatin
Vishal
Shivani
TORSDOF
Sushil
Kiran
$ cat outputExample.txt
ashu
vishu
jyoti
TORSDOF
Jatin
Vishal
Shivani
CodePudding user response:
You can use awk for this:
$ awk '/TORSDOF/ && c++ > 0 {exit} 1' inputExample.txt
ashu
vishu
jyoti
TORSDOF
Jatin
Vishal
Shivani
Based on exactly the same question outside SO.
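The question also asks for this to happen in place on every file, without an explicit loop. A minimal sketch of one way to do that, assuming a POSIX awk and find are available: find runs a small sh command once per matching file, which writes the truncated copy to a temporary file and then moves it over the original. The function name truncate_all and the .tmp suffix are hypothetical, chosen only for this illustration.

```shell
#!/bin/sh
# truncate_all DIR  (hypothetical helper name, not from the thread)
# For every molecule*.pdbqt under DIR, keep only the lines before the
# second line containing TORSDOF. Files with fewer than two matches
# are rewritten unchanged.
truncate_all() {
    find "$1" -type f -name 'molecule*.pdbqt' -exec sh -c '
        # awk stops at the second match; "1" prints every earlier line.
        awk "/TORSDOF/ && ++c == 2 { exit } 1" "$1" > "$1.tmp" &&
            mv "$1.tmp" "$1"
    ' sh {} \;
}
```

Calling truncate_all . would process the current directory and its subdirectories. Since mv replaces each file with a new one, it is probably worth trying this on a copy of the data first.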
CodePudding user response:
This could be done as
cat input.txt | tr '\n' '|' | sed 's/TORSDOF|/\n/2; s/\n.*//' | tr '|' '\n' > output.txt
cat input.txt
to print the file content
tr '\n' '|'
to form a single-line string
sed 's/TORSDOF|/\n/2; s/\n.*//'
to replace the second occurrence of the keyword with a newline marker, then delete everything from that marker to the end of the string (GNU sed)
tr '|' '\n'
to split the long string back into a multi-line file
> output.txt
to write the output file