Search 2 consecutive lines and then look for next pattern before printing lines in between

Time:02-04

I have a few hundred files in the following format:

File1:

##########
##TestB 
##########
y-81=9
Test B is for another test
** Few Thousands of lines**
##########
##########
##TestA ##
##########
Test A is for testing correctiveness of the code
x-10=92
** Few thousands of lines***
###########
###########
##TestZ
###########
M=1239
N=132
X=0x824
***Few Thousands of lines**
############

File2:

##########
##TestB 
##########
y-81=9
Test B is for another test
** Few Thousands of lines**
##########
###########
##TestZ
###########
M=1239
N=132
X=0x824
***Few Thousands of lines**
############
##########
##TestA ##
##########
Test A is for testing correctiveness of the code
x-10=92
** Few thousands of lines***
###########

File3:

##########
##TestA ##
##########
Test A is for testing correctiveness of the code
x-10=92
** Few thousands of lines***
###########    
##TestB 
##########
y-81=9
Test B is for another test
** Few Thousands of lines**
##########
###########
##TestZ
###########
M=1239
N=132
X=0x824
***Few Thousands of lines**
############

I am trying to comment out the lines inside the ##TestA ## block in each file (by adding ## in front of them) and then add an additional line at the end of the block, as shown below:

File1:

##########
##TestB 
##########
y-81=9
Test B is for another test
** Few Thousands of lines**
##########
##########
##TestA ##
##########
##Test A is for testing correctiveness of the code
##x-10=92
##** Few thousands of lines***
This is the new line we added
###########
###########
##TestZ
###########
M=1239
N=132
X=0x824
***Few Thousands of lines**
############

File2:

##########
##TestB 
##########
y-81=9
Test B is for another test
** Few Thousands of lines**
##########
###########
##TestZ
###########
M=1239
N=132
X=0x824
***Few Thousands of lines**
############
##########
##TestA ##
##########
##Test A is for testing correctiveness of the code
##x-10=92
##** Few thousands of lines***
This is the new line we added
###########

File3:

##########
##TestA ##
##########
##Test A is for testing correctiveness of the code
##x-10=92
##** Few thousands of lines***
This is the new line we added
###########    
##TestB 
##########
y-81=9
Test B is for another test
** Few Thousands of lines**
##########
###########
##TestZ
###########
M=1239
N=132
X=0x824
***Few Thousands of lines**
############

There are approximately 1000 lines in each block. I am trying to find out how to match 2 consecutive lines (i.e. ##TestA ## followed by a ########## line). Once this pattern is found, the following lines should be stored in a buffer until the end of the block (i.e. the next ########## line). Then ## should be added in front of each line in the buffer, and a new line should be added at the end (i.e. "This is the new line we added").

I have tried the following:

sed -n '/##TestA*/{N;/##/{/##/n;p}}' file.txt

However, this only prints the first line from the block.

CodePudding user response:

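Here is an approach using awk. The loop is only an example of how to iterate over the files; replace it with your preferred method. Note that this version only inserts the new line before the end of the TestA block; it does not add ## in front of the block's lines.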

% for i in file[123];do 
    echo -e "Working on $i\n"
    awk '/^#.*TestA/{x++}
         /^##########/ && x > 0{x++} 
         x==3{print "This is the new line we added <---------"; x++}1' "$i"
    echo -e "DONE\n"
  done
Working on f1

##########
##TestB
##########
y-81=9
Test B is for another test
** Few Thousands of lines**
##########
##########
##TestA ##
##########
Test A is for testing correctiveness of the code
x-10=92
** Few thousands of lines***
This is the new line we added <---------
###########
###########
##TestZ
###########
M=1239
N=132
X=0x824
***Few Thousands of lines**
############
DONE

Working on f2

##########
##TestB
##########
y-81=9
Test B is for another test
** Few Thousands of lines**
##########
###########
##TestZ
###########
M=1239
N=132
X=0x824
***Few Thousands of lines**
############
##########
##TestA ##
##########
Test A is for testing correctiveness of the code
x-10=92
** Few thousands of lines***
This is the new line we added <---------
###########
DONE

Working on f3

##########
##TestA ##
##########
Test A is for testing correctiveness of the code
x-10=92
** Few thousands of lines***
This is the new line we added <---------
###########
##TestB
##########
y-81=9
Test B is for another test
** Few Thousands of lines**
##########
###########
##TestZ
###########
M=1239
N=132
X=0x824
***Few Thousands of lines**
############
DONE
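
The counter idea above can probably be extended to also comment out the body of the block, which is what the question asks for. The following is only a sketch under a few assumptions: each file contains exactly one TestA block, no body line starts with ten or more # characters, and add_line_and_comment is a hypothetical helper name that is not part of the answer above.

```shell
# Hypothetical helper: prints the file given as $1 with the TestA body
# prefixed by ## and the extra line inserted before the closing row of #.
add_line_and_comment() {
    awk '
        /^#.*TestA/              { x++ }   # x becomes 1 on the TestA header
        /^##########/ && x > 0   { x++ }   # x becomes 2 on the opening row, 3 on the closing row
        x == 3                   { print "This is the new line we added"; x++ }
        x == 2 && !/^##########/ { print "##" $0; next }   # body line of the TestA block
        1                                  # every other line is printed unchanged
    ' "$1"
}
```

Calling add_line_and_comment File1 writes the modified text to standard output, so the original file is not touched. Because x keeps growing after the block has been closed, a second TestA block in the same file would not be handled by this sketch.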

CodePudding user response:

Let so.bash be:

#!/bin/bash

awk -f so.awk input.txt

Where input.txt is a file you want to process.

File so.awk is:

BEGIN { ina = 0 }
/##TestA/ {
    ina = 1
    print $0
    getline
    print $0
    next
}
/##########/ && (ina == 1) {
    print "This is the new line we added"
    ina = 0
}
{
    if (ina == 1)
        print "##" $0
    else
        print $0
}

  • BEGIN ...: sets variable ina to 0. Variable ina records whether we are "inside" a "##TestA" block. The block ends when the second "##########" line is seen. The "##########" line right after "##TestA" does not mark the end of the "##TestA" block; it is part of the block header.

  • /##TestA/: when this pattern is seen, variable ina is set to 1 and the line is printed. getline then reads the next line right away (that line is always "##########") and it is printed as well. next skips the rest of the script for this line.

  • /##########/ && (ina == 1): when a "##########" line is seen while we are inside a "##TestA" block, the new line is printed and ina is set to 0 (we are no longer in a "##TestA" block). The "##########" line itself is then printed by the last action.

  • The last action checks whether we are in a "##TestA" block. If so, it prints the line with "##" in front of it, as in the expected output of the question. If not, it prints the line as is.
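
The script above relies on the line after "##TestA" always being a row of # characters. If that cannot be guaranteed, the two consecutive lines can be checked explicitly with a small state variable instead of getline. This is only a sketch: comment_testa_block and is_sep are hypothetical names, and it assumes a separator is a line that consists only of # characters, optionally followed by blanks.

```shell
# Hypothetical helper: prints the file given as $1 with every properly
# opened TestA block commented out and the extra line appended to it.
comment_testa_block() {
    awk '
        # A separator is a line made only of # characters (trailing blanks allowed).
        function is_sep(s) { return s ~ /^#+[ \t]*$/ }

        state == "header" {
            # Line right after ##TestA: only a separator opens the block.
            state = is_sep($0) ? "body" : ""
            print; next
        }
        state == "body" && is_sep($0) {
            # Closing separator: emit the extra line first, then the separator.
            print "This is the new line we added"
            state = ""
            print; next
        }
        state == "body" { print "##" $0; next }   # comment out a body line
        /^##TestA/      { state = "header" }      # remember that the header was seen
        { print }
    ' "$1"
}
```

Unlike a single counter, the state is reset after each block, so a file with several TestA blocks should be handled as well. A "##TestA" line that is not followed by a separator is simply printed and ignored.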

The so.bash script can be modified to loop over your many input files.
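
One possible way to do that loop is sketched below. It is not part of the original answer: process_all is a hypothetical helper name, and the file names in the example call are placeholders. Each file is written to a temporary file first and only overwritten when awk succeeds, because redirecting awk's output straight back into its own input file would truncate the file before it is read.

```shell
# Hypothetical helper: process_all SCRIPT FILE...
# Runs the awk script on every file and replaces the file with the result.
process_all() {
    script=$1
    shift
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        if awk -f "$script" "$f" > "$tmp"; then
            # Copy the result back with cat so the file keeps its permissions.
            cat "$tmp" > "$f"
        else
            echo "awk failed on $f; file left unchanged" >&2
        fi
        rm -f "$tmp"
    done
}

# Example call (placeholder file names):
# process_all so.awk File1 File2 File3
```

It is probably wise to try this on copies of a few files before running it on all of them, since the originals are overwritten.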
