Match patterns in files and cut/delete lines in between

Time:01-13

I have a text (input.txt) file containing the following pattern:

#########
##### AV1 
#########
great picture
good tv
decent sound
vibrant color
#########
#########
#### AV2 new TV
#########
we are testing this out now
this is 4K
need HDMI
#########
#########
### AV3 not working
#########
not enough ports
buy new device
#########

What I need is the following output, split into 3 separate files:

File 1.txt

#########
##### AV1 
#########
great picture
good tv
decent sound
vibrant color
#########

File 2.txt

#########
#### AV2 new TV
#########
we are testing this out now
this is 4K
need HDMI
#########

File 3.txt

#########
### AV3 not working
#########
not enough ports
buy new device
#########

I am not sure how to match the pattern across 3 consecutive lines, skip over the text in between, and find the closing ######### line. Then I need to write each such block to a new file.

I am able to "greedily" match the first 3 lines with the following regex:

/^#*\_^#*\_^#.*$

CodePudding user response:

You may use this awk:

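awk '
/^#{7,}/ {                      # delimiter line: 7 or more leading hashes
   ++n                          # count delimiter lines; each block has 3
   if (n%3 == 1)                # first delimiter of a block
      fn = "file" ++c ".txt"    # switch to the next output file
}
{
   print > fn                   # every line goes to the current file
}' file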
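
If your awk does not accept interval expressions such as {7,} (some older builds do not), the same idea can be sketched with the nine hashes spelled out. This assumes every delimiter line is exactly nine # characters and that each section contains three of them. The names sample_input.txt and file1.txt, file2.txt, ... and the close() call are additions for illustration, not part of the answer above.

```shell
# Build a small sample in the same shape as the input.txt from the question.
cat > sample_input.txt <<'EOF'
#########
##### AV1
#########
great picture
good tv
decent sound
vibrant color
#########
#########
#### AV2 new TV
#########
we are testing this out now
this is 4K
need HDMI
#########
#########
### AV3 not working
#########
not enough ports
buy new device
#########
EOF

awk '
/^#########$/ {                  # a line of exactly nine hashes
   if (++n % 3 == 1) {           # first delimiter of a section
      if (fn != "") close(fn)    # release the previous output file
      fn = "file" (++c) ".txt"   # file1.txt, file2.txt, ...
   }
}
{ print > fn }                   # write the line to the current file
' sample_input.txt
```

The close() call keeps only one output file open at a time, which starts to matter once the input has many sections.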

CodePudding user response:

If your requirement can be formulated as "split after the first line of each pair of consecutive lines with 9 hashes", you can use:

csplit -b "%d"  --suppress-matched -f "File " \
  <(sed -rz 's/(#{9})\n\1/\1\n\n\1/g' input.txt) /^$/ '{*}'

Explanation:
-b "%d": number the output files without leading zeros (csplit starts counting at 0)
--suppress-matched: drop the inserted empty lines from the output
-f "File ": prefix of the created file names
<(...): process substitution; use the output of the command as if it were a file
sed -rz 's/(#{9})\n\1/\1\n\n\1/g' input.txt: insert an empty line between two consecutive lines of 9 #
/^$/: match an empty line
'{*}': repeat the csplit pattern for the complete file
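
To see what the sed step hands to csplit, you can run it on its own. This is a minimal sketch, assuming GNU sed (needed for -z) and a made-up two-section file; two_sections.txt and marked.txt are scratch names chosen for this example.

```shell
# A made-up input with two sections, in the same shape as input.txt.
cat > two_sections.txt <<'EOF'
#########
##### AV1
#########
good tv
#########
#########
#### AV2 new TV
#########
need HDMI
#########
EOF

# -z makes sed treat the whole file as one record, so \n can be matched.
# Two consecutive 9-hash lines get an empty line inserted between them.
sed -rz 's/(#{9})\n\1/\1\n\n\1/g' two_sections.txt > marked.txt

cat marked.txt
```

The only change is one empty line between the two sections. That is the line csplit splits on, and --suppress-matched then removes it from the output.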
