I have a text file (input.txt) containing the following pattern:
#########
##### AV1
#########
great picture
good tv
decent sound
vibrant color
#########
#########
#### AV2 new TV
#########
we are testing this out now
this is 4K
need HDMI
#########
#########
### AV3 not working
#########
not enough ports
buy new device
#########
What I need is the following output as 3 separate files:
File 1.txt
#########
##### AV1
#########
great picture
good tv
decent sound
vibrant color
#########
File 2.txt
#########
#### AV2 new TV
#########
we are testing this out now
this is 4K
need HDMI
#########
File 3.txt
#########
### AV3 not working
#########
not enough ports
buy new device
#########
I am not sure how to match a pattern that spans 3 consecutive lines, skip the text that follows, and then find the closing ######### line. I then need to write everything from the opening line through the closing line to a new file.
I am able to "greedily" match the first 3 lines with the following regex:
/^#*\_^#*\_^#.*$
CodePudding user response:
You may use this awk:
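awk '
# A line starting with 7 or more hashes is a border line; each block has three.
/^#{7,}/ {
    ++n
    # The 1st, 4th, 7th, ... border line opens a new block and a new output file.
    if (n%3 == 1)
        fn = "file" ++c ".txt"
}
{
    print > fn
}' file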
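For anyone who wants to try this approach on the sample data, here is a minimal, self-contained sketch. It assumes a POSIX shell and recreates input.txt from the question. It spells the seven hashes out literally instead of writing `#{7,}`, because some awk implementations (older mawk builds, for example) do not support interval expressions. The `close()` call and the variable name `out` are my own additions and are not part of the answer above.

```shell
#!/bin/sh
# Recreate the sample input from the question.
cat > input.txt <<'EOF'
#########
##### AV1
#########
great picture
good tv
decent sound
vibrant color
#########
#########
#### AV2 new TV
#########
we are testing this out now
this is 4K
need HDMI
#########
#########
### AV3 not working
#########
not enough ports
buy new device
#########
EOF

# Each block has exactly three lines that start with 7+ hashes
# (top border, border under the title, bottom border), so every
# third such line, counting from the first, opens a new block.
awk '
/^#######/ {
    if (++n % 3 == 1) {
        if (out != "") close(out)    # release the previous output file
        out = "file" (++c) ".txt"    # file1.txt, file2.txt, ...
    }
}
{ print > out }
' input.txt

# Show how many lines ended up in each output file.
wc -l file1.txt file2.txt file3.txt
```

Note that the title lines (`##### AV1` and so on) have fewer than seven hashes, which is why they are not counted as border lines. If a title could start with seven or more hashes, the counting would be thrown off and a stricter pattern would be needed.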
CodePudding user response:
If your requirement can be formulated as "split after the first line of each pair of lines with 9 hashes", you can use:
csplit -b "%d" --suppress-matched -f "File " \
<(sed -rz 's/(#{9})\n\1/\1\n\n\1/g' input.txt) /^$/ '{*}'
Explanation:
-b "%d": use file numbers without leading zeros
--suppress-matched: skip the empty lines that will be inserted
-f "File ": base name (prefix) of the files created
<(...): use the output of the command as if it were a file
sed -rz 's/(#{9})\n\1/\1\n\n\1/g' input.txt: insert an empty line between 2 consecutive lines of 9 hashes
/^$/: match an empty line
'{*}': repeat the csplit pattern for the complete file
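To see what the sed step does on its own, the following small sketch runs it on a four-line sample. It assumes GNU sed, since `-z` (treat the whole input as one record) and `-r` (extended regular expressions) are GNU options. The file names sample.txt and marked.txt are made up for the illustration.

```shell
#!/bin/sh
# Assumes GNU sed (-z and -r are GNU extensions).
# Build a tiny sample: the end of one block followed by the start of the next.
printf '%s\n' 'end of block 1' '#########' '#########' '#### next title' > sample.txt

# Insert an empty line between two consecutive lines of nine hashes.
sed -rz 's/(#{9})\n\1/\1\n\n\1/g' sample.txt > marked.txt

# The empty line now marks the place where csplit will cut.
cat marked.txt
```

The empty line lands exactly between the closing border of one block and the opening border of the next, which is why csplit can then split on `/^$/` and drop the marker again with `--suppress-matched`.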