I have a configuration file like this:
interface xy
disable
blabla
object name object-1
 param1 host 192.168.23.45
 param2 host 10.20.30.40
this is not an object
this is not an object
object name object-2
 param1 network 10.200.192.23 255.255.255.0
 param2 network 10.1.39.0 255.255.192.0
 param3 range 172.34.12.45 172.34.12.47
this is not an object
is not an object param
I need to select all of the sections that begin with object and are followed by one or more lines indented by one or more spaces. The expected output is:
object name object-1
 param1 host 192.168.23.45
 param2 host 10.20.30.40
object name object-2
 param1 network 10.200.192.23 255.255.255.0
 param2 network 10.1.39.0 255.255.192.0
 param3 range 172.34.12.45 172.34.12.47
I wrote this regex:
/^object.*\n(^\s.*\n)+/mg
It works fine when I validate it at https://regex101.com/, but it doesn't work when I try:
awk '/^object.*\n(^\s.*\n)+/m' file.cfg
This one doesn't work either:
awk '/^object/,/^(\s.*\n)/' file.cfg
Can somebody explain what is wrong?
CodePudding user response:
Modifying the input file to include two consecutive object blocks and an object block that runs to the end of the file:
$ cat file.cfg
interface xy
disable
blabla
object name object-1
 param1 host 192.168.23.45
 param2 host 10.20.30.40
object name object-2
 param1 network 10.200.192.23 255.255.255.0
 param2 network 10.1.39.0 255.255.192.0
 param3 range 172.34.12.45 172.34.12.47
this is not an object
this is not an object
this is not an object
is not an object param
object name object-2
 param1 network 10.200.192.23 255.255.255.0
 param2 network 10.1.39.0 255.255.192.0
One awk idea:
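awk '
/^object/ { printme=1 }                           # an object header turns printing on
/^[[:alnum:]]/ && $1 != "object" { printme=0 }    # any other line starting with a letter or digit turns it off
printme                                           # print the current line while the flag is set
' file.cfg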
This generates:
object name object-1
 param1 host 192.168.23.45
 param2 host 10.20.30.40
object name object-2
 param1 network 10.200.192.23 255.255.255.0
 param2 network 10.1.39.0 255.255.192.0
 param3 range 172.34.12.45 172.34.12.47
object name object-2
 param1 network 10.200.192.23 255.255.255.0
 param2 network 10.1.39.0 255.255.192.0
NOTE: Cyrus' sed comment/answer also generates this output for the modified input file.
CodePudding user response:
Using GNU awk for \S, which is shorthand for [^[:space:]]:
$ awk '/^\S/{f=/^object/} f' file
object name object-1
 param1 host 192.168.23.45
 param2 host 10.20.30.40
object name object-2
 param1 network 10.200.192.23 255.255.255.0
 param2 network 10.1.39.0 255.255.192.0
 param3 range 172.34.12.45 172.34.12.47
or with any awk:
awk '/^[^ \t]/{f=/^object/} f' file
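If the one-liner is hard to read, the same logic can be written out longhand. This is only a sketch: the function name print_objects, the variable name in_object, and the file name sample.cfg are made up for illustration, and the sample assumes parameter lines are indented with a single space.

```shell
# Build a small sample file (hypothetical name) with indented parameter lines.
printf '%s\n' \
  'interface xy' \
  ' disable' \
  'object name object-1' \
  ' param1 host 192.168.23.45' \
  'this is not an object' \
  'object name object-2' \
  ' param1 network 10.200.192.23 255.255.255.0' > sample.cfg

print_objects() {
  awk '
    # A line that starts with a non-blank character opens a new section.
    /^[^ \t]/ {
        # Remember whether that section is an object section (1) or not (0).
        in_object = ($0 ~ /^object/)
    }
    # Print every line, header or indented, while inside an object section.
    in_object { print }
  ' "$1"
}

print_objects sample.cfg
```

Indented lines never change the flag, so they are printed or skipped according to the most recent unindented line.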
Testing a regexp in regex101.com or any other web site proves that that regexp works in that web site, nothing else. In particular it doesn't mean it'll work in any command-line tool (e.g. awk, sed, or grep) as every tool uses different regexp varieties (e.g. BRE, ERE, or PCRE) with different extensions, different options, different delimiters, etc.
Your regexps were failing as they're using non-standard extensions (\s and whatever /m is supposed to mean) and assuming the whole of the input is in memory when in fact awk reads 1 line at a time by default, and are anchored to the start of the input (by ^) when you want them to match mid-input.
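A quick way to see the one-line-at-a-time behavior for yourself is the small experiment below. It is just an illustrative sketch; the helper name two_lines is made up, and it feeds awk a two-line input with an object header and one indented parameter line.

```shell
# Hypothetical helper that emits a header line and one indented parameter line.
two_lines() { printf 'object name object-1\n param1 host 192.168.23.45\n'; }

# By default each record is a single line with the trailing newline removed,
# so a pattern that needs to see \n inside the record never matches.
two_lines | awk '/object.*\n/ { n++ } END { print "records matching across a newline:", n + 0 }'

# Counting records confirms that awk saw two separate lines.
two_lines | awk 'END { print "records read:", NR }'
```

The first command reports 0 matches and the second reports 2 records, which is why a multi-line pattern cannot work here without changing how awk splits its input.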
There is no BRE or ERE that would ONLY match what you're trying to match. Maybe you could figure out a PCRE to do it but trying to solve your problem with just a regexp is far too complicated for the simple thing you're trying to do vs writing a tiny script to do it as shown above.