How to select section from a configuration file

I have a configuration file like this

interface xy
  disable
blabla
object name object-1
  param1 host 192.168.23.45
  param2 host 10.20.30.40
this is not an object
this is not an object    
object name object-2
  param1 network 10.200.192.23 255.255.255.0
  param2 network 10.1.39.0 255.255.192.0
   param3 range 172.34.12.45 172.34.12.47
this is not an object
  is not an object param

I need to select all of the sections that begin with object and are followed by one or more lines indented by one or more spaces. The expected output is:

object name object-1
  param1 host 192.168.23.45
  param2 host 10.20.30.40
object name object-2
  param1 network 10.200.192.23 255.255.255.0
  param2 network 10.1.39.0 255.255.192.0
   param3 range 172.34.12.45 172.34.12.47

I wrote the regex /^object.*\n(^\s.*\n)+/mg and it works fine when I validate it at https://regex101.com/, but it doesn't work when I try awk '/^object.*\n(^\s.*\n)+/m' file.cfg, nor does awk '/^object/,/^(\s.*\n)/' file.cfg. Can somebody explain what is wrong?

CodePudding user response：

Modifying the input file to include two consecutive object blocks and an object block that runs to the end of the file:

$ cat file.cfg
interface xy
  disable
blabla
object name object-1
  param1 host 192.168.23.45
  param2 host 10.20.30.40
object name object-2
  param1 network 10.200.192.23 255.255.255.0
  param2 network 10.1.39.0 255.255.192.0
   param3 range 172.34.12.45 172.34.12.47
this is not an object
this is not an object
this is not an object
  is not an object param
object name object-2
  param1 network 10.200.192.23 255.255.255.0
  param2 network 10.1.39.0 255.255.192.0

One awk idea:

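awk '
# start printing at every line that begins with "object"
/^object/                        { printme=1 }
# stop at any other line that begins with a letter or digit
/^[[:alnum:]]/ && $1 != "object" { printme=0 }
# print the current line while the flag is set
printme
' file.cfg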

This generates:

object name object-1
  param1 host 192.168.23.45
  param2 host 10.20.30.40
object name object-2
  param1 network 10.200.192.23 255.255.255.0
  param2 network 10.1.39.0 255.255.192.0
   param3 range 172.34.12.45 172.34.12.47
object name object-2
  param1 network 10.200.192.23 255.255.255.0
  param2 network 10.1.39.0 255.255.192.0

NOTE: Cyrus' sed comment/answer also generates this output for the modified input file.

CodePudding user response：

Using GNU awk for \S, which is shorthand for [^[:space:]]:

$ awk '/^\S/{f=/^object/} f' file
object name object-1
  param1 host 192.168.23.45
  param2 host 10.20.30.40
object name object-2
  param1 network 10.200.192.23 255.255.255.0
  param2 network 10.1.39.0 255.255.192.0
   param3 range 172.34.12.45 172.34.12.47

or with any awk:

awk '/^[^ \t]/{f=/^object/} f' file
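The one-liner above can be unpacked into a commented form. This is only a sketch of the same logic; the function name select_objects is a hypothetical wrapper introduced here for illustration, and it reads standard input when no file is given.

```shell
# select_objects: print each "object" line plus the indented lines under it.
# Hypothetical helper name; the awk program is the portable one-liner above.
select_objects() {
  awk '
    # A line that starts with something other than a space or tab opens a
    # new section: set the flag to 1 if it starts with "object", else 0.
    /^[^ \t]/ { f = /^object/ }
    # Indented lines leave the flag untouched; print while it is set.
    f
  ' "$@"
}

# Usage on a small inline sample (select_objects file.cfg works the same way)
select_objects <<'EOF'
interface xy
  disable
object name object-1
  param1 host 192.168.23.45
blabla
EOF
```

On the inline sample this prints only the object-1 line and its indented param1 line. The flag only changes on unindented lines, which is why no multi-line regexp is needed.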

Testing a regexp in regex101.com or any other web site proves that that regexp works in that web site, nothing else. In particular it doesn't mean it'll work in any command-line tool (e.g. awk, sed, or grep) as every tool uses different regexp varieties (e.g. BRE, ERE, or PCRE) with different extensions, different options, different delimiters, etc.

Your regexps were failing because they use non-standard extensions (\s and whatever /m is supposed to mean), assume the whole of the input is in memory when in fact awk reads 1 line at a time by default, and are anchored to the start of the input (by ^) when you want them to match mid-input.
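A quick way to see the line-at-a-time behaviour: with awk's default record separator each record is a single line with the newline already removed, so a pattern containing \n has nothing to match. The sketch below assumes only a POSIX awk; count_newline_matches is a hypothetical helper name used for illustration.

```shell
# count_newline_matches: count the records whose text contains a newline.
# Hypothetical helper; with the default record separator the count stays 0.
count_newline_matches() {
  awk '/\n/ { n++ } END { print n + 0 }'
}

# Two input lines, yet no record ever contains "\n"
printf 'object name object-1\n  param1 host 192.168.23.45\n' | count_newline_matches
# prints 0
```

That is why a regexp spanning "object line, newline, indented line" can never match here, whatever regexp flavour is in use.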

There is no BRE or ERE that would ONLY match what you're trying to match. Maybe you could figure out a PCRE to do it but trying to solve your problem with just a regexp is far too complicated for the simple thing you're trying to do vs writing a tiny script to do it as shown above.

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.