Regex to match within specific block

Time:03-17

I am trying to match a string between two other strings. The document looks something like this (there are many more lines in the real config):

#config-version=user=user1
#conf_file_ver=1311784161
#buildno=123
#global_vdom=adsf
config system global
    set admin-something
    set admintimeout 8289392839823
    set alias "F5"
    set gui-theme mariner
    set hostname "something"
end
config system accprofile
    edit "prof_admin"
        set secfabgrp read
        set ftviewgrp read
        set vpngrp read
        set utmgrp read
        set wifi read
    next
end
config system np6xlite
    edit "np6xlite_0"
    next
end
config system interface
    edit "dmz"
        set vdom "asdf"
        set ip 1.1.1.1 255.255.255.0
        set type physical
        set role dmz
    next
    edit "wan1"
        set vdom "root"
        set ip 2.2.2.2 255.255.255.255
        set type physical
        set alias "jklk5"
        set role wan
    next
end
config system physical-switch
    edit "sw0"
        set age-val 0
    next
end
config system virtual-switch
    edit "lan"
        set physical-switch "sw0"
        config port
            edit "port2"
            next
            edit "port3"
            next
            edit "port4"
            next
            edit "port5"
            next
            edit "port6"
            next
        end
    next
end
config system custom-language
    edit "en"
        set filename "en"
    next
    edit "fr"
        set filename "fr"
    next
end
config system admin
    edit "user1"
        set vdom "root"
        set password ENC SH2Tb1/aYYJB2U9ER2f5Ykj1MtE6U=
    next
    edit "user2"
        set trusthost1 255.255.255.255 255.255.255.224
        set trusthost2 255.255.255.254 255.255.255.224
    next
end
config system ha
    set override
end
config system replacemsg-image
    edit "logo_fnet"
        set image-type gif
        set image-base64 ''
    next
    edit "logo_fguard_wf"
        set image-type gif
        set image-base64 ''
    next
    edit "logo_fw_auth"
        set image-base64 ''
    next
    edit "logo_v2_fnet"
        set image-base64 ''
    next
    edit "logo_v2_fguard_wf"
        set image-base64 ''
    next
    edit "logo_v2_fguard_app"
        set image-base64 ''
    next
end

I care about every "edit" block between "config system admin" and its corresponding "end". Each "edit" block represents a user and I need to know if a user block (edit "" ...stuff on new lines... next) is missing the "set password" line.

This expression (multiline) captures the "edit "en"..." under "config system custom-language":

\h*edit ".*\n(?:\h*+(?!next|set password).*\n)*\h*next\n

Now I need to make sure to ignore any config sections before or after "config system admin". I tried this:

(?<=config system admin\n)\h*edit ".*\n(?:\h*+(?!next|set password).*\n)*\h*next\n(?=end)

That change results in zero matches. But if I change the lookbehind to:

(?<=config system custom-language\n)

Then I get a match, but it is in the wrong config block again. I tried sticking [\S\s] in front, but that results in zero matches:

[\S\s](?<=config system admin\n)\h*edit ".*\n(?:\h*+(?!next|set password).*\n)*\h*next\n(?=end)

How do I take the "set password" matching and make sure it only happens between "config system admin" and its corresponding "end"? I only need the first result, but getting multiple is fine. I am using PCRE2.

CodePudding user response:

The following pattern starts at edit, stops before the next edit or end, and does not allow password, config sys, or set filename in between.
It is a bit long and clumsy, but it does find regular users when the word password is absent, and it does not match the two opening blocks.
As noted in the comments, it could malfunction if those keywords appear elsewhere in the file.

/edit((?!edit)(?!(edit|password|config sys|set filename))[\w\W])*(?=(edit|end))/gm

If you have the possibility to use a simple script, bash for example, that could read line by line we could build something simple that would be more reliable.
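
As a rough illustration of that line-by-line idea, here is a minimal bash sketch. The function name users_without_password is made up for this example, and it assumes Unix line endings, a top-level "config system admin" line, and that each user entry closes with "next". It counts nested config ... end blocks so that an inner edit/next pair is not mistaken for a user.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): print the name of every
# user in "config system admin" whose edit block has no "set password" line.
users_without_password() {
    local in_admin=0 depth=0 user="" has_pw=0 line trimmed
    while IFS= read -r line || [ -n "$line" ]; do
        # Drop leading whitespace so keywords can be compared directly.
        trimmed="${line#"${line%%[![:space:]]*}"}"
        if [ "$in_admin" -eq 0 ]; then
            # Skip everything until the admin section starts.
            if [ "$trimmed" = "config system admin" ]; then in_admin=1; fi
            continue
        fi
        case "$trimmed" in
            config\ *)              # nested config block inside a user entry
                depth=$((depth + 1)) ;;
            end)
                if [ "$depth" -eq 0 ]; then break; fi   # end of admin section
                depth=$((depth - 1)) ;;
            edit\ *)
                if [ "$depth" -eq 0 ]; then
                    user=${trimmed#edit }
                    user=${user//\"/}                   # strip the quotes
                    has_pw=0
                fi ;;
            set\ password\ *)
                if [ "$depth" -eq 0 ]; then has_pw=1; fi ;;
            next)
                if [ "$depth" -eq 0 ] && [ "$has_pw" -eq 0 ]; then
                    printf '%s\n' "$user"
                fi ;;
        esac
    done < "$1"
}

# Demo on a trimmed copy of the sample config from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
config system custom-language
    edit "en"
        set filename "en"
    next
end
config system admin
    edit "user1"
        set vdom "root"
        set password ENC SH2Tb1/aYYJB2U9ER2f5Ykj1MtE6U=
    next
    edit "user2"
        set trusthost1 255.255.255.255 255.255.255.224
    next
end
EOF
users_without_password "$sample"
rm -f "$sample"
```

On the trimmed sample this prints user2. Because the loop only starts looking after "config system admin" and stops at that section's own "end", keywords in other sections (such as edit "en") are never considered.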

CodePudding user response:

You can use:

(?<=config system admin.*?edit "[^"]+")(?!.*?set password.*?next).*?(?=next.*?end)

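It requires the global and singleline flags. If you can't use singleline, replace dot (.) with [\s\S]. Note that the lookbehind is unbounded in length, which PCRE2 rejects, so this needs an engine that allows it (.NET or a recent JavaScript engine, for example).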

Explanation:

(?<=config system admin.*?edit "[^"]+") - look behind for 'config system admin' followed by any characters (non greedy), followed by 'edit' and a username

(?!.*?set password.*?next) - look ahead for NOT 'set password', followed by any characters and 'next'

.*? - any characters

(?=next.*?end) - look ahead for 'next' and finally 'end'.

CodePudding user response:

I think you want to work on this task from two levels. First, find the data that is in those config blocks, and then examine the users within them.

Here's something that is far simpler that may do what you need.

First, you want to look only at the lines between "config system admin" and "end", so use awk to find those.

$ awk '/^config system admin/,/^end/' config.txt
config system admin
    edit "user1"
        set vdom "root"
        set password ENC SH2Tb1/aYYJB2U9ER2f5Ykj1MtE6U=
    next
    edit "user2"
        set trusthost1 255.255.255.255 255.255.255.224
        set trusthost2 255.255.255.254 255.255.255.224
    next
end

Now search those results for either "edit" or "set password":

$ awk '/^config system admin/,/^end/' config.txt | grep -E 'edit|set password'
    edit "user1"
        set password ENC SH2Tb1/aYYJB2U9ER2f5Ykj1MtE6U=
    edit "user2"

You can now eyeball the results and see who has set a password and who hasn't.

If you need to get more precise, then you can write a little more code to find "edit" lines that aren't followed by "set password".
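
For instance, a second awk stage can remember whether the current "edit" block has seen a "set password" line and print the user name at "next" if it has not. This is only a sketch: the function name admins_missing_password is invented here, and it assumes user names contain no spaces and that admin entries have no nested edit/next blocks.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): list admin users whose
# edit block lacks a "set password" line.
admins_missing_password() {
    # Stage 1: keep only the lines from "config system admin" to its "end".
    awk '/^config system admin/,/^end/' "$1" |
    # Stage 2: track one flag per edit block and report at "next".
    awk '
        $1 == "edit" { user = $2; gsub(/"/, "", user); has_pw = 0 }
        $1 == "set" && $2 == "password" { has_pw = 1 }
        $1 == "next" && !has_pw { print user }
    '
}

# Demo on a trimmed copy of the sample config.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
config system admin
    edit "user1"
        set vdom "root"
        set password ENC SH2Tb1/aYYJB2U9ER2f5Ykj1MtE6U=
    next
    edit "user2"
        set trusthost1 255.255.255.255 255.255.255.224
    next
end
EOF
admins_missing_password "$cfg"
rm -f "$cfg"
```

On that input it prints user2, the one entry without a password.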

In any case, the key is to break the problem into smaller problems.
