I'm trying to extract values with regular expressions in a textX grammar, as follows:
from textx import metamodel_from_str


def test_get_hosts2():
    grammar = r"""
    config: ( /(?!host)./ | hosts+=host | 'host' )* ;
    host: 'host' host2name=/[0-9a-zA-Z.-]+/ '{'
        (
            'fixed-address' fixed_address=/([0-9]{1,3}\.){3}[0-9]{1,3}/';'
            ('option host-name' option_host_name=STRING';')?
            ('option domain-name-servers' option_domain_name_servers=/([0-9]{1,3}\.){3}[0-9]{1,3}, ([0-9]{1,3}\.){3}[0-9]{1,3}/';')?
            ('option netbios-name-servers' option_netbios_name_servers=/([0-9]{1,3}\.){3}[0-9]{1,3}/';')?
            ('option domain-name' option_domain_name=STRING';')?
        )#
        '}'
    ;
    """
    conf_file = r"""
    host corehost.abc.abc.ab {
        fixed-address 172.124.106.10;
        option host-name "hostname.abc.abc.ab";
        option domain-name-servers 123.123.123.120, 123.123.128.142;
        option netbios-name-servers 172.124.106.156;
        option domain-name "abcm1.abc.abc.ab";
        option domain-search "abcm1.abc.abc.ab", "abcmo2.abc.abc.ab", "abcmo.3abc.abc.ab", "abcmo4.abc.abc.ab";
    }
    host corehost2.abc.abc.ab {
        fixed-address 172.124.106.120;
        option host-name "hostname2.abc.abc.ab";
        option domain-name-servers 123.123.123.220, 123.123.128.242;
        option netbios-name-servers 172.124.106.256;
        option domain-name "abcm2.abc.abc.ab";
        option domain-search "abcm2.abc.abc.ab", "abcmo2.abc.abc.ab", "abcm.3abc.abc.ab", "abcm4.abc.abc.ab";
    }
    """
    mm = metamodel_from_str(grammar)
    model = mm.model_from_str(conf_file)
    print(model.hosts)
    # assert len(model.hosts) == 2
    for host in model.hosts:
        print(host)
        print(host.host2name, host.fixed_address, host.option_domain_name_servers, host.option_domain_search)


if __name__ == "__main__":
    test_get_hosts2()
But I can only get single values such as "fixed-address" and "host2name". For "domain-name-servers" I put the "," into the regex, but I don't think that is the right way, because the number of values is not always the same. Could you help me get the values of "domain-name-servers" and "domain-search" with the right regex?
ref: Parsing dhcpd.conf with textX
CodePudding user response:
The easiest way is to use textX's repetition modifiers to match a sequence of comma-separated values. Whenever you use a repetition such as zero-or-more or one-or-more, you can add a modifier in square brackets. The most frequently used one is the separator modifier, which specifies a match that must occur between every two consecutive elements.
Compared with trying to match everything with a single regex, this has two side benefits:
- simplicity (easier to maintain)
- you get a nice Python list of elements so you don't need to process the matched string further.
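To make the second point concrete, this is roughly the post-processing a single-regex match would need. It is a standard-library sketch, not textX code, and the names `IP`, `ip_list_re` and `line` are made up for the illustration.

```python
import re

# Same IP pattern as in the question, with a non-capturing group.
IP = r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}"

# One regex for the whole list: an address followed by any number of ", address".
ip_list_re = re.compile(rf"{IP}(?:\s*,\s*{IP})*")

line = "option domain-name-servers 123.123.123.120, 123.123.128.142;"

# The match is a single string, so it still has to be split by hand.
matched = ip_list_re.search(line).group(0)
servers = [part.strip() for part in matched.split(",")]
print(servers)
```

With the separator modifier, the parser does this splitting for you.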
The working grammar would be as follows (notice the use of +[','], which means one-or-more with a comma as a separator):
config: ( /(?!host)./ | hosts+=host | 'host' )* ;
host: 'host' host2name=/[0-9a-zA-Z.-]+/ '{'
    (
        'fixed-address' fixed_address=ip_addr';'
        ('option' 'host-name' option_host_name=STRING';')?
        ('option' 'domain-name-servers' option_domain_name_servers=ip_addr+[',']';')?
        ('option' 'netbios-name-servers' option_netbios_name_servers=ip_addr+[',']';')?
        ('option' 'domain-name' option_domain_name=STRING+[',']';')?
        ('option' 'domain-search' option_domain_search=STRING+[',']';')?
    )#
    '}';
ip_addr: /([0-9]{1,3}\.){3}[0-9]{1,3}/;