Using regex to extract a string including a space between other strings and whitespace

Time:03-31

I have the following output from my wireless controller and also the regex statement below. I am trying to parse out the various values using regex named capturing groups. The space in the 'Global/whatever Lab/Lab01' value is throwing off everything after that value. Is there a way to replace the \S after the group to capture the whole value of 'Global/whatever Lab/Lab01'? Thank you.

Number of APs: 2\nAP Name                            Slots    AP Model  Ethernet MAC    Radio MAC       Location                          Country     IP Address                                 State         \nAPAC4A.56BE.18A0                     2      9120AXI   ac4a.56be.18a0  045f.b91a.0a40  Global/whatever Lab/Lab01  US          2.2.2.2                                Registered    \nAPHAV-LAB-TEST-01                    2      9120AXI   ac4a.56be.8cd4  045f.b91d.4ce0  default location                  US          1.1.1.1                               Registered
(?P<ap_name>\S+)\s+(?P<slots>\d+)\s+(?P<model_number>\S+)\s+(?P<ether_mac>\S+)\s+(?P<radio_mac>\S+)\s+(?P<location>\S+)\s(?P<country>\S+)\s+(?P<ip_address>\S+)?\s+(?P<state>\S+)

CodePudding user response:

Maybe try replacing \S with [\S ]. I don't know whether there is an escape code you can use to represent the space inside the [].

CodePudding user response:

When you need to match a multi-word field value, make sure you can describe the format of the field(s) next to it. Once you know the rules, you can match the "unknown" field with a mere .*? pattern.

See an example solution:

(?P<ap_name>\S+)\s+(?P<slots>\d+)\s+(?P<model_number>\S+)\s+(?P<ether_mac>\S+)\s+(?P<radio_mac>\S+)\s+(?P<location>.*?)\s+(?P<country>[A-Z]{2,})(?:\s+(?P<ip_address>\d{1,3}(?:\.\d{1,3}){3}))?\s+(?P<state>\S+)

See the regex demo.

Now, the location group pattern is (?P<location>.*?): it matches any characters other than line break characters, zero or more times, but as few as possible. This works here because the next group pattern, the country group, is now (?P<country>[A-Z]{2,}) and matches any substring of two or more uppercase ASCII letters.
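To see why the pattern of the neighbouring field matters, here is a minimal Python sketch that runs only the location and country groups against the tail of the first data row. The variable names and the trimmed sample string are illustrative assumptions, not part of the original answer.

```python
import re

# Tail of the first data row, starting at the location column
# (trimmed for illustration; an assumption, not the full controller output).
tail = "Global/whatever Lab/Lab01  US          2.2.2.2"

# Lazy location followed by a loosely described country (\S+):
# the lazy group stops at the first whitespace it finds.
loose = re.match(r"(?P<location>.*?)\s+(?P<country>\S+)", tail)

# Lazy location followed by a strictly described country ([A-Z]{2,}):
# the lazy group keeps expanding until whitespace is followed by
# at least two uppercase letters.
strict = re.match(r"(?P<location>.*?)\s+(?P<country>[A-Z]{2,})", tail)

print(loose.groupdict())   # {'location': 'Global/whatever', 'country': 'Lab/Lab01'}
print(strict.groupdict())  # {'location': 'Global/whatever Lab/Lab01', 'country': 'US'}
```

With the loose neighbour the location is cut at its first space; with the strict neighbour the lazy group is forced to take the whole multi-word value.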

Note that I also "spelled out" the ip_address group pattern and made the whole part, including the leading whitespace, optional: (?:\s+(?P<ip_address>\d{1,3}(?:\.\d{1,3}){3}))?.
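A short Python sketch of how the full pattern could be applied to the output from the question. It assumes the "\n" sequences in the controller output are real newlines; the names output, AP_ROW and rows are made up for this example.

```python
import re

# Assumption: the "\n" sequences in the pasted output are real newlines.
output = (
    "Number of APs: 2\n"
    "AP Name                            Slots    AP Model  Ethernet MAC    Radio MAC       Location                          Country     IP Address                                 State         \n"
    "APAC4A.56BE.18A0                     2      9120AXI   ac4a.56be.18a0  045f.b91a.0a40  Global/whatever Lab/Lab01  US          2.2.2.2                                Registered    \n"
    "APHAV-LAB-TEST-01                    2      9120AXI   ac4a.56be.8cd4  045f.b91d.4ce0  default location                  US          1.1.1.1                               Registered"
)

# The pattern from the answer, split across lines only for readability.
AP_ROW = re.compile(
    r"(?P<ap_name>\S+)\s+(?P<slots>\d+)\s+(?P<model_number>\S+)\s+"
    r"(?P<ether_mac>\S+)\s+(?P<radio_mac>\S+)\s+(?P<location>.*?)\s+"
    r"(?P<country>[A-Z]{2,})(?:\s+(?P<ip_address>\d{1,3}(?:\.\d{1,3}){3}))?"
    r"\s+(?P<state>\S+)"
)

rows = []
for line in output.splitlines():
    # match() anchors at the start of each line, so the two header
    # lines (no digits-only second field) are skipped.
    m = AP_ROW.match(line)
    if m:
        rows.append(m.groupdict())

for row in rows:
    print(row["ap_name"], "|", row["location"], "|", row["country"],
          "|", row["ip_address"], "|", row["state"])
```

The sketch matches line by line rather than searching the whole string at once, because \s+ also matches newlines and an unanchored search can pick up fragments of the header lines.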
