long time ago I wrote a tool for parsing text files, line by line, and do some stuff, depending on commands and conditions in the file. I used regex for this, however, I was never good in regex.
A line holding a condition looks like this:
[type==STRING]
And the regex I use is:
re.compile(r'^[^\[\]]*\[([^\]\[=]*)==([^\]\[=]*)\][^\]\[]*$', re.MULTILINE)
This regex would result me the keyword "type" and the value "STRING".
However, now I need to update my tool to have more conditions in one line, e.g.
[type==STRING][amount==0]
I need to update my regex to get me two pairs of results, one pair type/STRING and one pair amount/0. But I'm lost on this. My regex above gets me zero results with this line.
Any ideas how to do this?
CodePudding user response:
You could either match a second pair of groups:
^[^\[\]]*\[([^\]\[=]*)==([^\]\[=]*)\][^\]\[]*(?:\[([^\]\[=]*)==([^\]\[=]*)\][^\]\[]*)?$
Or you can omit the anchors and the [^\[\]]*
part to get the group1 and group 2 values multiple times:
\[([^\]\[=]*)==([^\]\[=]*)\]
CodePudding user response:
Is it a requirement that you use regex? You can alternatively accomplish this pretty easily using the split function twice and stripping the first opening and last closing bracket.
line_to_parse = "[type==STRING]"
# omit the first and last char before splitting
pairs = line_to_parse[1:-1].split("][")
for pair in pairs:
x, y = pair.split("==")
CodePudding user response:
Rather depends on the precise "rules" that describe your data. However, for your given data why not:
import re
text = '[type==STRING][amount==0]'
words = re.findall('\w ', text)
lst = []
for i in range(0, len(words), 2):
lst.append((words[i], words[i 1]))
print(lst)
Output:
[('type', 'STRING'), ('amount', '0')]