I am trying to make a string JSON compliant by enclosing the attribute into double quotes within python3 so I can parse the JSON.
The regex seems not to work as intended:
re.sub(r"/(?:[\w] (?=:))/g", "", var)
String for variable var
:
{
pzn: "09900426",
url: "/url.html",
packageSizeValue: "28",
packageSizeUnit: "St",
sizeValue: "",
potency: ""
}, {
pzn: "09900432",
url: "/url2.html",
packageSizeValue: "84",
packageSizeUnit: "St",
sizeValue: "",
potency: ""
}
Expected result would be:
{
"pzn": "09900426",
"url": "/url.html",
...
CodePudding user response:
It look as though you wanted to use
re.sub(r"\w (?=:)", r'"\g<0>"', var)
The \w (?=:)
pattern matches any one or more word chars before a :
, and the match is replaced with itself enclosed with double quotation marks. See the regex demo.
This is not safe because there can be such matches inside string values. Imagine a potency: "blah:blah:"
value. In this case, it would be safer to leverage the indentation and use
re.sub(r"(?m)^(\s*)(\w ):", r'\1"\2":', var)
See the regex demo. Here,
(?m)
- equivalent ofre.M
/
re.MULTILINE` option^
- start of a line(\s*)
- Group 1 (\1
): zero or more whitespaces(\w )
- Group 2: one or more word chars:
- a colon.