Home > OS >  Regexp finding duplicated character conditionally in incorrectly formatted JSON
Regexp finding duplicated character conditionally in incorrectly formatted JSON

Time:09-12

Suppose I have the following string:

example = '{"Hello": "world", "key_1": ""Double quote text"", "key_2", ""}'

This is not a properly created JSON object, thus when I try to load it:

json.loads(example)

I get an error JSONDecodeError: Expecting ',' delimiter: line 1 column 29 (char 28)

I've obviously tried to example.replace('""', '"'), but this, fixes key_1, but breaks key_2.

How can I remove the double quote in key_1, without removing it from key_2?

CodePudding user response:

you can use this pattern and python regex sub

re.sub(r"(\"\"([^\"]*)\"\")",r'"\2"',example)

this pattern will find any sets of ""string"" that have data between them and replace is it with a "string"

a cleaner pattern would be this:

'(\"{2}([^\"]*)\"{2})'
  • Related