Reading JSON with id outside curly brackets and no quotation marks

Time:11-20

I am having trouble importing a JSON file into R. The file has the following format:

C135-HR2459 {"number_a": 1, "number_b":2} 
C156-HR2249 {"number_a": 1, "number_b":2} 

It would have worked if it had the following format:

{"id": C135-HR2459, "number_a": 1, "number_b":2} 
{"id": C156-HR2249, "number_a": 1, "number_b":2} 

CodePudding user response:

You may want to read it in as plain text with readLines and then massage it with regex before trying to read it as JSON.

First remove the misplaced "{", then turn the leading identifier into a quoted "id" field at the beginning of each element:

sub("^(\\S+)\\s+", "{\"id\": \"\\1\", ", sub("\\{", "", txt))
[1] "{\"id\": \"C135-HR2459\", \"number_a\": 1, \"number_b\":2}"
[2] "{\"id\": \"C156-HR2249\", \"number_a\": 1, \"number_b\":2}"

The interior "{" is removed by the inner sub call. Nested R function calls need to be read and understood from the inside out.
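
As a rough sketch of the whole round trip, the following assumes that the id never contains whitespace, that it is separated from the object by a tab or spaces, and that the jsonlite package is available for the parsing step. The sample vector stands in for `readLines("data.txt")`, where `data.txt` is a hypothetical file name.

```r
# Sample input; in practice: txt <- readLines("data.txt")  (hypothetical file name)
txt <- c(
  'C135-HR2459\t{"number_a": 1, "number_b":2}',
  'C156-HR2249\t{"number_a": 1, "number_b":2}'
)

# Inner sub: drop the first "{" on each line.
# Outer sub: wrap the leading id as a quoted "id" field and reopen the object.
fixed <- sub("^(\\S+)\\s+", '{"id": "\\1", ', sub("\\{", "", txt))
print(fixed)

# Parse one JSON object per line, only if jsonlite is installed (an assumption).
if (requireNamespace("jsonlite", quietly = TRUE)) {
  con <- textConnection(fixed)
  df <- jsonlite::stream_in(con, verbose = FALSE)
  close(con)
  print(df)
}
```

Each element of `fixed` is a complete JSON object, so the lines can also be written back to a file with writeLines and read with any line-delimited JSON reader.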

CodePudding user response:

Using regex, you can search for this:

C(.*?)\-HR(.*?) \{(.*?)\}

and replace it with this:

\{\"id\"\:\"C\1\-HR\2\", \3 \}

Be aware that the result will be:

{"id":"C135-HR2459", "number_a": 1, "number_b":2 } 
{"id":"C156-HR2249", "number_a": 1, "number_b":2 } 

where the value for the id MUST be enclosed in double quotes, since it is a string.

Also, this:

{"id": C135-HR2459, "number_a": 1, "number_b":2} 
{"id": C156-HR2249, "number_a": 1, "number_b":2} 

is not valid JSON, because the id value is not enclosed in quotes.
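
Since the question is about R, here is one way the same search and replace might look there. This is only a sketch: it assumes a single space between the id and the opening brace, and it drops the backslashes that R does not need in the replacement string. `perl = TRUE` is used so that the lazy `.*?` quantifiers follow PCRE rules.

```r
txt <- c(
  'C135-HR2459 {"number_a": 1, "number_b":2}',
  'C156-HR2249 {"number_a": 1, "number_b":2}'
)

# Groups: \\1 = digits after "C", \\2 = digits after "-HR", \\3 = object body.
pattern <- "C(.*?)-HR(.*?) \\{(.*?)\\}"
replacement <- '{"id":"C\\1-HR\\2", \\3 }'

fixed <- sub(pattern, replacement, txt, perl = TRUE)
print(fixed)
```

Because the pattern hard-codes the `C...-HR...` shape, ids with a different layout are left unchanged rather than rewritten.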

CodePudding user response:

Using Notepad++, you can do:

  • Ctrl+H
  • Find what: ^(\S+) {
  • Replace with: {"id": "$1",
  • CHECK Wrap around
  • CHECK Regular expression
  • UNCHECK . matches newline
  • Replace all

Explanation:

^           # beginning of line
(\S+)       # group 1, 1 or more non-space characters; you can use (.+?) if the string contains spaces
            # 1 space
{           # open curly brace

Replacement:

{"id": "    # literally
$1          # content of group 1
",          # literally, there is a space after the comma
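
For readers who would rather stay in R than use an editor, the pattern `^(\S+) {` with replacement `{"id": "$1", ` could be translated roughly as follows. R writes the backreference as `\\1` instead of `$1`, and the brace is escaped. The check for non-matching lines is an addition for illustration, not part of the steps above.

```r
txt <- c(
  'C135-HR2459 {"number_a": 1, "number_b":2}',
  'C156-HR2249 {"number_a": 1, "number_b":2}'
)

# Start of line, one or more non-space characters, one space, opening brace.
pattern <- "^(\\S+) \\{"

# Flag lines that do not fit the expected layout before rewriting anything.
bad <- which(!grepl(pattern, txt))
if (length(bad) > 0) {
  warning("Lines not matching the expected layout: ",
          paste(bad, collapse = ", "))
}

# Replace the matched prefix with {"id": "<group 1>", 
fixed <- sub(pattern, '{"id": "\\1", ', txt)
print(fixed)
```

Lines that do not match are passed through unchanged by sub, which is why the sketch reports them first.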
