Regular Expression to extract JSON objects from array

Time:11-18

I'm working on a custom JSON deserializer in Java and would like to create an ArrayList of the objects specified in a .json file. For example, given the following file:

[
    {
        "name": "User1",
        "gender": "M"
    },
    {
        "name": "User2",
        "gender": "F"
    }
]

(...) I want my Java program to create a structure holding two objects of class User, each with the corresponding fields.

I managed to do it with only one value in the file (so no JSON array, just an object between {} with some key-value pairs), but with a list it gets more complicated. I thought about splitting the whole JSON array into its elements, applying my single-object parsing algorithm to each of them, and then adding the results to an ArrayList.

My idea should work, but my problem is that I'm not sure how to properly split this array of JSON objects using Java's split() method for strings. I'm also not good enough with regular expressions to come up with a proper one.

I thought about splitting it with content.split("},") and then re-appending the } that the split consumes, but this would also split inside members of my JSON elements if they reference other objects.

My question is: what would be a proper regex, in this context, to make Java split my JSON array into its individual JSON elements?

I can remove the brackets from the beginning and the end of the file; that shouldn't be a problem, as it only requires simple String manipulation. What I want is a String[] array, with each entry containing one of my two users together with its data.
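To make the concern concrete, here is a small sketch of what the split on "}," does once an element contains a nested object. The "address" field and the class name are made up for illustration; they are not part of the file above.

```java
final class NaiveSplitDemo {
    // Splits exactly as proposed: on the literal text "}," (split() takes a regex,
    // and an unescaped '}' is treated as a literal by java.util.regex).
    static String[] naiveSplit(String arrayBody) {
        return arrayBody.split("},");
    }

    public static void main(String[] args) {
        // Hypothetical array body (outer brackets already removed):
        // the first user holds a nested "address" object.
        String body = "{\"name\":\"User1\",\"address\":{\"city\":\"X\"},\"gender\":\"M\"},"
                    + "{\"name\":\"User2\",\"gender\":\"F\"}";

        String[] parts = naiveSplit(body);
        System.out.println(parts.length); // 3 pieces, not 2
        for (String part : parts) {
            System.out.println(part);
        }
    }
}
```

The first user is cut in two at the end of the nested object, so re-appending a } afterwards cannot restore the original elements.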

Expected output:

String1: { "name": "User1", "gender": "M" }
String2: { "name": "User2", "gender": "F" }

CodePudding user response:

If it's pretty-printed as in your question (each top-level object indented by exactly four spaces), you can use:

(?sm)(?<=^    )\{.*?(?<=^    )}

Here's some test code:

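import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import static java.util.stream.Collectors.toList;

String input = "[\n" +
        "    {\n" +
        "        \"name\": \"User1\",\n" +
        "        \"gender\": \"M\"\n" +
        "    },\n" +
        "    {\n" +
        "        \"name\": \"User2\",\n" +
        "        \"gender\": \"F\"\n" +
        "    }\n" +
        "]";
// (?s): '.' also matches newlines; (?m): '^' matches at the start of every line.
// The lookbehinds only accept braces indented by exactly four spaces (top-level objects).
List<String> jsonObjects = Pattern.compile("(?sm)(?<=^    )\\{.*?(?<=^    )}")
  .matcher(input).results() // Matcher.results() needs Java 9 or later
  .map(MatchResult::group)
  .map(str -> str.replaceAll("\\s+", "")) // remove all whitespace (including any inside string values)
  .collect(toList());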

Output:

{"name":"User1","gender":"M"}
{"name":"User2","gender":"F"}
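The regex above relies on the exact four-space indentation, so it will not help with minified input or a different indent width. If that matters, one alternative (a different technique, not a regex and not part of the answer above) is a small scanner that tracks brace depth and skips over string literals. This is only a sketch: the class and method names are made up, and it assumes reasonably well-formed JSON.

```java
import java.util.ArrayList;
import java.util.List;

final class JsonArraySplitter {
    /** Returns every top-level {...} object in the text, with nested objects kept intact. */
    static List<String> splitObjects(String json) {
        List<String> objects = new ArrayList<>();
        int depth = 0;             // current nesting level of { }
        int start = -1;            // index where the current top-level object began
        boolean inString = false;  // true while inside a "..." literal
        boolean escaped = false;   // true right after a backslash inside a string

        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                // Braces and commas inside strings must not affect the depth.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                if (depth == 0) {
                    start = i;
                }
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    objects.add(json.substring(start, i + 1));
                }
            }
        }
        return objects;
    }

    public static void main(String[] args) {
        // Hypothetical minified input with a nested object and a '}' inside a string.
        String input = "[{\"name\":\"User1\",\"address\":{\"city\":\"X}\"}},"
                     + "{\"name\":\"User2\",\"gender\":\"F\"}]";
        for (String object : splitObjects(input)) {
            System.out.println(object);
        }
    }
}
```

Each returned string can then be handed to the existing single-object parser. The outer [ and ] do not need to be stripped first, because everything at depth 0 is ignored.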