Extract multiple JSONs from a string in JAVA

Time:12-07

I have a string in the following format:

"ABCD XYZ
JSON1: {
'key1':'val',
'key2':{
'key2key1':'key2val1',
'key2key2':'key2val2'}
},
MNO
PQRS
JSON2{...}"

I need to extract each JSON object from such a string. I do not know in advance what text precedes each JSON object. How can I extract them?

CodePudding user response:

Here is a quick example. The idea is to find the first { character.

From there we know that we are inside a JSON string, so we start storing the characters that follow.

Each time we find a { we increment a counter, and each time we find a } we decrement it.

When the counter gets back to 0, we know this JSON has ended, so we store it and move on to the next JSON string.

import java.util.ArrayList;
import java.util.List;

public class FindJson {

    public static void main(final String[] args) {

        String src = "ABCD XYZ         JSON1: { 'key1':'val', 'key2':{ 'key2key1':'key2val1', 'key2key2':'key2val2'} }"
                + ",  MNO  PQRS   JSON2{...}";

        StringBuilder jsonBuilder = new StringBuilder();

        List<String> jsonStrings = new ArrayList<>();

        int openingCurlyBraces = 0;
        boolean jsonProcessing = false;

        for (int i = 0; i < src.length(); i++) {

            char current = src.charAt(i);

            switch (current) {

            case '{':
                openingCurlyBraces++;
                jsonProcessing = true;
                break;
            case '}':
                // Ignore a stray } that appears before any { was seen
                if (jsonProcessing) {
                    openingCurlyBraces--;
                }
                break;
            default:
                break;

            }

            if (jsonProcessing) {
                jsonBuilder.append(current);

                if (openingCurlyBraces == 0) {

                    jsonStrings.add(jsonBuilder.toString());
                    jsonBuilder = new StringBuilder();
                    jsonProcessing = false;

                }
            }

        }

        System.out.println(jsonStrings);

    }

}

Output of the list:

[{ 'key1':'val', 'key2':{ 'key2key1':'key2val1', 'key2key2':'key2val2'} }, {...}]
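
One limitation of plain brace counting is that a { or } inside a quoted value (for example 'x}y') is counted as structure. The following is a minimal sketch of one way to handle that, assuming values are quoted with either ' or " and that a backslash escapes the next character inside a string. The class name QuoteAwareJsonFinder and the method extractJsonBlocks are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareJsonFinder {

    // Hypothetical helper: returns every top-level {...} block found in src.
    static List<String> extractJsonBlocks(String src) {
        List<String> blocks = new ArrayList<>();
        int depth = 0;           // number of currently open '{'
        int start = -1;          // index of the '{' that opened the current block
        char quote = 0;          // quote character of the string we are inside, or 0
        boolean escaped = false; // true right after a backslash inside a string

        for (int i = 0; i < src.length(); i++) {
            char c = src.charAt(i);

            if (depth == 0) {
                // Outside any block only an opening brace matters, so quotes
                // in the surrounding free text are ignored.
                if (c == '{') {
                    depth = 1;
                    start = i;
                }
                continue;
            }

            if (quote != 0) {
                // Inside a quoted string: braces are data, not structure.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == quote) {
                    quote = 0;
                }
                continue;
            }

            if (c == '\'' || c == '"') {
                quote = c;
            } else if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                blocks.add(src.substring(start, i + 1));
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        String src = "ABCD JSON1: { 'a':'x}y', 'b':{ 'c':'{' } }, MNO JSON2{ 'd':'e' }";
        for (String block : extractJsonBlocks(src)) {
            System.out.println(block);
        }
    }
}
```

The sketch records the start index and uses substring instead of a StringBuilder, which is a design choice rather than a requirement. An unterminated block at the end of the input is silently dropped.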

CodePudding user response:

JSON uses double-quoted strings. If you cannot change this, you will have to replace single quotes with double quotes.

Finding the beginning of a JSON object is easy: You can use a regexp:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern re = Pattern.compile("JSON([0-9]+):");
Matcher matcher = re.matcher(input);
if (matcher.find()) {
   // etc...
}
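
As a runnable illustration of the marker search, here is a small sketch that lists the position just past each marker. Note that in the question the second marker is written JSON2{...} with no colon, so this sketch makes the colon optional; that is an assumption about the input, not something the question confirms. The class name JsonMarkerFinder and the method markerEnds are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JsonMarkerFinder {

    // Assumption: every object is introduced by "JSON<number>",
    // optionally followed by a colon.
    private static final Pattern MARKER = Pattern.compile("JSON([0-9]+):?");

    // Returns the index just past each marker, i.e. where the search
    // for the opening brace of that object can begin.
    static List<Integer> markerEnds(String input) {
        List<Integer> ends = new ArrayList<>();
        Matcher matcher = MARKER.matcher(input);
        while (matcher.find()) {
            ends.add(matcher.end());
        }
        return ends;
    }

    public static void main(String[] args) {
        String input = "ABCD XYZ JSON1: {'k':'v'}, MNO JSON2{...}";
        System.out.println(markerEnds(input));
    }
}
```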

Finding the end of the JSON is less easy: you cannot use a regexp because it can contain nested structures. This solution defines a method extractJson that finds the end of the object, and replaces single quotes with double quotes. The resulting string can be fed to your favorite JSON parser:

    Matcher matcher = re.matcher(input);
    int index = 0;
    while (matcher.find(index)) {
        int start = matcher.end();
        StringBuilder buf = new StringBuilder();
        index = extractJson(input, start, buf);
        String json = buf.toString();
        // do something with json
    }

...

private static int extractJson(String input, int index, StringBuilder buf) {
    int bracketLevel = 0;
    int st = 0;
    while (index < input.length()) {
        char c = input.charAt(index++);
        switch (st) {
            case 0:
                switch (c) {
                    case '{':
                        buf.append(c);
                        ++bracketLevel;
                        break;
                    case '}':
                        buf.append(c);
                        --bracketLevel;
                        if (bracketLevel <= 0) {
                            return index;
                        }
                        break;
                    case '\'':
                        buf.append('"');
                        st = 1;
                        break;
                    default:
                        buf.append(c);
                        break;
                }
                break;
            case 1:
                switch (c) {
                    case '\'':
                        buf.append('"');
                        st = 0;
                        break;
                    case '"':
                        buf.append('\\');
                        buf.append(c);
                        break;
                    case '\\':
                        st = 2;
                        break;
                    default:
                        buf.append(c);
                }
                break;
            case 2:
                switch (c) {
                    case '\'':
                        buf.append(c);
                        st = 1;
                        break;
                    default:
                        buf.append('\\');
                        buf.append(c);
                        st = 1;
                        break;
                }
                break;
        }
    }
    return index;
}