I have a string in the following format:
"ABCD XYZ
JSON1: {
'key1':'val',
'key2':{
'key2key1':'key2val1',
'key2key2':'key2val2'}
},
MNO
PQRS
JSON2{...}"
I need to extract each JSON object from such a string. I do not know what text precedes each JSON object. How can I extract them?
CodePudding user response:
Here is a quick example. The idea is to find the { character.
From there we know that we are processing a JSON string, so we start storing the characters that follow.
Each time we find a { we increment a counter, and each time we find a } we decrement it.
When the counter gets back to 0, we know this JSON has ended, so we store it and move on to the next JSON string.
import java.util.ArrayList;
import java.util.List;
public class FindJson {
    public static void main(final String[] args) {
        String src = "ABCD XYZ JSON1: { 'key1':'val', 'key2':{ 'key2key1':'key2val1', 'key2key2':'key2val2'} }"
                + ", MNO PQRS JSON2{...}";
        StringBuilder jsonBuilder = new StringBuilder();
        List<String> jsonStrings = new ArrayList<>();
        int openingCurlyBraces = 0;      // number of braces currently open
        boolean jsonProcessing = false;  // true while we are inside a JSON object
        for (int i = 0; i < src.length(); i++) {
            char current = src.charAt(i);
            switch (current) {
                case '{':
                    openingCurlyBraces++;
                    jsonProcessing = true;
                    break;
                case '}':
                    openingCurlyBraces--;
                    break;
                default:
                    break;
            }
            if (jsonProcessing) {
                jsonBuilder.append(current);
                if (openingCurlyBraces == 0) {
                    // The outermost brace was just closed: store this JSON and reset
                    jsonStrings.add(jsonBuilder.toString());
                    jsonBuilder = new StringBuilder();
                    jsonProcessing = false;
                }
            }
        }
        System.out.println(jsonStrings);
    }
}
Output of the list:
[{ 'key1':'val', 'key2':{ 'key2key1':'key2val1', 'key2key2':'key2val2'} }, {...}]
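One caveat with plain brace counting: a { or } that appears inside a string value also changes the counter. A minimal sketch that skips braces inside quoted strings might look like the following. The class name QuoteAwareJsonFinder and the method extract are made up for illustration, and the sketch assumes strings are delimited by ' or " and use backslash escapes.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareJsonFinder {

    // Hypothetical helper: returns every top-level {...} block in src,
    // ignoring braces that appear inside quoted strings within a block.
    static List<String> extract(String src) {
        List<String> result = new ArrayList<>();
        int depth = 0;   // number of braces currently open
        int start = -1;  // index of the brace that opened the current block
        char quote = 0;  // active quote character, or 0 when outside a string
        for (int i = 0; i < src.length(); i++) {
            char c = src.charAt(i);
            if (quote != 0) {
                if (c == '\\') {
                    i++;            // skip the escaped character
                } else if (c == quote) {
                    quote = 0;      // closing quote
                }
            } else if (depth > 0 && (c == '\'' || c == '"')) {
                quote = c;          // quotes only matter inside a block
            } else if (c == '{') {
                if (depth == 0) {
                    start = i;
                }
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    result.add(src.substring(start, i + 1));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String src = "ABCD XYZ JSON1: { 'key1':'a}b', 'key2':{ 'k':'v' } }, MNO JSON2{ 'x':'{' }";
        System.out.println(extract(src));
    }
}
```

Quotes in the text between blocks are deliberately ignored, so an apostrophe in the surrounding text does not hide the next opening brace.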
CodePudding user response:
JSON uses double-quoted strings. If you cannot change this, you will have to replace single quotes with double quotes.
Finding the beginning of a JSON object is easy: You can use a regexp:
Pattern re = Pattern.compile("JSON([0-9]+):");
Matcher matcher = re.matcher(input);
if (matcher.find()) {
    // etc...
}
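As a rough, self-contained illustration of locating the markers, the sketch below collects the offset right after each match. It assumes every object is introduced by a JSON-plus-number marker, as in the question. The optional colon (:?) is an assumption added here so that JSON2{ is also matched, and JsonMarkerFinder and markerEnds are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JsonMarkerFinder {
    // "JSON" followed by digits; the trailing ":?" is an assumption made for this sketch
    private static final Pattern MARKER = Pattern.compile("JSON([0-9]+):?");

    // Hypothetical helper: offsets immediately after each marker, where the JSON text begins
    static List<Integer> markerEnds(String input) {
        List<Integer> ends = new ArrayList<>();
        Matcher matcher = MARKER.matcher(input);
        while (matcher.find()) {
            ends.add(matcher.end());
        }
        return ends;
    }

    public static void main(String[] args) {
        String input = "ABCD XYZ JSON1: { 'a':'b' }, MNO PQRS JSON2{...}";
        for (int end : markerEnds(input)) {
            System.out.println("JSON text starts at offset " + end);
        }
    }
}
```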
Finding the end of the JSON is less easy: you cannot use a regexp because the object can contain nested structures. This solution defines a method extractJson that finds the end of the object and replaces single quotes with double quotes. The resulting string can be fed to your favorite JSON parser:
Matcher matcher = re.matcher(input);
int index = 0;
while (matcher.find(index)) {
    int start = matcher.end();
    StringBuilder buf = new StringBuilder();
    index = extractJson(input, start, buf);
    String json = buf.toString();
    // do something with json
}
...
private static int extractJson(String input, int index, StringBuilder buf) {
    int bracketLevel = 0;
    int st = 0; // 0: outside a string, 1: inside a single-quoted string, 2: after a backslash
    while (index < input.length()) {
        char c = input.charAt(index++);
        switch (st) {
            case 0:
                switch (c) {
                    case '{':
                        buf.append(c);
                        ++bracketLevel;
                        break;
                    case '}':
                        buf.append(c);
                        --bracketLevel;
                        if (bracketLevel <= 0) {
                            return index; // end of the outermost object
                        }
                        break;
                    case '\'':
                        buf.append('"'); // opening single quote becomes a double quote
                        st = 1;
                        break;
                    default:
                        buf.append(c);
                        break;
                }
                break;
            case 1:
                switch (c) {
                    case '\'':
                        buf.append('"'); // closing single quote becomes a double quote
                        st = 0;
                        break;
                    case '"':
                        buf.append('\\'); // a literal double quote must now be escaped
                        buf.append(c);
                        break;
                    case '\\':
                        st = 2;
                        break;
                    default:
                        buf.append(c);
                        break;
                }
                break;
            case 2:
                switch (c) {
                    case '\'':
                        buf.append(c); // \' needs no escaping inside double quotes
                        st = 1;
                        break;
                    default:
                        buf.append('\\'); // keep any other escape sequence as-is
                        buf.append(c);
                        st = 1;
                        break;
                }
                break;
        }
    }
    return index;
}
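For reference, here is a compact, runnable sketch of the same approach with named states instead of the numbers 0, 1 and 2. It is a condensed variant for illustration, not the code above verbatim: the class name JsonExtractorDemo, the method extractAll and the State enum are hypothetical, and the optional colon (:?) in the pattern is an assumption so that a marker such as JSON2{ is matched too.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JsonExtractorDemo {
    // Named equivalents of the numeric states 0, 1 and 2 (hypothetical names)
    private enum State { OUTSIDE_STRING, IN_STRING, AFTER_BACKSLASH }

    // The optional colon is an assumption made for this sketch
    private static final Pattern MARKER = Pattern.compile("JSON([0-9]+):?");

    // Hypothetical helper: returns every extracted object with double-quoted strings
    static List<String> extractAll(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = MARKER.matcher(input);
        int index = 0;
        while (matcher.find(index)) {
            StringBuilder buf = new StringBuilder();
            index = extractJson(input, matcher.end(), buf);
            result.add(buf.toString().trim());
        }
        return result;
    }

    // Copies one object into buf, converting single quotes to double quotes,
    // and returns the index just past its closing brace.
    static int extractJson(String input, int index, StringBuilder buf) {
        int depth = 0;
        State state = State.OUTSIDE_STRING;
        while (index < input.length()) {
            char c = input.charAt(index++);
            switch (state) {
                case OUTSIDE_STRING:
                    if (c == '\'') {
                        buf.append('"');
                        state = State.IN_STRING;
                        break;
                    }
                    buf.append(c);
                    if (c == '{') {
                        depth++;
                    } else if (c == '}' && --depth <= 0) {
                        return index;
                    }
                    break;
                case IN_STRING:
                    if (c == '\'') {
                        buf.append('"');
                        state = State.OUTSIDE_STRING;
                    } else if (c == '"') {
                        buf.append("\\\""); // escape a literal double quote
                    } else if (c == '\\') {
                        state = State.AFTER_BACKSLASH;
                    } else {
                        buf.append(c);
                    }
                    break;
                case AFTER_BACKSLASH:
                    if (c != '\'') {
                        buf.append('\\'); // keep escapes other than \'
                    }
                    buf.append(c);
                    state = State.IN_STRING;
                    break;
            }
        }
        return index;
    }

    public static void main(String[] args) {
        String input = "X JSON1: { 'k':'it\\'s', 'n':{ 'a':'b' } }, Y JSON2{ 'z':'1' }";
        extractAll(input).forEach(System.out::println);
    }
}
```

Because the search resumes at the index returned by extractJson, a marker that happens to appear inside an already extracted object is not matched again.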