How to get only first element in list contained in string?

Time:04-19

I've got a long string that contains a list, like this example:

'[{"ex1": 0, "ex2":1}, {"ex3": 2, "ex4":3}]'

I can use json5.loads and then get the first element by using [0] on the list, but json5.loads takes a long time for longer strings. Is there a way to get just the first element without loading the entire list? (In this example it would be {"ex1": 0, "ex2":1}.) Splitting by commas doesn't work for me since the dictionaries in the list contain commas themselves. Thanks.

CodePudding user response:

If it'll definitely be that format, you can just search for the beginning and ending brackets.

mystr = '[{"ex1": 0, "ex2":1}, {"ex3": 2, "ex4":3}]'
first = mystr.index("{")
last = mystr.index("}")
extracted = mystr[first:last + 1]
print(extracted)

This prints {"ex1": 0, "ex2":1}

For a more complicated string:

mystr = '[{"ex1": {"ex1.33": -1, "ex1.66": -2}, "ex2":1}, {"ex3": 2, "ex4":3}]'
n_open = 0
n_close = 0
first = mystr.index("{")
for ii in range(len(mystr)):
    if mystr[ii] == "{":
        n_open += 1
    elif mystr[ii] == "}":
        n_close += 1
    # Stop once every opened brace has been closed
    if n_open > 0 and n_open == n_close:
        break
extracted = mystr[first:ii + 1]
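
One caveat with plain brace counting: a "{" or "}" inside a quoted string value would throw the count off. Below is a minimal sketch of a string-aware variant. The function name first_object_span is made up for illustration, and it assumes strings are double-quoted with backslash escapes (strict JSON style); json5 input with single-quoted strings or comments would need extra handling.

```python
def first_object_span(text):
    """Return the substring of the first top-level {...} in text, or None.

    Assumes double-quoted strings with backslash escapes (strict JSON style).
    """
    depth = 0          # current brace nesting level
    start = None       # index of the opening brace of the first object
    in_string = False  # True while scanning inside a "..." literal
    escaped = False    # True right after a backslash inside a string
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue  # braces inside strings are ignored
        if ch == '"':
            in_string = True
        elif ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    return None


tricky = '[{"ex1": "a } inside", "ex2": {"n": 1}}, {"ex3": 2}]'
print(first_object_span(tricky))
```

The scan stops as soon as the first object closes, so the rest of a long string is never touched.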

CodePudding user response:

Does your string work with ast.literal_eval()? If it does, you could do

import ast

# s is the input string
obj = ast.literal_eval(s)
# obj[0] gives the first dict

If not, you could loop through the string character by character and yield a substring each time the number of open brackets equals the number of close brackets.

def get_top_level_dict_str(s):
  open_br = 0
  close_br = 0
  open_index = 0
  for i, c in enumerate(s):
    if c == '{':
        if open_br == 0: open_index = i
        open_br += 1
    elif c == '}':
        close_br += 1
        if open_br > 0 and open_br == close_br:
            yield s[open_index:i + 1]
            open_br = close_br = 0

If you want to parse the resulting substrings to objects, you could use json5 like you already do, which is probably faster on the smaller string, or use ast.literal_eval():

x = get_top_level_dict_str(s)
# next(x) gives the substring
# then use json5 or ast.literal_eval()
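
If the string happens to be strict JSON (an assumption; json5-only syntax such as single quotes, comments, or trailing commas will not parse this way), the standard library can parse just the first value and stop there. A rough sketch using json.JSONDecoder.raw_decode, with first_element as a hypothetical helper name:

```python
import json


def first_element(text):
    """Parse only the first element of a JSON list held in text.

    Assumes text is strict JSON that starts with a non-empty list.
    """
    # Move just past the opening bracket of the list
    pos = text.index("[") + 1
    # raw_decode does not skip leading whitespace, so do it by hand
    while text[pos].isspace():
        pos += 1
    # Returns (parsed_value, index_where_parsing_stopped)
    value, _end = json.JSONDecoder().raw_decode(text, pos)
    return value


s = '[{"ex1": 0, "ex2":1}, {"ex3": 2, "ex4":3}]'
print(first_element(s))
```

The decoder stops at the end of the first value, so the remainder of the string is never parsed. An empty list raises json.JSONDecodeError here, which a caller may want to catch.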