How to fix the output for converting to JSON

Time:06-07

I wrote Python code that converts a file containing these objects to JSON. It produces valid JSON, but the output is not exactly what I need.

{ 
    name: (sindey, crosby)
    game: "Hockey"
    type: athlete
},
{ 
    name: (wayne, gretzky)
    game: "Ice Hockey"
    type: athlete
}

Code:

import json

f = open("log.file", "r")
content = f.read()
splitcontent = content.splitlines()

d = []
for line in splitcontent:
    appendage = {}
    if ('}' in line) or ('{' in line):
        # Append a just-created record and start a new one
        continue

    d.append(appendage)
    key, val = line.split(':')

    if val.endswith(','):
        # strip a trailing comma
        val = val[:-1]
    appendage[key] = val


with open("json_log.json", 'w') as file:
    file.write((json.dumps(d, indent=4, sort_keys=False)))

Desired output:

[
    { 
        "name": "(sindey, crosby)",
        "game": "Hockey",
        "type": "athlete"
    },
    { 
        "name": "(wayne, gretzky)",
        "game": "Ice Hockey",
        "type": "athlete"
    }
]

But I'm getting:

   [
 {
  "    name": " (sindey, crosby)"
 },
 {
  "    game": " \"Hockey\""
 },
 {
  "    type": " athlete"
 },
 {
  "    name": " (wayne, gretzky)"
 },
 {
  "    game": " \"Ice Hockey\""
 },
 {
  "    type": " athlete"
 }
]

Is there a way to get the desired output, with one {} per record instead of one around each individual line?

CodePudding user response:

It's usually a good idea to split parsing into simpler tasks, e.g. first parse records, then parse fields.

I'm skipping the file handling and using a text variable:

intxt = """
{ 
    name: (sindey, crosby)
    game: "Hockey"
    type: athlete
},
{ 
    name: (wayne, gretzky)
    game: "Ice Hockey"
    type: athlete
}
"""

Then create a function that can yield all lines that are part of a record:

import json

def parse_records(txt):
    reclines = []
    for line in txt.split('\n'):
        if ':' not in line:
            if reclines:
                yield reclines
                reclines = []
        else:
            reclines.append(line)

and a function that takes those lines and parses each key/value pair:

def parse_fields(reclines):
    res = {}
    for line in reclines:
        key, val = line.strip().rstrip(',').split(':', 1)
        res[key.strip()] = val.strip()
    return res

the main function becomes trivial:

res = []
for rec in parse_records(intxt):
    res.append(parse_fields(rec))

print(json.dumps(res, indent=4))

the output, close to what you want (the game values keep their original quotes, which JSON escapes as \"):

[
    {
        "name": "(sindey, crosby)",
        "game": "\"Hockey\"",
        "type": "athlete"
    },
    {
        "name": "(wayne, gretzky)",
        "game": "\"Ice Hockey\"",
        "type": "athlete"
    }
]
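
The escaped quotes in the game values appear because the quotes in the log are part of the parsed string, and json.dumps has to escape them. A minimal sketch of the effect and of one way to remove it, assuming the quotes only ever wrap the whole value (the variable name raw is just for illustration):

```python
import json

raw = '"Hockey"'  # the value as it appears in the log, quotes included

# The quotes are part of the string, so json.dumps escapes them.
print(json.dumps({"game": raw}))             # {"game": "\"Hockey\""}

# Stripping the surrounding quotes first gives the plain value.
print(json.dumps({"game": raw.strip('"')}))  # {"game": "Hockey"}
```

Applying the same .strip('"') to val in parse_fields should produce the desired output exactly.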

The parsing functions can of course be made better, but you get the idea.
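
One possible refinement, as a sketch rather than a definitive version: find each record with a regular expression instead of scanning for lines without a colon, and strip the surrounding quotes from values. The function name parse_log is hypothetical, and the approach assumes that values never contain braces themselves.

```python
import json
import re

def parse_log(text):
    """Parse '{ key: value ... }' blocks into a list of dicts."""
    records = []
    # Each record body is whatever sits between a '{' and the next '}'.
    for body in re.findall(r"\{(.*?)\}", text, flags=re.DOTALL):
        record = {}
        for line in body.splitlines():
            if ":" not in line:
                continue  # skip blank or malformed lines
            # Split on the first colon only, in case the value contains one.
            key, val = line.split(":", 1)
            # Drop whitespace, a trailing comma, and surrounding double quotes.
            record[key.strip()] = val.strip().rstrip(",").strip().strip('"')
        records.append(record)
    return records

sample = """
{
    name: (sindey, crosby)
    game: "Hockey"
    type: athlete
},
{
    name: (wayne, gretzky)
    game: "Ice Hockey"
    type: athlete
}
"""

print(json.dumps(parse_log(sample), indent=4))
```

Because the record boundaries come from the braces, blank lines inside or between records are ignored rather than treated as separators.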

CodePudding user response:

You're right, I hadn't checked the output properly. I've reworked the logic, and the output is now as expected.

import json

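# Read the whole log; the with-block closes the file afterwards
with open("log.file", "r") as f:
    content = f.read()

d = []
appendage = {}
for line in content.splitlines():
    if "{" in line:
        # Opening brace: start a new record
        appendage = {}
    elif "}" in line:
        # Closing brace: the record is complete, so store it
        d.append(appendage)
    elif ":" in line:
        # Split on the first colon only, in case the value contains one
        key, val = line.split(":", 1)
        # Strip whitespace and the quotes around values like "Hockey"
        appendage[key.strip()] = val.strip().strip('"')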

with open("json_log.json", 'w') as file:
    file.write((json.dumps(d, indent=4, sort_keys=False)))