How can I convert text to a JSON file?

Time:10-01

I need to create a JSON file with this structure

[{"image_id": 0873, "caption": "clock tower with a clock on top of it"}, {"image_id": 1083, "caption": "two zebras are standing in the grass in the grass"} , .....

from this file which contains

image_id 0873  caption clock tower with a clock on top of it 
image_id 1083  caption two zebras are standing in the grass in the grass 
image_id 1270  caption baseball player is swinging a bat at the ball  
image_id 1436  caption man is sitting on the bed with laptop 

How can I start to do that?

CodePudding user response:

Assuming every line looks like: image_id {image_id} caption {caption}, you can use the str method split(maxsplit=number) to split the line into four parts. With maxsplit=3, the caption stays in one piece as the fourth part.

line = "image_id 0873  caption clock tower with a clock on top of it"
_, image_id, _, caption = line.split(maxsplit=3)
# Now image_id = "0873", caption = "clock tower with a clock on top of it"
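
One detail worth checking when the line comes straight from a file: once maxsplit is reached, the last part is the unsplit remainder, so it keeps any trailing spaces and the newline. A small sketch of that behaviour (the raw string here is just the first sample line with its line ending added back):

```python
# Raw line as it would be read from the file: trailing space and newline included.
raw = "image_id 0873  caption clock tower with a clock on top of it \n"

# With maxsplit=3 the fourth part is the untouched remainder of the line,
# so the trailing whitespace is still attached to the caption.
_, image_id, _, caption = raw.split(maxsplit=3)
print(repr(caption))

# Stripping the line first gives a clean caption.
_, image_id, _, clean_caption = raw.strip().split(maxsplit=3)
print(repr(clean_caption))
```

Calling strip() on the line before splitting is probably the simplest way to avoid stray whitespace in the output.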

For iterating over all the file's lines:

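images = []
with open(path) as f:
    for line in f:
        line = line.strip()  # drop the trailing newline and spaces
        if not line:
            continue  # skip blank lines
        _, image_id, _, caption = line.split(maxsplit=3)
        # int() drops the leading zero: "0873" becomes 873
        images.append({"image_id": int(image_id), "caption": caption})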

For saving a variable into JSON file, you can use the json module:

import json
with open(path_to_save, "w") as f:
    json.dump(images, f)
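
Putting the pieces together, here is a minimal end-to-end sketch that works on a list of lines instead of a file, so it can be tried without any input file. The function name parse_captions and the sample list are made up for illustration; the parsing itself is the same split(maxsplit=3) idea as above.

```python
import json


def parse_captions(lines):
    """Turn 'image_id <id> caption <text>' lines into a list of dicts."""
    images = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        _, image_id, _, caption = line.split(maxsplit=3)
        # int("0873") is 873, so the leading zero is not kept
        images.append({"image_id": int(image_id), "caption": caption})
    return images


# Hypothetical sample input: two lines from the question plus a blank line.
sample = [
    "image_id 0873  caption clock tower with a clock on top of it \n",
    "image_id 1083  caption two zebras are standing in the grass in the grass \n",
    "\n",
]

images = parse_captions(sample)
json_text = json.dumps(images)
print(json_text)
```

To write to disk instead, pass an open file to json.dump as shown above.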

CodePudding user response:

This should do the trick:

import json

# read the input lines
with open("file_with_data.txt") as f:
    file_lines = f.readlines()

json_data = []
for line in file_lines:
    # split on any run of whitespace; this also drops the trailing newline
    splt_line = line.split()
    if len(splt_line) < 4:
        continue  # skip blank or malformed lines
    # build a single dict from the line data; image_id stays a string here
    small_json = {splt_line[0]: splt_line[1], splt_line[2]: " ".join(splt_line[3:])}
    # add the dict to your list
    json_data.append(small_json)

# now dump the list of dicts to a .json file
with open("json_dump.json", "w") as f:
    json.dump(json_data, f)
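
Note that this version keeps image_id as a string, which matters if the leading zero in 0873 is significant. JSON numbers cannot have a leading zero, so the structure in the question is not valid JSON exactly as written. A short sketch of the difference, using only the standard json module (the variable names are just for illustration):

```python
import json

# A string keeps the leading zero; converting to int drops it.
as_string = json.dumps({"image_id": "0873"})
as_int = json.dumps({"image_id": int("0873")})
print(as_string)
print(as_int)

# A bare 0873 is rejected by Python's JSON parser.
try:
    json.loads('{"image_id": 0873}')
    leading_zero_accepted = True
except json.JSONDecodeError:
    leading_zero_accepted = False
print(leading_zero_accepted)
```

So the choice is between a numeric id without the zero and a string id with it.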

CodePudding user response:

Try a regular expression; it makes more complicated patterns easier to handle. Below is an extended version of @Kozubi's answer:

    import json
    import re

    # re.X (verbose mode) ignores whitespace and newlines inside the pattern
    pattern = re.compile(r"""image_id\s+(?P<image_id>[0-9]+)\s+
                             caption\s+(?P<caption>.*)$
                             """, re.X)

    json_data = []
    with open("test.txt") as f:
        for line in f:
            m = pattern.match(line.strip())
            if m:
                json_data.append({
                    "image_id": int(m.group('image_id')),
                    "caption": m.group('caption')
                    })

    print(json.dumps(json_data, indent=4))
    with open("json_dump.json", 'w') as out:
        json.dump(json_data, out, indent=4)
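
One advantage of the regex approach is that lines which do not fit the pattern can be reported instead of crashing the script. A rough sketch of that idea, assuming the pattern uses \s+ and [0-9]+ as quantifiers; the function name parse_with_regex and the second sample line are invented for the example:

```python
import re

# Same idea as the verbose pattern, written on one line.
PATTERN = re.compile(r"image_id\s+(?P<image_id>[0-9]+)\s+caption\s+(?P<caption>.*)$")


def parse_with_regex(lines):
    """Return (parsed dicts, 1-based numbers of lines that did not match)."""
    parsed, skipped = [], []
    for number, line in enumerate(lines, start=1):
        m = PATTERN.match(line.strip())
        if m:
            parsed.append({
                "image_id": int(m.group("image_id")),
                "caption": m.group("caption"),
            })
        else:
            skipped.append(number)
    return parsed, skipped


parsed, skipped = parse_with_regex([
    "image_id 1270  caption baseball player is swinging a bat at the ball  \n",
    "not a caption line\n",
])
print(parsed)
print(skipped)
```

Collecting the skipped line numbers makes it easy to see whether the input really follows the assumed format.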

CodePudding user response:

Go to https://anyconv.com/txt-to-json-converter/ in a web browser. You can use any web browser to convert TXT to JSON. Click Choose File. It's centered on the page; clicking it will bring up your file manager.
