I need to convert a text file to JSON in this format:
{'annotations': [{u'image_id': 0, u'caption': u'the man is playing a guitar'},
{u'image_id': 0, u'caption': u'a man is playing a guitar'},
{u'image_id': 1, u'caption': u'a woman is slicing cucumbers'},
{u'image_id': 1, u'caption': u'the woman is slicing cucumbers'},
{u'image_id': 1, u'caption': u'a woman is cutting cucumbers'}]
}
The text file looks like this:
image_id 42 caption man is sitting on bench with his head
image_id 73 caption man is riding motorcycle on the street
image_id 74 caption cat laying on top of bed next to window
My code is:
import json

images = []
with open('1.txt') as f:
    for line in f:
        _, image_id, _, caption = line.split(maxsplit=3)
        images.append({"image_id": int(image_id), "caption": caption})

with open('r.json', "w") as f:
    json.dump(images, f)
but the result file contains a bare list:
[{"image_id": 42, "caption": "man is holding an umbrella in the rain\n"}, {"image_id": 73, "caption": "black and white cat sitting on top of car\n"},....]
The problem shows up when I try to read the result file:
imgToAnnsRES = {ann['image_id']: [] for ann in datasetRES['annotations']}
TypeError: list indices must be integers or slices, not str
CodePudding user response:
Assuming you already have the initial list of annotation dicts:
images = [{u'image_id': 0, u'caption': u'the man is playing a guitar'},
{u'image_id': 1, u'caption': u'a man is playing a guitar'},
{u'image_id': 2, u'caption': u'a woman is slicing cucumbers'},
{u'image_id': 3, u'caption': u'the woman is slicing cucumbers'},
{u'image_id': 4, u'caption': u'a woman is cutting cucumbers'}]
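The TypeError occurs because r.json holds a plain list at the top level, so datasetRES['annotations'] tries to index a list with a string. Wrap the list in a dict to define the datasetRES object: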
datasetRES = {'annotations': images}
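The same wrapping can be done when the file is written, so that r.json already has the 'annotations' key. Here is a minimal sketch, assuming every line has the form "image_id <id> caption <text>" as in the question. The helper names parse_captions and convert are made up for illustration. It also strips the trailing "\n" that showed up in the captions.

```python
import json

def parse_captions(lines):
    """Turn 'image_id <id> caption <text>' lines into annotation dicts."""
    annotations = []
    for line in lines:
        line = line.strip()  # drops the trailing newline seen in the output
        if not line:
            continue  # skip blank lines
        _, image_id, _, caption = line.split(maxsplit=3)
        annotations.append({"image_id": int(image_id), "caption": caption})
    return annotations

def convert(txt_path, json_path):
    """Read the text file and write {'annotations': [...]} as JSON."""
    with open(txt_path) as f:
        annotations = parse_captions(f)
    with open(json_path, "w") as f:
        json.dump({"annotations": annotations}, f)

# Small in-memory check using two lines from the question
sample = [
    "image_id 42 caption man is sitting on bench with his head\n",
    "image_id 73 caption man is riding motorcycle on the street\n",
]
datasetRES = {"annotations": parse_captions(sample)}
```

With the files from the question you would call convert('1.txt', 'r.json'), and json.load on r.json would then return a dict that can be indexed with 'annotations'.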
Now you can use the following code:
imgToAnnsRES = {ann['image_id']: [] for ann in datasetRES['annotations']}
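Note that this comprehension only creates an empty list for each image_id. If the goal is to collect every caption under its image, one possible sketch uses collections.defaultdict. This is an assumption about what happens next in your code, and the sample data here is shortened from the question.

```python
from collections import defaultdict

# Shortened sample in the target format from the question
datasetRES = {"annotations": [
    {"image_id": 0, "caption": "the man is playing a guitar"},
    {"image_id": 0, "caption": "a man is playing a guitar"},
    {"image_id": 1, "caption": "a woman is slicing cucumbers"},
]}

# Group annotations by image_id; missing keys start as empty lists
imgToAnnsRES = defaultdict(list)
for ann in datasetRES["annotations"]:
    imgToAnnsRES[ann["image_id"]].append(ann)
```

After the loop, imgToAnnsRES[0] holds both guitar captions and imgToAnnsRES[1] holds the single cucumber caption.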