I'm struggling to build a JSON file from a CSV file. My CSV file looks as follows:
Shoot ID,Photo ID,Photo Name,Category,X1,Y1,X2,Y2
224,942,dsc_0001.jpg,0,3672,1271,3956,1417
224,942,dsc_0001.jpg,0,352,1401,497,1551
224,942,dsc_0001.jpg,0,181,1581,322,1690
224,943,dsc_0002.jpg,0,3073,1031,3351,1231
224,943,dsc_0002.jpg,0,3626,1811,3765,1901
224,943,dsc_0002.jpg,0,4784,1830,4900,1967
224,943,dsc_0002.jpg,0,1769,1714,1953,1872
224,943,dsc_0002.jpg,0,3173,1755,3305,1854
224,945,dsc_0004.jpg,0,1512,2012,1948,2304
224,945,dsc_0004.jpg,0,1488,1823,1766,2007
224,946,dsc_0005.jpg,0,3843,1812,4134,2029
I need to convert this to JSON that looks as follows:
[{
"image_id": 942,
"file_name": "dsc_0001",
"height": 2880,
"width": 5760,
"annotations": [{
"bbox": [3672.0, 1271.0, 3956.0, 1417.0],
"bbox_mode": 1,
"category_id": 0
}, {
"bbox": [352.0, 1401.0, 497.0, 1551.0],
"bbox_mode": 1,
"category_id": 0
}, {
"bbox": [181.0, 1581.0, 322.0, 1690.0],
"bbox_mode": 1,
"category_id": 0
}
]
}, {
"image_id": 943,
"file_name": "dsc_0002",
"height": 2880,
"width": 5760,
"annotations": [{
"bbox": [3073.0, 1031.0, 3351.0, 1231.0],
"bbox_mode": 1,
"category_id": 0
}, {
"bbox": [3626.0, 1811.0, 3765.0, 1901.0],
"bbox_mode": 1,
"category_id": 0
}, {
"bbox": [4784.0, 1830.0, 4900.0, 1967.0],
"bbox_mode": 1,
"category_id": 0
}, {
"bbox": [1769.0, 1714.0, 1953.0, 1872.0],
"bbox_mode": 1,
"category_id": 0
}, {
"bbox": [3173.0, 1755.0, 3305.0, 1854.0],
"bbox_mode": 1,
"category_id": 0
}
]
}, {
"image_id": 945,
"file_name": "dsc_0004",
"height": 2880,
"width": 5760,
"annotations": [{
"bbox": [1512.0, 2012.0, 1948.0, 2304.0],
"bbox_mode": 1,
"category_id": 0
}, {
"bbox": [1488.0, 1823.0, 1766.0, 2007.0],
"bbox_mode": 1,
"category_id": 0
}
]
}, {
"image_id": 946,
"file_name": "dsc_0005",
"height": 2880,
"width": 5760,
"annotations": [{
"bbox": [3843.0, 1812.0, 4134.0, 2029.0],
"bbox_mode": 1,
"category_id": 0
}
]
}
]
I have tried examples from the post "How to convert csv to json in python?". However, I struggled to add several "bbox" elements to the annotations list for the same image. Can anyone help me with this?
CodePudding user response:
You could loop through each photo and then through each of its annotations. The solution below assumes that a photo ID can have only one name and that height, width and bbox_mode are static (they were not provided in the data sample). I put your data into a CSV file called "photo_csv_to_dict.csv" and loaded it with pandas.
import pandas as pd
import numpy as np

df = pd.read_csv('photo_csv_to_dict.csv')
# get id name set (this assumes a photo ID can only have one unique name)
photo_id_name = df[['Photo ID', 'Photo Name']].drop_duplicates().reset_index(drop=True)
# convert numeric measures to a numpy matrix
measures = np.array(df[['Photo ID', 'Category', 'X1', 'Y1', 'X2', 'Y2']])
# initiate list for results
result_list = []
# loop through each photo
for i in range(len(photo_id_name)):
    # get photo ID from id name set (cast to a plain int so it can be serialized to JSON later)
    photo_id = int(photo_id_name['Photo ID'][i])
    # get photo name from id name set
    photo_name = photo_id_name['Photo Name'][i]
    # keep only the rows for this photo ID and drop the ID column
    filtered_measures = measures[measures[:, 0] == photo_id][:, 1:]
    # initiate list for picture annotations
    annotations_list = []
    # loop through each row of measures
    for row in filtered_measures:
        # create a dictionary for an annotation (numpy values converted to native Python types)
        annotations_dict = {"bbox": [float(v) for v in row[1:]],
                            "bbox_mode": 1,
                            "category_id": int(row[0])
                            }
        # add annotation dictionary to a list
        annotations_list.append(annotations_dict)
    # create a dictionary for a photo with its id, name and list of annotations
    result_dict = {"image_id": photo_id,
                   "file_name": photo_name,
                   "height": 2880,
                   "width": 5760,
                   "annotations": annotations_list}
    # add picture results to a list
    result_list.append(result_dict)
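If you also want the JSON text itself and the file names without the extension (as in your expected output), here is a minimal sketch of the same idea using groupby. The function name csv_to_records and its default arguments are my own invention; height, width and bbox_mode are still assumed to be constant, and the inline CSV_TEXT sample only stands in for your file so that the snippet runs on its own.

```python
import io
import json
import os

import pandas as pd

# Inline sample standing in for the CSV file (a few rows from the question).
CSV_TEXT = """Shoot ID,Photo ID,Photo Name,Category,X1,Y1,X2,Y2
224,942,dsc_0001.jpg,0,3672,1271,3956,1417
224,942,dsc_0001.jpg,0,352,1401,497,1551
224,943,dsc_0002.jpg,0,3073,1031,3351,1231
"""


def csv_to_records(source, height=2880, width=5760, bbox_mode=1):
    """Group the CSV rows by photo and return a list of JSON-ready dicts.

    height, width and bbox_mode are assumed constants (not present in the CSV).
    """
    df = pd.read_csv(source)
    records = []
    # sort=False keeps the photos in the order they first appear in the file
    for (photo_id, photo_name), group in df.groupby(["Photo ID", "Photo Name"], sort=False):
        rows = group[["Category", "X1", "Y1", "X2", "Y2"]].itertuples(index=False, name=None)
        annotations = [
            {
                # cast numpy values to native types so json can serialize them
                "bbox": [float(x1), float(y1), float(x2), float(y2)],
                "bbox_mode": bbox_mode,
                "category_id": int(category),
            }
            for category, x1, y1, x2, y2 in rows
        ]
        records.append({
            "image_id": int(photo_id),
            # drop the ".jpg" extension to match the desired output
            "file_name": os.path.splitext(photo_name)[0],
            "height": height,
            "width": width,
            "annotations": annotations,
        })
    return records


records = csv_to_records(io.StringIO(CSV_TEXT))
print(json.dumps(records, indent=2))
```

With a real file you would pass the path instead, e.g. csv_to_records('photo_csv_to_dict.csv'), and write the result with json.dump(records, f, indent=2) inside a with open(...) block.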