I have an image name, an image path, and its corresponding CNN feature vector (a numpy array). There are many images, so I have lists of names, paths, and vectors (each vector is a numpy array of shape (1, 2048, 14, 14) with dtype float32). I need to store and access them efficiently. I was thinking of storing them in a CSV file, but when I did that, I was unable to convert the str back into a numpy array. Kindly let me know how to proceed with this, thank you.
import pandas as pd

data = pd.DataFrame(columns=['name', 'path', 'vector'])
data['name'] = image_names
data['path'] = image_paths
for i in range(len(image_paths)):
    out = encoder(image_paths[i])  # shape (1, 2048, 14, 14)
    data['vector'][i] = out
# to_csv writes each array as its (truncated) string representation
data.to_csv('encoding.csv', sep=',', na_rep='Unknown')
dr = pd.read_csv('encoding.csv')
for i in range(dr.shape[0]):
    c = dr['vector'][i]
c is a str object. I am unsure how to convert it back to a numpy array.
CodePudding user response:
You could save each numpy array with numpy.save, as shown in the NumPy docs. You could save the image name and image path in a .json file (and give the .npy file the same base name as the .json file to allow for easier searching).
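A minimal sketch of that idea, assuming each vector is already a NumPy array and that image names are unique once the extension is dropped. The helper names save_entry and load_entry and the output directory layout are hypothetical, chosen only for illustration:

```python
import json
from pathlib import Path

import numpy as np


def save_entry(out_dir, name, path, vector):
    """Write <stem>.npy (the vector) and <stem>.json (name and path) side by side.

    Hypothetical helper: assumes `vector` is a NumPy array and that the
    stem of `name` is unique within `out_dir`.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(name).stem  # e.g. "cat.jpg" -> "cat"
    # np.save stores shape and dtype in the file, so nothing is lost.
    np.save(out_dir / f"{stem}.npy", vector)
    with open(out_dir / f"{stem}.json", "w") as f:
        json.dump({"name": name, "path": path}, f)


def load_entry(out_dir, name):
    """Return (metadata dict, vector) for a previously saved image name."""
    out_dir = Path(out_dir)
    stem = Path(name).stem
    vector = np.load(out_dir / f"{stem}.npy")
    with open(out_dir / f"{stem}.json") as f:
        meta = json.load(f)
    return meta, vector
```

In the question's loop this would be called once per image, for example save_entry('encodings', image_names[i], image_paths[i], out). Unlike the CSV round trip, the loaded array comes back with its original shape and float32 dtype rather than as a str.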