I am trying to create a dictionary from two pandas dataframes. The following is a snapshot of the dataframe that is supposed to hold the keys:
C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000005.jpg
C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000005.jpg
C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000005.jpg
C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000005.jpg
C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000005.jpg
C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000007.jpg
C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000009.jpg
C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000009.jpg
C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000009.jpg
C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000009.jpg
C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000012.jpg
And the following dataframe snapshot holds the values for the dictionary:
324,339,263,211,9
253,372,165,264,9
67,374,5,244,9
295,299,241,194,9
So I want to combine each pair of corresponding rows as a key and a value in one dictionary. This is what I tried:
import pandas as pd
import numpy as np
image_files=pd.read_csv('image_files.csv')
file = pd.read_csv('Training_dataset.csv')
image_anno_dict={}
for image_file, row in zip(image_files,file.iterrows()):
    image_anno_dict[image_file]=np.array(row)
My expected output:
{'C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000005.jpg': [324,339,263,211,9],
'C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000005.jpg': [253,372,165,264,9],
'C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000005.jpg': [67,374,5,244,9],
.
.
.
}
But the code works only for the first row. Any suggestions for a solution?
print(image_files.head(5)):
C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000005.jpg
0 C:/Users/Yaman/PycharmProjects/Mindsporeprojec...
1 C:/Users/Yaman/PycharmProjects/Mindsporeprojec...
2 C:/Users/Yaman/PycharmProjects/Mindsporeprojec...
3 C:/Users/Yaman/PycharmProjects/Mindsporeprojec...
4 C:/Users/Yaman/PycharmProjects/Mindsporeprojec...
print(file.head(5)):
0 1 2 3 4
0 324 339 263 211 9
1 253 372 165 264 9
2 67 374 5 244 9
3 295 299 241 194 9
4 312 220 277 186 9
CodePudding user response:
You can use a pandas Series to combine the two dataframes and then convert it to a dictionary by calling its to_dict method. Here is working sample code:
import pandas as pd
df1 = pd.DataFrame({'df1Keys':['ab','bc','c','df','efg']})
df2 = pd.DataFrame({'df2Vlues':[1,25,3,84,545]})
#method 1
print(pd.Series(df2.df2Vlues.values,index=df1.df1Keys).to_dict())
#method 2
print(dict(zip(df1.df1Keys,df2.df2Vlues)))
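The sample above pairs each key with a single value. A minimal sketch of the same zip approach when each value is a whole row of several columns, as in the question, might look like this. The frame names, column name, and file names here are hypothetical stand-ins, and the keys are assumed to be unique:

```python
import pandas as pd

# Hypothetical stand-ins for the two dataframes; names are illustrative only.
keys_df = pd.DataFrame({"path": ["a.jpg", "b.jpg", "c.jpg"]})
values_df = pd.DataFrame([[324, 339, 263, 211, 9],
                          [253, 372, 165, 264, 9],
                          [67, 374, 5, 244, 9]])

# to_numpy().tolist() turns each row into a plain Python list,
# so zip pairs every key with its full row of values.
row_dict = dict(zip(keys_df["path"], values_df.to_numpy().tolist()))
print(row_dict)
```

If the keys repeat, as the image paths in the question do, a plain dict keeps only one row per key.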
CodePudding user response:
import pandas as pd
import numpy as np
image_files = pd.read_csv('image_files.csv', header=None)
file = pd.read_csv('Training_dataset.csv')
image_anno_list = list(zip(image_files[0], file.apply(np.array, axis=1)))
Output:
>>> image_anno_list
[('C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\\000005.jpg',
array([324, 339, 263, 211, 9])),
('C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\\000005.jpg',
array([253, 372, 165, 264, 9])),
('C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\\000005.jpg',
array([ 67, 374, 5, 244, 9])),
('C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\\000005.jpg',
array([295, 299, 241, 194, 9])),
('C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\\000005.jpg',
array([312, 220, 277, 186, 9]))]
If you use a dict instead, the repeated keys overwrite one another, so only the last row for each path is kept:
image_anno_dict = dict(zip(image_files[0], file.apply(np.array, axis=1)))
>>> image_anno_dict
{'C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\\000005.jpg':
array([312, 220, 277, 186, 9])}
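Two behaviours seem to explain both the single entry in the question and the collapsed dict here. The sketch below uses a small hypothetical frame (the column name and file names are made up) to show them in isolation: iterating a DataFrame directly yields its column labels rather than its rows, and a dict stores one value per key.

```python
import pandas as pd

# Hypothetical one-column frame with a repeated key.
df = pd.DataFrame({"path": ["a.jpg", "a.jpg", "b.jpg"]})

# Iterating a DataFrame directly yields its column labels, not its rows.
labels = list(df)

# zip stops at the shortest input, so a one-column frame produces one pair.
pairs = list(zip(df, [10, 20, 30]))

# A dict keeps one value per key: the later 'a.jpg' replaces the earlier one.
d = dict(zip(df["path"], [10, 20, 30]))

print(labels)
print(pairs)
print(d)
```

Selecting the column (`df["path"]`, or `image_files[0]` after reading with `header=None`) is what makes the loop visit every row.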
CodePudding user response:
You can create the dictionary with collections.defaultdict, using list as the default, like below:
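from collections import defaultdict
import pandas as pd
import numpy as np
# header=None keeps the first path as data instead of turning it into the column name
image_files=pd.read_csv('image_files.csv', header=None)
file = pd.read_csv('Training_dataset.csv')
image_anno_dict=defaultdict(list)
# iterate over the path column itself; iterrows() yields (index, row) pairs
for image_file, (_, row) in zip(image_files[0],file.iterrows()):
    image_anno_dict[image_file].append(np.array(row))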
Output:
{'C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000005.jpg':
[[324,339,263,211,9], [253,372,165,264,9], [67,374,5,244,9], ...],
...,
'C:/Users/Yaman/PycharmProjects/Mindsporeproject/JPEGImages_train\000009.jpg':
[[253,372,165,264,9], [67,374,5,244,9], ...],
...
}
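For anyone who wants to try the grouping idea without the original CSV files, here is a self-contained sketch. It assumes in-memory stand-ins for the two files (the file names `img_a.jpg` and `img_b.jpg` are hypothetical) and adds one optional step that is not part of the answer above: stacking each image's rows into a single 2-D array.

```python
from collections import defaultdict
import io

import numpy as np
import pandas as pd

# Hypothetical in-memory stand-ins for image_files.csv and Training_dataset.csv.
paths_csv = io.StringIO("img_a.jpg\nimg_a.jpg\nimg_b.jpg\n")
boxes_csv = io.StringIO("324,339,263,211,9\n253,372,165,264,9\n67,374,5,244,9\n")

# header=None so the first line of each file is treated as data.
image_files = pd.read_csv(paths_csv, header=None)
boxes = pd.read_csv(boxes_csv, header=None)

# Collect every annotation row under its image path.
grouped = defaultdict(list)
for path, row in zip(image_files[0], boxes.to_numpy()):
    grouped[path].append(row)

# Optional: stack each image's rows into one (n_rows, 5) array.
image_anno_dict = {path: np.vstack(rows) for path, rows in grouped.items()}
print(image_anno_dict["img_a.jpg"].shape)
```

Keeping a list (or stacked array) per path avoids the overwrite problem, since each image path appears once as a key and all of its rows are stored together.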