I'm trying to create a dictionary dico which contains elements like this:
<key> <values>
comb_element pageId1, pageId2, .. pageIdn
Each comb_element is an element of the list of tuples comb.
I tried the code below to create dico, but the result is not in the format I expect:
comb=list(set(product(data['OsId'], data['BrowserId'])))
dico={}
dico[tuple(comb)] = list(data['PageId'])
Here is what comb contains:
[(12, 16),
(33, 11),
(99, 11),
(11, 14),
(33, 14),
(32, 12),
(99, 14),
(11, 11),
(11, 17),
(32, 15),
(99, 17),
(33, 17),
(11, 99),
(33, 99),
(99, 99),
(12, 12),
(12, 15),
(32, 11),
(11, 16),
(33, 16),
(32, 14),
(99, 16),
(32, 17),
(32, 99),
(12, 11),
(12, 14),
(12, 17),
(12, 99),
(11, 12),
(99, 15),
(33, 12),
(99, 12),
(11, 15),
(32, 16),
(33, 15)]
Here is the dataframe data. For example, for the key (11, 16) I would like to get the list of pages 1005581 and 1016529:
Cluster  PageId   OsId  BrowserId
0        1005581  11    16
0        1016529  11    16
0        1016529  11    17
0        1016529  12    14
0        1016529  12    16
Any idea how to fix it? Thanks.
CodePudding user response:
I think the code below might do what you want.
Assuming you have the dataframe in list-of-dicts format, you can build up the entries of the dictionary dico by appending each row's 'PageId' value to a list keyed by that row's (OsId, BrowserId) tuple.
This way you don't need product() to generate every possible (OsId, BrowserId) combination, some of which may have no matching PageIds.
df_as_list_of_dicts = [
    {'Cluster': 0, 'PageId': 1005581, 'OsId': 11, 'BrowserId': 16},
    {'Cluster': 0, 'PageId': 1016529, 'OsId': 11, 'BrowserId': 16},
    {'Cluster': 0, 'PageId': 1016529, 'OsId': 11, 'BrowserId': 17},
    {'Cluster': 0, 'PageId': 1016529, 'OsId': 12, 'BrowserId': 14},
    {'Cluster': 0, 'PageId': 1016529, 'OsId': 12, 'BrowserId': 16},
]
dico = {}
# Group each row's PageId under its (OsId, BrowserId) key.
for row in df_as_list_of_dicts:
    tup = (row['OsId'], row['BrowserId'])
    if tup not in dico:
        dico[tup] = []  # first time this pair is seen
    dico[tup].append(row['PageId'])

for key, value in dico.items():
    print(f"{key} : {value}")
Sample output:
(11, 16) : [1005581, 1016529]
(11, 17) : [1016529]
(12, 14) : [1016529]
(12, 16) : [1016529]
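If you do want one key per element of comb, as in the question, here is a minimal sketch of one possible way to get it: group the pages first, then look up every combination and fall back to an empty list when no row matches. The names rows and pages_by_pair are made up for this example, and the rows only mirror the five sample lines from the question.

```python
from collections import defaultdict
from itertools import product

# Hypothetical stand-in for the dataframe: one dict per row of the sample data.
rows = [
    {'PageId': 1005581, 'OsId': 11, 'BrowserId': 16},
    {'PageId': 1016529, 'OsId': 11, 'BrowserId': 16},
    {'PageId': 1016529, 'OsId': 11, 'BrowserId': 17},
    {'PageId': 1016529, 'OsId': 12, 'BrowserId': 14},
    {'PageId': 1016529, 'OsId': 12, 'BrowserId': 16},
]

# Group PageIds by (OsId, BrowserId); defaultdict creates the list on first use.
pages_by_pair = defaultdict(list)
for row in rows:
    pages_by_pair[(row['OsId'], row['BrowserId'])].append(row['PageId'])

# Every (OsId, BrowserId) combination, built the same way as comb in the question.
os_ids = {row['OsId'] for row in rows}
browser_ids = {row['BrowserId'] for row in rows}
comb = sorted(product(os_ids, browser_ids))

# One key per combination; .get() returns [] without adding keys to the defaultdict.
dico = {pair: pages_by_pair.get(pair, []) for pair in comb}

for key, value in dico.items():
    print(f"{key} : {value}")
```

If data is a pandas DataFrame, data.to_dict('records') should give you rows in this list-of-dicts shape. Note that the important difference from the original attempt is that each pair is used as its own key, whereas tuple(comb) turns the whole list into a single key.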