Making a dictionary from two NumPy arrays

I have two NumPy arrays, one named labels and the other named component:

labels = array([ 0,  0,  0,  3,  0,  0,  0,  1,  0,  1,  0,  0,  2,  2,  3,  4, -1])

component = array([ 1.05325312,  1.0206622 ,  1.0372349 ,  1.06951778,  0.96379751, 0.98862576,  1.01135931,  0.92633951,  1.09095756,  1.1662432, 1.17794883,  1.23006966,  1.25465147,  1.27054648,  1.18940802, 0.91512676,  0.81926385])

I want the unique values in labels to be the keys of the dictionary, with the elements of component grouped under their corresponding key.

Both arrays have the same shape, and each element of labels corresponds to the element of component at the same position.

I'm trying to get something like this:

{'-1': [0.81926385],
'0': [1.05325312, 1.0206622, 1.0372349, 0.96379751, 0.98862576, 1.01135931, 1.09095756, 1.17794883, 1.23006966],
'1': [0.92633951, 1.1662432], 
'2': [1.25465147,  1.27054648], 
'3': [1.06951778, 1.18940802], 
'4': [0.91512676]}

I have tried using zip in several different ways, but I can't figure out how to group the values under their associated key. Can anyone point me in the right direction?

d = dict(zip(labels, component))

CodePudding user response：

You can't use dict(zip(...)) directly. Dictionary keys are unique, so every repeated label overwrites the previous value and only the last value for each label survives. One way around this is a loop with an if statement: if the key is not in the dictionary yet, create a new list holding the value; otherwise append the value to the existing list:

from numpy import array

labels = array([ 0,  0,  0,  3,  0,  0,  0,  1,  0,  1,  0,  0,  2,  2,  3,  4, -1])

component = array([ 1.05325312,  1.0206622 ,  1.0372349 ,  1.06951778,  0.96379751, 0.98862576,  1.01135931,  0.92633951,  1.09095756,  1.1662432, 1.17794883,  1.23006966,  1.25465147,  1.27054648,  1.18940802, 0.91512676,  0.81926385])

dic = {}
for key, value in zip(labels, component):
    if key not in dic:
        dic[key] = [value]
    else:
        dic[key].append(value)
print(dic)

If you want something more concise, consider collections.defaultdict; collections.OrderedDict can help if you need to control the order of the keys.
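A more compact variant of the loop above, as a sketch only: it uses dict.setdefault in place of the explicit if/else, then sorts the keys and converts them to strings, on the assumption that the string keys shown in the question's desired output are really wanted. The names grouped and by_str_key are illustrative.

```python
import numpy as np

labels = np.array([0, 0, 0, 3, 0, 0, 0, 1, 0, 1, 0, 0, 2, 2, 3, 4, -1])
component = np.array([1.05325312, 1.0206622, 1.0372349, 1.06951778, 0.96379751,
                      0.98862576, 1.01135931, 0.92633951, 1.09095756, 1.1662432,
                      1.17794883, 1.23006966, 1.25465147, 1.27054648, 1.18940802,
                      0.91512676, 0.81926385])

grouped = {}
# tolist() turns NumPy scalars into plain Python int/float values
for key, value in zip(labels.tolist(), component.tolist()):
    # setdefault returns the existing list, or stores and returns a new one
    grouped.setdefault(key, []).append(value)

# Sort by the integer label first, then convert each key to a string
by_str_key = {str(key): grouped[key] for key in sorted(grouped)}
print(by_str_key)
```

Sorting before the string conversion matters: sorting the strings themselves would order them lexicographically rather than numerically.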

CodePudding user response：

Another alternative I have thought of is to use pandas.

Put both arrays into their own columns and group them by labels.

I was hoping to keep things in numpy for speed but if needs must.

import pandas as pd
df = pd.DataFrame({'Levels': component, 'Zone': labels})  # one column per array
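That line only builds the frame; the grouping step might look like the sketch below. It assumes the column names Zone and Levels from the snippet above, the two arrays from the question, and that a plain dict of lists is the result you are after. The names df and grouped are illustrative.

```python
import numpy as np
import pandas as pd

labels = np.array([0, 0, 0, 3, 0, 0, 0, 1, 0, 1, 0, 0, 2, 2, 3, 4, -1])
component = np.array([1.05325312, 1.0206622, 1.0372349, 1.06951778, 0.96379751,
                      0.98862576, 1.01135931, 0.92633951, 1.09095756, 1.1662432,
                      1.17794883, 1.23006966, 1.25465147, 1.27054648, 1.18940802,
                      0.91512676, 0.81926385])

# One column per array, so each row pairs a label with its value
df = pd.DataFrame({"Zone": labels, "Levels": component})

# Collect the Levels values of each Zone into a list, then convert to a dict
grouped = df.groupby("Zone")["Levels"].apply(list).to_dict()
print(grouped)
```

groupby sorts the group keys by default, so the dictionary comes out ordered from -1 up to 4.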

CodePudding user response：

This is a perfect use case for collections.defaultdict:

from collections import defaultdict

# Missing keys start out as an empty list
d = defaultdict(list)

for label, value in zip(labels, component):
    d[label].append(value)

# Convert back to a plain dict
d = dict(d)

output:

{0: [1.05325312, 1.0206622, 1.0372349, 0.96379751, 0.98862576, 1.01135931, 1.09095756, 1.17794883, 1.23006966],
 3: [1.06951778, 1.18940802],
 1: [0.92633951, 1.1662432],
 2: [1.25465147, 1.27054648],
 4: [0.91512676],
 -1: [0.81926385]}
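Since the thread mentions wanting to stay in NumPy, here is one possible NumPy-only sketch. It is not taken from any of the answers above: it sorts by label with a stable argsort, finds where each label's run begins with np.unique(..., return_index=True), and splits the values at those positions. The variable names are illustrative, and no timing claim is made; whether it beats the plain loop would need measuring on your real data.

```python
import numpy as np

labels = np.array([0, 0, 0, 3, 0, 0, 0, 1, 0, 1, 0, 0, 2, 2, 3, 4, -1])
component = np.array([1.05325312, 1.0206622, 1.0372349, 1.06951778, 0.96379751,
                      0.98862576, 1.01135931, 0.92633951, 1.09095756, 1.1662432,
                      1.17794883, 1.23006966, 1.25465147, 1.27054648, 1.18940802,
                      0.91512676, 0.81926385])

# A stable sort keeps the original order of the values within each label
order = np.argsort(labels, kind="stable")
sorted_labels = labels[order]
sorted_values = component[order]

# Unique labels in ascending order, plus the index where each label's run starts
unique_labels, starts = np.unique(sorted_labels, return_index=True)

# Split the sorted values at the start of every run except the first
groups = np.split(sorted_values, starts[1:])

# Plain Python ints as keys and plain lists as values
result = {int(key): group.tolist() for key, group in zip(unique_labels, groups)}
print(result)
```

Unlike the defaultdict version, the keys here come out in ascending order rather than in order of first appearance.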