Slice dictionary based on Numpy array values

I have a matrix called m_hat of shape k x 2, where k is an arbitrary number.

17.574  17.8316
22.449  23.0995
13.4923 11.8801
8.34949 8.0102
16.676  17.2908
24.8699 25.2985
13.7985 12.8801
13.4541 13.9107
14.7577 14.9133
47.0102 48.4668

I want to make a scatter plot with the first column of m_hat on the x axis and the second column on the y axis. This can be done with this line of code:

plt.scatter(m_hat[:, 0], m_hat[:, 1])

Now I want to color the points according to their labels, so I have a dictionary that maps each label to its color.

cdict = {1: 'red', 3: 'green', 5: 'blue', 7: 'yellow'}

This is the line that draws the scatter plot with colors, but it is extremely slow:

[plt.scatter(m_hat[:, 0], m_hat[:, 1], c=cdict[cl]) for cl in l_train]

Here l_train is an array holding the respective labels.

I have tried to do something like slicing the dictionary with l_train:

[plt.scatter(m_hat[:, 0], m_hat[:, 1], c=cdict[l_train])]

Yet this error occurs:

TypeError: unhashable type: 'numpy.ndarray'

How can I do this while avoiding any kind of for loop, that is, using only numpy slicing?

CodePudding user response:

Indexing a dict cannot be vectorized over a numpy array, so this is where you have to iterate. But you can use a list comprehension to build the colors, so that plt.scatter is called only once instead of once per element of l_train, which will be a lot faster. Just replace

[plt.scatter(m_hat[:, 0], m_hat[:, 1], c=cdict[cl]) for cl in l_train]

with

plt.scatter(m_hat[:, 0], m_hat[:, 1], c=[cdict[cl] for cl in l_train])
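For reference, here is a minimal sketch of the single-call approach, followed by one possible loop-free variant that was not part of the answer above. The label values in l_train are made up for illustration, and the helper plot_labelled is a hypothetical name. The loop-free variant assumes the labels are small non-negative integers, so that they can be used directly as indices into a lookup array.

```python
import numpy as np

# First five rows of m_hat from the question.
m_hat = np.array([[17.574, 17.8316],
                  [22.449, 23.0995],
                  [13.4923, 11.8801],
                  [8.34949, 8.0102],
                  [16.676, 17.2908]])
# Assumed labels, one per row of m_hat (made up for this sketch).
l_train = np.array([1, 3, 5, 7, 1])
cdict = {1: 'red', 3: 'green', 5: 'blue', 7: 'yellow'}

# Approach from the answer: one Python-level pass over the labels.
colors_list = [cdict[cl] for cl in l_train]

# Loop-free variant (assumes small non-negative integer labels):
# store each colour at the index equal to its label, then index with l_train.
keys = np.array(list(cdict.keys()))
values = np.array(list(cdict.values()))
lookup = np.zeros(keys.max() + 1, dtype=values.dtype)  # unused slots stay ''
lookup[keys] = values
colors_arr = lookup[l_train]


def plot_labelled(points, colors):
    """Hypothetical helper: draw all points with a single scatter call."""
    import matplotlib.pyplot as plt  # imported lazily, only needed for plotting
    plt.scatter(points[:, 0], points[:, 1], c=list(colors))
    plt.show()
```

Calling plot_labelled(m_hat, colors_list) or plot_labelled(m_hat, colors_arr) should give the same figure. The lookup array only pays off when the labels are dense enough that an array of size max(label) + 1 is reasonable; otherwise the list comprehension is the simpler choice.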