I have a matrix called m_hat of shape k x 2, where k is an arbitrary number.
17.574 17.8316
22.449 23.0995
13.4923 11.8801
8.34949 8.0102
16.676 17.2908
24.8699 25.2985
13.7985 12.8801
13.4541 13.9107
14.7577 14.9133
47.0102 48.4668
I want to make a scatter plot with the first column of m_hat on the x axis and the second column on the y axis. This can be done with one line of code:
plt.scatter(m_hat[:, 0], m_hat[:, 1])
Now I want to color the points based on their labels, so I have a dictionary mapping each label to its color:
cdict = {1: 'red', 3: 'green', 5: 'blue', 7: 'yellow'}
This is the line I use to draw the scatter plot with colors, but it is extremely slow:
[plt.scatter(m_hat[:, 0], m_hat[:, 1], c=cdict[cl]) for cl in l_train]
Here l_train is an array with the respective labels:
1
1
1
1
1
1
1
1
1
1
I have tried indexing the dictionary directly with l_train:
[plt.scatter(m_hat[:, 0], m_hat[:, 1], c=cdict[l_train])]
Yet this error occurs:
TypeError: unhashable type: 'numpy.ndarray'
How can I do this while avoiding any kind of for loop, that is, using only numpy slicing?
Answer:
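A dict cannot be indexed with a numpy array in a vectorized way, so an iteration is needed somewhere. The slow part of your code, however, is that it calls plt.scatter once per label, and each call draws all k points again. You can move the iteration into a list comprehension that only builds the list of colors, so that plt.scatter is called just once, which will be a lot faster. Just replace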
[plt.scatter(m_hat[:, 0], m_hat[:, 1], c=cdict[cl]) for cl in l_train]
with
plt.scatter(m_hat[:, 0], m_hat[:, 1], c=[cdict[cl] for cl in l_train])
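If you really want to avoid a Python-level loop over l_train, one possible approach (a sketch, not the only way) is to turn the dictionary into two aligned numpy arrays and look the labels up with np.searchsorted. The helper name labels_to_colors and the sample l_train values below are made up for illustration, and the sketch assumes every label in l_train is a key of cdict.

```python
import numpy as np

cdict = {1: 'red', 3: 'green', 5: 'blue', 7: 'yellow'}


def labels_to_colors(labels, cdict):
    """Map an array of labels to an array of color names.

    Hypothetical helper: the only Python loop is over the dictionary
    keys (a handful of entries), never over the labels themselves.
    """
    labels = np.asarray(labels)
    key_list = sorted(cdict)                       # label values in ascending order
    keys = np.array(key_list)
    vals = np.array([cdict[k] for k in key_list])  # colors aligned with keys

    # Position of each label inside the sorted keys array.
    idx = np.searchsorted(keys, labels)
    idx = np.clip(idx, 0, len(keys) - 1)

    # Assumption: every label is a key of cdict; fail loudly otherwise.
    if not np.array_equal(keys[idx], labels):
        raise KeyError("some labels are missing from cdict")
    return vals[idx]


# Illustrative labels (not the data from the question).
l_train = np.array([1, 1, 3, 7, 5, 1])
colors = labels_to_colors(l_train, cdict)
```

The resulting colors array has one entry per point, so it can be passed to a single call such as plt.scatter(m_hat[:, 0], m_hat[:, 1], c=colors). For a small number of points the list comprehension above is simpler and likely just as fast; the lookup-array version mainly pays off when l_train is large.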