When I call plt.show(), the plot shows only the PCA arrows and not the scatter plot of the first two iris features.
import numpy as np
import matplotlib.pyplot as plt
from sklearn import decomposition
x = np.load("iris_features.npy")[:, :2]
y = np.load("iris_labels.npy")
idx = np.where(y != 0)
x = x[idx]
x[:, 0] -= x[:, 0].mean()
x[:, 1] -= x[:, 1].mean()
pca = decomposition.PCA(n_components=2)
pca.fit(x)
v = pca.explained_variance_ratio_
plt.scatter(x[:, 0], x[:, 1], marker='o', color='b')
ax = plt.axes()
x0 = v[0] * pca.components_[0, 0]
y0 = v[0] * pca.components_[0, 1]
ax.arrow(0, 0, x0, y0, head_width=0.05, head_length=0.1, fc='r', ec='r')
x1 = v[1] * pca.components_[1, 0]
y1 = v[1] * pca.components_[1, 1]
ax.arrow(0, 0, x1, y1, head_width=0.05, head_length=0.1, fc='r', ec='r')
plt.xlabel("$x_0$", fontsize=16)
plt.ylabel("$x_1$", fontsize=16)
plt.show()
What the correct plot should look like
CodePudding user response:
Your code works fine for me with the iris dataset that ships with sklearn, and it produces the expected result. Try that, or share your dataset in a testable form, since in principle those .npy files could contain anything.
import numpy as np
import matplotlib.pyplot as plt
from sklearn import decomposition
from sklearn import datasets
iris = datasets.load_iris()
x = iris.data[:, :2]  # first two features, as in the question
y = iris.target
idx = np.where(y != 0)
x = x[idx]
# ... your code continues here
If that doesn't work, share your Python and matplotlib versions, or try updating them.
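The matplotlib version can matter here. In recent releases, as far as I know, plt.axes() called with no arguments creates a new axes rather than returning the current one, so the new axes can end up drawn on top of the one that holds the scatter points. That would leave only the arrows visible. Below is a minimal sketch that sidesteps this by creating the axes once with plt.subplots() and drawing both the scatter and the arrows on it. It assumes the sklearn iris data stands in for your .npy files, and plot_pca_arrows is a hypothetical helper name, not something from your code.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, decomposition


def plot_pca_arrows(x):
    """Scatter the 2-D points in x and overlay the two PCA directions."""
    x = x - x.mean(axis=0)  # center each feature
    pca = decomposition.PCA(n_components=2).fit(x)
    v = pca.explained_variance_ratio_

    # Create the figure and axes once, then draw everything on that same axes.
    fig, ax = plt.subplots()
    ax.scatter(x[:, 0], x[:, 1], marker="o", color="b")
    for ratio, component in zip(v, pca.components_):
        # Arrow from the origin along each component, scaled by its variance ratio.
        ax.arrow(0, 0, ratio * component[0], ratio * component[1],
                 head_width=0.05, head_length=0.1, fc="r", ec="r")
    ax.set_xlabel("$x_0$", fontsize=16)
    ax.set_ylabel("$x_1$", fontsize=16)
    return fig, ax


# Assumption: sklearn's bundled iris data replaces the .npy files.
iris = datasets.load_iris()
x = iris.data[iris.target != 0][:, :2]  # classes 1 and 2, first two features
fig, ax = plot_pca_arrows(x)
# Call plt.show() here when running interactively.
```

If you would rather keep your script as it is, replacing ax = plt.axes() with ax = plt.gca() should have the same effect, because plt.gca() returns the axes that plt.scatter already drew on.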