I am new to Python. I'm trying to learn how the k-NN algorithm works, so I tried to run this code:
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version=1)
print(mnist.data.shape)
print(mnist.target.shape)
import numpy as np
sample = np.random.randint(70000, size=5000)
data = mnist.data[sample]
target = mnist.target[sample]
from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(data, target, train_size=0.8)
But it does not work. It prints the shapes and then raises a KeyError:
(70000, 784)
(70000,)
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-6-3b6254553355> in <module>
8 import numpy as np
9 sample = np.random.randint(70000, size=5000)
---> 10 data = mnist.data[sample]
11 #target = mnist.target[sample]
12 #from sklearn.model_selection import train_test_split
~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
3028 if is_iterator(key):
3029 key = list(key)
-> 3030 indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
3031
3032 # take() does not accept boolean indexers
CodePudding user response:
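You are indexing a pandas DataFrame, not a NumPy array. On a DataFrame, plain square-bracket indexing with a list or array of integers selects columns by label (note axis=1 in the traceback), so pandas looks for columns named after your random integers and raises a KeyError. To select rows, use .loc (by index label) or .iloc (by position). Because the index here is the default 0..69999 range, labels and positions coincide, so this should work: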
data = mnist.data.loc[sample]
target = mnist.target.loc[sample]
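
To make the difference concrete, here is a minimal sketch on a small made-up DataFrame standing in for mnist.data and mnist.target (the toy values and the column names pixel1..pixel4 are assumptions for illustration, not the real MNIST data). It shows that plain indexing raises the same KeyError, while .iloc or a conversion to NumPy selects rows as intended:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for mnist.data / mnist.target.
rng = np.random.default_rng(0)
data = pd.DataFrame(
    rng.integers(0, 256, size=(10, 4)),
    columns=["pixel1", "pixel2", "pixel3", "pixel4"],
)
target = pd.Series(list("0123456789"))

# Random row positions, analogous to np.random.randint(70000, size=5000).
sample = rng.integers(0, len(data), size=5)

# Plain indexing treats the integers as column labels and fails.
try:
    data[sample]
    raised = False
except KeyError:
    raised = True

# .iloc selects rows by position, regardless of what the index labels are.
data_rows = data.iloc[sample]
target_rows = target.iloc[sample]

# Alternative: convert to NumPy first, then NumPy-style indexing works.
data_np = data.to_numpy()[sample]
target_np = target.to_numpy()[sample]
```

If you would rather keep NumPy-style indexing throughout, fetch_openml also accepts as_frame=False, which returns NumPy arrays instead of a DataFrame, so the original mnist.data[sample] line then works unchanged.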