python problem ---> 10 data = mnist.data[sample]

Time:12-18

I am new to Python. I'm trying to learn how the k-NN algorithm works, so I tried to run this code:

from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version=1)

print (mnist.data.shape)


print (mnist.target.shape)
import numpy as np
sample = np.random.randint(70000, size=5000)
data = mnist.data[sample]
target = mnist.target[sample]
from sklearn.model_selection import train_test_split

xtrain, xtest, ytrain, ytest = train_test_split(data, target, train_size=0.8)

But it does not work. It prints the two shapes and then raises a KeyError:

(70000, 784)
(70000,)
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-6-3b6254553355> in <module>
      8 import numpy as np
      9 sample = np.random.randint(70000, size=5000)
---> 10 data = mnist.data[sample]
     11 #target = mnist.target[sample]
     12 #from sklearn.model_selection import train_test_split

~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   3028             if is_iterator(key):
   3029                 key = list(key)
-> 3030             indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
   3031 
   3032         # take() does not accept boolean indexers

CodePudding user response:

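You are indexing a pandas DataFrame, not a NumPy array. Plain bracket indexing on a DataFrame (mnist.data[sample]) looks the values up as column labels, which is why you get a KeyError. To select rows you should use .loc or .iloc, as pointed out here, instead of the normal indexing you are used to with NumPy arrays. This should work: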

data = mnist.data.loc[sample]
target = mnist.target.loc[sample]
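
Note that .loc selects by index label, so it only works here because the DataFrame has the default 0..n-1 index. Since sample holds row positions, .iloc is probably the closer match. Here is a minimal sketch of the difference that runs without downloading MNIST. The names data_df and target_sr and the small random table are made up for illustration; they only stand in for mnist.data and mnist.target.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for mnist.data / mnist.target: 100 rows, 4 "pixel" columns.
rng = np.random.default_rng(0)
data_df = pd.DataFrame(
    rng.integers(0, 256, size=(100, 4)),
    columns=[f"pixel{i + 1}" for i in range(4)],
)
target_sr = pd.Series(rng.integers(0, 10, size=100).astype(str), name="class")

# Random row positions, like np.random.randint(70000, size=5000) in the question.
sample = rng.integers(0, 100, size=20)

# Plain brackets on a DataFrame treat the integers as column labels,
# and no column is named 0, 1, 2, ... so this raises KeyError.
try:
    data_df[sample]
    raised = False
except KeyError:
    raised = True

# .iloc selects rows by position, which is what the random sample means.
data = data_df.iloc[sample]
target = target_sr.iloc[sample]

print(raised)        # whether the plain-bracket lookup failed
print(data.shape)    # (rows selected, columns)
print(target.shape)
```

Another option, if you would rather keep working with NumPy arrays, is to call fetch_openml('mnist_784', version=1, as_frame=False). Then mnist.data and mnist.target are arrays and the original mnist.data[sample] works unchanged.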