Append the ith value to a list using an index that matches another ndarray index

I am trying to subsample the cifar100 dataset to train and test on one subclass from each superclass. I have it set up so that if a value in y_full (the subclass label for each image) matches my list of subclasses that I want, the index of that element is used to grab a value from X_full (the images) with the same index.

This is my code so far:

import numpy as np
from tensorflow import keras
from sklearn.model_selection import train_test_split

cifar100 = keras.datasets.cifar100
(X_full, y_full), (X_test_full, y_test_full) = cifar100.load_data(label_mode="fine")

classes = [0,1,2,3,4,5,6,8,9,12,15,22,23,26,27,34,36,41,47,54]

X_tr_full = []
y_tr_full = []
X_test = []
y_test = []

for i in y_full:
  if i in classes:
    X_tr_full.append(X_full[np.where(y_full==i)])
    y_tr_full.append(i)

for i in y_test_full:
  if i in classes:
    X_test.append(X_test_full[np.where(y_test_full==i)])
    y_test.append(i)

The problem with my code is the np.where(y_full==i) call. It returns a tuple holding ALL of the indices in y_full whose value equals i, so every append adds ALL of the images of that class from X_full as a single entry. Instead, I want to iterate through y_full once and, whenever a label is in my class list, append the single image from X_full that has the same index. Sorry if I'm not clear enough, it's hard to explain what I'm trying to do, but hopefully someone can help!

CodePudding user response：

I think I got it figured out. It was pretty simple once I figured out how to access each index separately:

# Walk the labels by position so the same index n selects the matching image
for n in range(y_full.size):
  if y_full[n] in classes:
    X_tr_full.append(X_full[n])
    y_tr_full.append(y_full[n])

# Same selection for the test split
for n in range(y_test_full.size):
  if y_test_full[n] in classes:
    X_test.append(X_test_full[n])
    y_test.append(y_test_full[n])
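
For what it's worth, the same selection can probably be done without a Python loop by building a boolean mask. This is only a minimal sketch: it assumes the labels have the (N, 1) shape that cifar100.load_data returns, the small random arrays are stand-ins for the real dataset, and the helper name select_classes is made up for illustration.

```python
import numpy as np

def select_classes(X, y, classes):
    """Keep only the samples whose label is in `classes` (hypothetical helper)."""
    # y is assumed to have shape (N, 1); ravel() flattens it so the mask has shape (N,)
    mask = np.isin(y.ravel(), classes)
    # The same mask indexes both arrays, so images and labels stay aligned
    return X[mask], y[mask]

# Stand-in data with CIFAR-like shapes (not the real dataset)
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 256, size=(50, 32, 32, 3))
y_demo = rng.integers(0, 100, size=(50, 1))

classes = [0, 1, 2, 3, 4, 5, 6, 8, 9, 12, 15, 22, 23, 26, 27, 34, 36, 41, 47, 54]
X_sel, y_sel = select_classes(X_demo, y_demo, classes)
print(X_sel.shape, y_sel.shape)
```

The result is already a pair of arrays rather than lists, so no np.array(...) conversion should be needed afterwards.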

CodePudding user response：

To illustrate my comment, I'll use a simple example of modulus testing:

In [224]: arr = np.arange(10); alist = []
In [225]: for i in [2,3]:
     ...:     alist.append(arr[arr%i>0])
     ...:     
In [226]: alist
Out[226]: [array([1, 3, 5, 7, 9]), array([1, 2, 4, 5, 7, 8])]

I get a list of arrays, which can be joined into one array with:

In [227]: np.hstack(alist)
Out[227]: array([1, 3, 5, 7, 9, 1, 2, 4, 5, 7, 8])

Alternatively with extend:

In [228]: arr = np.arange(10); alist = []    
In [229]: for i in [2,3]:
     ...:     alist.extend(arr[arr%i>0])
     ...:         
In [230]: alist
Out[230]: [1, 3, 5, 7, 9, 1, 2, 4, 5, 7, 8]    
In [231]: np.array(alist)
Out[231]: array([1, 3, 5, 7, 9, 1, 2, 4, 5, 7, 8])

extend adds each selected element to the list individually, so a single extend call replaces your element-by-element append loop.
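
The same difference shows up with 2D data such as your images. A rough sketch, using toy arrays in place of X_full and y_full (the names X, y, appended and extended are made up for the illustration, and the labels are flat here rather than shaped (N, 1)):

```python
import numpy as np

# Toy stand-ins: 6 "images" of 2 values each, with flat labels
X = np.arange(12).reshape(6, 2)
y = np.array([3, 7, 3, 1, 7, 5])

appended, extended = [], []
for c in [3, 7]:
    rows = X[y == c]        # all rows of class c at once, shape (k, 2)
    appended.append(rows)   # adds one list entry per class
    extended.extend(rows)   # adds one list entry per row

print(len(appended))  # 2
print(len(extended))  # 4

# Either list can be turned back into a single (4, 2) array
stacked = np.array(extended)
joined = np.concatenate(appended)
```

Note that both results are grouped by class, not kept in the original sample order.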