I have a labeled dataset:
data = np.array([5.2, 4, 5, 2, 5.3, 10, 0])
labels = np.array([1, 0, 1, 2, 1, 3, 4])
I want to pick the values 5.2, 5 and 5.3, which all have the label 1, as follows:
datalabel1 = data[(labels == 1)]
Then I want to apply np.random.choice() to them, for example (pseudocode):
# indices are the indices from label 1
random_choices = np.random.choice(indices, size = 5)
and get randomly chosen values, together with their original indices, as output:
# indices are the positions in data of the values drawn by the random choice
data: [5.3 5.2 5.2 5.2 5]
indices: [4 0 0 0 2]
My goal is to sample from the pool of data that has label 1.
CodePudding user response:
labels == 1 is a boolean mask. You need to apply it to data, not back to labels, to get the data elements labeled 1:
np.random.choice(data[labels == 1], ...)
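As a minimal sketch of the line above, assuming you want 5 draws with replacement (the size and the names pool and sample are only illustrative), it could look like this. Note that sampling the values directly does not tell you which positions in data they came from.

```python
import numpy as np

data = np.array([5.2, 4, 5, 2, 5.3, 10, 0])
labels = np.array([1, 0, 1, 2, 1, 3, 4])

# Pool of values whose label is 1 (the boolean mask is applied to data)
pool = data[labels == 1]

# Draw 5 values from the pool; np.random.choice samples with replacement by default
sample = np.random.choice(pool, size=5)
print(sample)
```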
You can also convert labels == 1 to an array of indices and choose from those before indexing:
indices = np.flatnonzero(labels == 1)
data[np.random.choice(indices, ...)]
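Put together, this should give both the sampled values and their original indices, which is what the question asks for. This is only a sketch: size=5 and the names chosen_indices and chosen_data are assumptions for illustration, and the output will differ from run to run.

```python
import numpy as np

data = np.array([5.2, 4, 5, 2, 5.3, 10, 0])
labels = np.array([1, 0, 1, 2, 1, 3, 4])

# Positions in data where the label is 1
indices = np.flatnonzero(labels == 1)

# Sample positions (with replacement), then look up the values at those positions
chosen_indices = np.random.choice(indices, size=5)
chosen_data = data[chosen_indices]

print("data:", chosen_data)
print("indices:", chosen_indices)
```

If each element should be drawn at most once, np.random.choice accepts replace=False, but size then cannot exceed the number of elements in the pool (3 here).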