How can I reproduce data in numpy with random.choice?

I have a labeled dataset:

data = np.array([5.2, 4, 5, 2, 5.3, 10, 0])
labels = np.array([1, 0, 1, 2, 1, 3, 4])

I want to pick the data 5.2, 5 and 5.3, which have the label 1, and resample from it, as follows:

datalabel1 = data[(labels == 1)]

Then I want to call random.choice(), for example (pseudocode):

# indices are the indices from label 1
random_choices = np.random.choice(indices, size = 5)

And get as output different values with different indices:

# indices are the positions in data of the values drawn by the random choice
data:    [5.3 5.2 5.2 5.2 5]
indices: [4 0 0 2 2]

My goal is to sample from the pool of data that has label 1.

CodePudding user response:

labels == 1 is a boolean mask. You need to apply it to data, not back to labels, to get the data elements labeled 1:

np.random.choice(data[labels == 1], ...)
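As a minimal sketch of that call, assuming size=5 as in the question (the names pool and samples are illustrative, and the drawn values will differ from run to run):

```python
import numpy as np

data = np.array([5.2, 4, 5, 2, 5.3, 10, 0])
labels = np.array([1, 0, 1, 2, 1, 3, 4])

# Boolean mask applied to data keeps only the values labeled 1.
pool = data[labels == 1]

# Draw 5 values from the pool; np.random.choice samples with replacement by default.
samples = np.random.choice(pool, size=5)
print(samples)
```

This returns the sampled values only. The positions they came from in data are not kept.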

You can also convert labels == 1 to an array of indices and choose from those before indexing:

indices = np.flatnonzero(labels == 1)
data[np.random.choice(indices, ...)]
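If both the sampled values and their original positions are needed, as in the output the question asks for, the index-based variant can be spelled out roughly like this. It is a sketch assuming size=5 from the question; chosen and picked are illustrative names, and the actual draw is random:

```python
import numpy as np

data = np.array([5.2, 4, 5, 2, 5.3, 10, 0])
labels = np.array([1, 0, 1, 2, 1, 3, 4])

# Positions in data where the label is 1.
indices = np.flatnonzero(labels == 1)

# Draw 5 positions from that pool (with replacement by default).
chosen = np.random.choice(indices, size=5)

# Look up the values at the drawn positions.
picked = data[chosen]

print("data:   ", picked)
print("indices:", chosen)
```

Sampling the indices rather than the values keeps the link back to the original array, so chosen can also be used to index labels or any other array aligned with data.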