Python: Resizing array by removing nth element

Time:06-21

I have some dynamically created arrays of varying lengths, and I would like to resize them all to the same length of 5000 elements by popping every nth element.

Here is what I got so far:

import numpy as np
random_array = np.random.rand(26975,3)

n_to_pop = int(len(random_array) / 5000)
print(n_to_pop)

If I downsample by keeping every 5th element (n = 5), I get 5395 elements.

That leaves a ratio of 5395 / 5000 = 1.079, but I don't know how to calculate how often I should pop an additional element to get rid of the remaining 395.

Getting within a length of 5000-5050 would also be acceptable; the remainder can then be sacrificed with a simple .resize.

This is probably just a simple math question, but I couldn't seem to find an answer anywhere.

Any help is much appreciated.

Best regards

Martin

CodePudding user response:

You can pick a random subset of rows using np.random.choice or np.random.permutation, for example:

random_array[np.random.permutation(random_array.shape[0])[:5000]]
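Note that a random permutation also shuffles the row order. If the original order matters, the random indices can be sorted before indexing. A minimal sketch, assuming NumPy's Generator API (np.random.default_rng) and a seed chosen here only for reproducibility:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for a repeatable example
random_array = rng.random((26975, 3))

# Draw 5000 distinct row indices, then sort them to keep the original order.
keep = np.sort(rng.choice(random_array.shape[0], size=5000, replace=False))
random_subset = random_array[keep]

print(random_subset.shape)  # (5000, 3)
```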

If you want to remove the rows near-uniformly instead, one way is:

indices = np.linspace(0, random_array.shape[0], endpoint=False, num=5000, dtype=int)
# [    0     5    10    16    ...    26958 26964 26969] --> shape = (5000,)

result = random_array[indices]
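Since the arrays in the question have varying lengths, the same idea could be wrapped in a small function. This is only a sketch: the name downsample_rows and the choice to return short arrays unchanged are assumptions, not part of the answer above.

```python
import numpy as np

def downsample_rows(array, target_len=5000):
    """Keep target_len rows spread near-uniformly over the array (hypothetical helper)."""
    n = array.shape[0]
    if n <= target_len:
        return array  # assumption: arrays that are already short enough are left alone
    # Evenly spaced positions in [0, n), truncated to integer row indices.
    indices = np.linspace(0, n, num=target_len, endpoint=False, dtype=int)
    return array[indices]

for n_rows in (26975, 5003, 50003):
    out = downsample_rows(np.random.rand(n_rows, 3))
    print(n_rows, "->", out.shape)
```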

CodePudding user response:

You can use something like np.linspace to make your solution as uniform as possible:

subset = random_array[np.round(np.linspace(0, len(random_array), 5000, endpoint=False)).astype(int)]

You don't always want to drop elements at a fixed interval. Consider reducing a 5003-element array to 5000 elements versus doing the same with a 50003-element array. The trick is to create a set of elements to keep or drop that's as linear as possible in the index, which is exactly what np.linspace does.
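To see what "as linear as possible" means in practice, one can look at the gaps between consecutive kept indices for the two array lengths mentioned above. A small sketch (the variable names are just for illustration):

```python
import numpy as np

gap_summary = {}
for n_rows in (5003, 50003):
    keep = np.round(np.linspace(0, n_rows, 5000, endpoint=False)).astype(int)
    # Count how often each gap size occurs between neighbouring kept indices.
    gaps, counts = np.unique(np.diff(keep), return_counts=True)
    gap_summary[n_rows] = dict(zip(gaps.tolist(), counts.tolist()))
    print(n_rows, gap_summary[n_rows])
```

For the shorter array the gaps should be almost all 1 with a few 2s, and for the longer one almost all 10 with a few 11s, so the handful of extra skips is spread across the whole array rather than bunched at the end.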

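You could also select the rows to drop instead of the rows to keep, with something like

np.delete(random_array, np.round(np.linspace(0, len(random_array), len(random_array) - 5000, endpoint=False)).astype(int), axis=0)  # axis=0 removes whole rows; without it np.delete flattens the array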
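A quick check of the drop-based variant on the array from the question, as a sketch. np.delete works on the flattened array when no axis is given, so axis=0 is passed here to remove whole rows:

```python
import numpy as np

random_array = np.random.rand(26975, 3)
n_drop = len(random_array) - 5000  # number of rows to remove

# Evenly spaced row indices to drop, rounded to the nearest integer.
drop = np.round(np.linspace(0, len(random_array), n_drop, endpoint=False)).astype(int)
trimmed = np.delete(random_array, drop, axis=0)

print(trimmed.shape)  # (5000, 3)
```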