Weighted resampling a numpy array

Time:11-14

I have a 50 x 4 numpy array and I'd like to repeat its rows to make a 500 x 4 array. The catch is that I can't simply repeat every row the same number of times along axis 0: the expanded array should contain more copies of the smaller rows and fewer copies of the larger rows.

The input array has data that looks like this:

[1, 1, 16, 5]
[8, 10, 512, 10]
...
[448, 8192, 409600, 150]

Here, the initial rows are small and the last rows are large. The scale of each column is different, though: 16 might be a very low value for column 2 but a high value for column 1.

How can I achieve this using numpy or even lists?

The expected output is an array of shape 500 x 4 in which each row is taken from the input array and repeated some number of times.

[1, 1, 16, 5]      # repeated thrice
[1, 1, 16, 5]
[1, 1, 16, 5]
[8, 10, 512, 10]   # repeated twice
[8, 10, 512, 10]
...
[448, 8192, 409600, 150]

CodePudding user response:

You can use np.repeat to repeat each row index an arbitrary number of times, and then use the result to index the input array (without an axis argument, np.repeat flattens a 2D array; passing axis=0 repeats whole rows directly). Here is an example:

import numpy as np

# Example of random input
inputArr = np.random.randint(0, 1000, (50, 4))

# Example repeat counts [2, 3, ..., 51], one per input row
counts = np.arange(2, 52)

# Actual computation: repeat each row index, then index the input with it
outputArr = inputArr[np.repeat(np.arange(inputArr.shape[0]), counts)]
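The counts in the example are arbitrary and do not add up to 500. As a rough sketch of one way to get exactly 500 rows with more copies of the early (small) rows, the counts can be derived from a set of decreasing weights. The linear weighting, the names n_out, weights and input_arr, and the rule of giving every row at least one copy are assumptions made for illustration, not part of the question or the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the real 50 x 4 input; sorted per column so early rows are small
input_arr = np.sort(rng.integers(0, 1000, size=(50, 4)), axis=0)

n_out = 500                      # desired number of output rows
n_rows = input_arr.shape[0]

# Assumed weighting: linearly decreasing, so row 0 gets the largest share
weights = np.arange(n_rows, 0, -1, dtype=float)
weights /= weights.sum()

# Every row gets one copy; the remaining copies are split by weight
spare = n_out - n_rows
raw = weights * spare
counts = np.floor(raw).astype(int)

# Largest-remainder rounding: hand out the copies lost to flooring
shortfall = spare - counts.sum()
extra = np.argsort(raw - counts)[::-1][:shortfall]
counts[extra] += 1
counts += 1

# axis=0 repeats whole rows instead of flattening the array
output_arr = np.repeat(input_arr, counts, axis=0)
print(output_arr.shape)
```

Any other non-negative weight vector (for example one computed from the row values after scaling each column) could be substituted for weights; the rounding step only ensures that the counts sum to n_out.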