How to apply function to each block of a numpy array in python-CodePudding

I have an n x m array and a function 'switch(A,J)' that takes array (A) and integer(J) input and outputs an array of size n x m. I wish to split my n x m array into arrays of dimension c x c and apply the function with a fixed J to each c x c array and output the resulting array.c may not be a factor of n or m. Would anyone know how to execute this please.

I have tried np.block to split the array and apply to each individual block but then i had trouble reconstructing the matrix. I also attempted to use the slice indexing and store the values in a new array but the issue is my function outputs complex values so these are all discarded when i try and append the new array,

CodePudding user response：

import numpy as np
array = np.array([[1, 2, 3, 1], [4, 5, 6, 4], [7, 8, 9, 7], [11, 22, 33, 44]])

def somefunc(some_array, some_integer):
    return some_array*3
# say that your blocks needs to be 2X2
for i in range(array.shape[0]):
    for j in range(array.shape[1]):
        array[i*2:(i 1)*2, j*2:(j 1)*2] = somefunc(array[i*2:(i 1)*2, j*2:(j 1)*2], 3)

CodePudding user response：

A rather technical way to do is, but really efficient, is to use stride_tricks, which provides "views" of your array.

c=4
N=12 # Note that c is a factor of N

# Just an example array
A=np.arange(1,N 1).reshape(1,-1) np.arange(100,100*(N 1),100).reshape(-1,1)


stR,stC=A.strides
View = np.lib.stride_tricks.as_strided(A, (N//c,N//c,c,c), (c*stR, c*stC,stR,stC))

# You can now vectorize your function to work on such data

def switch(X,J):
    return X.T-J # Just an example

switchv=np.vectorize(switch, signature='(c,c),()->(c,c)')

# And now switchv can be called on your data
J=1
result=switchv(View,J)

Explanation. A here is

array([[ 101,  102,  103,  104,  105,  106,  107,  108,  109,  110,  111,  112],
       [ 201,  202,  203,  204,  205,  206,  207,  208,  209,  210,  211,  212],
       [ 301,  302,  303,  304,  305,  306,  307,  308,  309,  310,  311,  312],
       [ 401,  402,  403,  404,  405,  406,  407,  408,  409,  410,  411,  412],
       [ 501,  502,  503,  504,  505,  506,  507,  508,  509,  510,  511,  512],
       [ 601,  602,  603,  604,  605,  606,  607,  608,  609,  610,  611,  612],
       [ 701,  702,  703,  704,  705,  706,  707,  708,  709,  710,  711,  712],
       [ 801,  802,  803,  804,  805,  806,  807,  808,  809,  810,  811,  812],
       [ 901,  902,  903,  904,  905,  906,  907,  908,  909,  910,  911,  912],
       [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010, 1011, 1012],
       [1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109, 1110, 1111, 1112],
       [1201, 1202, 1203, 1204, 1205, 1206, 1207, 1208, 1209, 1210, 1211, 1212]])

A.strides gives the number of bytes separating each level of the array. So here, each lines and then each elements in the lines. In my example A.strides is (96,8), because there are 8 bytes between two consecutive number (we don't really need to bother about bytes. we'll just copy those strides), and 96 between 2 lines (since there are 12 elements per lines, that is not surprising, but again, we don't really care. There could have been some gap between the lines, but that is not our problem).

np.lib.stride_tricks.as_strided give a new view of A, with a different shape, and a different way to switch from one level to another. It can even lead to some value being repeated. Note that it is just a view. No array is created here. It is a virtual array.

Here, we say that, instead of a NxN array, we want a (N/c)x(N/c)xcxc array. So a (N/c)x(N/c) arrays of blocks, block being cxc arrays of elements. And each we provide a stride of (c*stR, c*stC,stR,stC). Reading that from right to left, that means that inside a row of a block, there are the same amount of bytes than between 2 elements of a line of A. So we will have c succesive elements of A. stR means likewise that between two rows of a block, there is the same gap than between 2 rows of A. So we will also have c subsequent (incomplete but subsequent) rows of A. Then c*stC means that two subsequent blocks are separated by c numbers. And c*stR likewise.

So result is

array([[[[ 101,  102,  103,  104],
         [ 201,  202,  203,  204],
         [ 301,  302,  303,  304],
         [ 401,  402,  403,  404]],

        [[ 105,  106,  107,  108],
         [ 205,  206,  207,  208],
         [ 305,  306,  307,  308],
         [ 405,  406,  407,  408]],

        [[ 109,  110,  111,  112],
         [ 209,  210,  211,  212],
         [ 309,  310,  311,  312],
         [ 409,  410,  411,  412]]],


       [[[ 501,  502,  503,  504],
         [ 601,  602,  603,  604],
         [ 701,  702,  703,  704],
         [ 801,  802,  803,  804]],

        [[ 505,  506,  507,  508],
         [ 605,  606,  607,  608],
         [ 705,  706,  707,  708],
         [ 805,  806,  807,  808]],

        [[ 509,  510,  511,  512],
         [ 609,  610,  611,  612],
         [ 709,  710,  711,  712],
         [ 809,  810,  811,  812]]],


       [[[ 901,  902,  903,  904],
         [1001, 1002, 1003, 1004],
         [1101, 1102, 1103, 1104],
         [1201, 1202, 1203, 1204]],

        [[ 905,  906,  907,  908],
         [1005, 1006, 1007, 1008],
         [1105, 1106, 1107, 1108],
         [1205, 1206, 1207, 1208]],

        [[ 909,  910,  911,  912],
         [1009, 1010, 1011, 1012],
         [1109, 1110, 1111, 1112],
         [1209, 1210, 1211, 1212]]]])

Same array. But viewed as a 3x3 arrays of 4x4 numbers.

Again, no copy, no array were created here. It is just another organisation.

And here come the second technicality: vectorize. vectorize is a way to tell numpy to call a function on each elements of an array.

For a basic example

def f(a,b):
   return a*b

fv=np.vectorize(f)

fv is a function that can work on array of all shapes.

fv(np.arange(10), 2) returns array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18])

So it sees that the first argument is an array, iterates it, calls f on each of them, and produce an array of the results.

Exactly as * already does on numpy arrays.

Adding a signature stop iterations at a certain level: since I said that signature of switch was (c,c),()->(c,c) if I call the vectorized version with a (N/c,N/c,c,c) array, it ill iterates the (c,c) subarrays of it, but won't descend through all (c,) subarrays of them, nor all elements, because the atom here is (c,c) arrays.

Result is

array([[[[ 100,  200,  300,  400],
         [ 101,  201,  301,  401],
         [ 102,  202,  302,  402],
         [ 103,  203,  303,  403]],

        [[ 104,  204,  304,  404],
         [ 105,  205,  305,  405],
         [ 106,  206,  306,  406],
         [ 107,  207,  307,  407]],

        [[ 108,  208,  308,  408],
         [ 109,  209,  309,  409],
         [ 110,  210,  310,  410],
         [ 111,  211,  311,  411]]],


       [[[ 500,  600,  700,  800],
         [ 501,  601,  701,  801],
         [ 502,  602,  702,  802],
         [ 503,  603,  703,  803]],

        [[ 504,  604,  704,  804],
         [ 505,  605,  705,  805],
         [ 506,  606,  706,  806],
         [ 507,  607,  707,  807]],

        [[ 508,  608,  708,  808],
         [ 509,  609,  709,  809],
         [ 510,  610,  710,  810],
         [ 511,  611,  711,  811]]],


       [[[ 900, 1000, 1100, 1200],
         [ 901, 1001, 1101, 1201],
         [ 902, 1002, 1102, 1202],
         [ 903, 1003, 1103, 1203]],

        [[ 904, 1004, 1104, 1204],
         [ 905, 1005, 1105, 1205],
         [ 906, 1006, 1106, 1206],
         [ 907, 1007, 1107, 1207]],

        [[ 908, 1008, 1108, 1208],
         [ 909, 1009, 1109, 1209],
         [ 910, 1010, 1110, 1210],
         [ 911, 1011, 1111, 1211]]]])

So as you can see, all an array of 3x3 4x4 blocks, whose all blocks are the blocks from A, transposed, and subtracted with 1, as switch does

For example the second block on the first line of A is

View[0,1]
array([[105, 106, 107, 108],
       [205, 206, 207, 208],
       [305, 306, 307, 308],
       [405, 406, 407, 408]])

Which is consistent with A (numbers starting with 1,2,3,4 are indeed on the 1st, 2nd, 3rd and 4th lines, that is the 1st row of blocks. And numbers ending with 05,06,07,08, are in the 5th, 6th,7th, 8th column, that is the second column of blocks)

And the second block of the 1st line of blocks of the result is

result[0,1]
array([[104, 204, 304, 404],
       [105, 205, 305, 405],
       [106, 206, 306, 406],
       [107, 207, 307, 407]])

Which is indeed, with a difference of J=1 the same thing, transposed.

If you wish to recreate a 12x12 arrays with all the subresults, well, there you can use np.block

np.block([[result[i,j] for j in range(N//c)] for i in range(N//c)])

returns

array([[ 100,  200,  300,  400,  104,  204,  304,  404,  108,  208,  308,  408],
       [ 101,  201,  301,  401,  105,  205,  305,  405,  109,  209,  309,  409],
       [ 102,  202,  302,  402,  106,  206,  306,  406,  110,  210,  310,  410],
       [ 103,  203,  303,  403,  107,  207,  307,  407,  111,  211,  311,  411],
       [ 500,  600,  700,  800,  504,  604,  704,  804,  508,  608,  708,  808],
       [ 501,  601,  701,  801,  505,  605,  705,  805,  509,  609,  709,  809],
       [ 502,  602,  702,  802,  506,  606,  706,  806,  510,  610,  710,  810],
       [ 503,  603,  703,  803,  507,  607,  707,  807,  511,  611,  711,  811],
       [ 900, 1000, 1100, 1200,  904, 1004, 1104, 1204,  908, 1008, 1108, 1208],
       [ 901, 1001, 1101, 1201,  905, 1005, 1105, 1205,  909, 1009, 1109, 1209],
       [ 902, 1002, 1102, 1202,  906, 1006, 1106, 1206,  910, 1010, 1110, 1210],
       [ 903, 1003, 1103, 1203,  907, 1007, 1107, 1207,  911, 1011, 1111, 1211]])