How to do 2D Convolution only at a specific location?

Time:07-08

This question has been asked multiple times, but I still could not find what I was looking for. Imagine

data=np.random.rand(N,N)   #shape N x N
kernel=np.random.rand(3,3) #shape M x M

I know convolution typically means sliding the kernel over the entire data array. But in my case N and M are on the order of 10000, so I wish to get the value of the convolution at one specific location in the data, say at (10,37), without doing unnecessary calculations at all other locations. The output will then be just a single number. The main goal is to reduce the computation and memory cost. Is there any built-in function that does this with minimal adjustments?

CodePudding user response:

Indeed, the convolution at a particular position is simply the sum over the entries of the elementwise product of the corresponding submatrix of data and the flipped kernel. Here is a reproducible example.

Code

import numpy as np

N = 1000
M = 3

np.random.seed(777)
data  = np.random.rand(N,N)   #shape N x N
kernel= np.random.rand(M,M)   #shape M x M

# Elementwise product of the M x M data window and the flipped kernel
data[10:10+M, 37:37+M] * kernel[::-1, ::-1]
>array([[0.70980514, 0.37426475, 0.02392947],
       [0.24387766, 0.1985901 , 0.01103323],
       [0.06321042, 0.57352696, 0.25606805]])

with output

conv = np.sum(data[10:10+M, 37:37+M] * kernel[::-1, ::-1])
conv
>2.45430578

The kernel is flipped because the definition of convolution requires it, as explained here; this was kindly pointed out by Warren Weckesser. Thanks!

The key is to make sense of the index you provided. I assumed it refers to the upper left corner of the sub-matrix in data. However, it can refer to the midpoint as well when M is odd.
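To make the two index conventions concrete, here is a minimal sketch. The helper names conv_at_corner and conv_at_center are made up for illustration, and both assume a square kernel and a window that lies fully inside data (no boundary handling).

```python
import numpy as np

def conv_at_corner(data, kernel, i, j):
    """Convolution value where (i, j) is the upper-left corner of the window."""
    M = kernel.shape[0]
    window = data[i:i + M, j:j + M]
    return float(np.sum(window * kernel[::-1, ::-1]))

def conv_at_center(data, kernel, i, j):
    """Convolution value where (i, j) is the midpoint of the window (odd M only)."""
    M = kernel.shape[0]
    h = M // 2  # half-width of the kernel
    window = data[i - h:i + h + 1, j - h:j + h + 1]
    return float(np.sum(window * kernel[::-1, ::-1]))

rng = np.random.default_rng(0)
data = rng.random((50, 50))
kernel = rng.random((3, 3))

# The two conventions differ only by a shift of M // 2 in each index.
corner_value = conv_at_corner(data, kernel, 10, 37)
center_value = conv_at_center(data, kernel, 11, 38)
print(np.isclose(corner_value, center_value))
```

Under the midpoint convention, the result corresponds to entry (i, j) of a "same"-sized convolution output, which is probably the more common reading of "the convolution at (10,37)".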

Concept

A different example with N=7 and M=3 illustrates the idea; it is presented here for the kernel

kernel = np.array([[3,0,-1], [2,0,1], [4,4,3]])

which, when flipped, yields

kernel[::-1, ::-1]
> array([[ 3,  4,  4],
         [ 1,  0,  2],
         [-1,  0,  3]])

EDIT 1:

Please note that the lecturer in this video does not explicitly mention that flipping the kernel is required before the pointwise multiplication to adhere to the mathematically proper definition of convolution.
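As a small illustration of why the flip matters, the sketch below uses the kernel from above on a made-up 7 x 7 array (np.arange is only a stand-in, not the data from the video). Without the flip, the sum is a cross-correlation rather than a convolution, and the two generally differ for a kernel that is not symmetric under a 180 degree rotation.

```python
import numpy as np

kernel = np.array([[3, 0, -1], [2, 0, 1], [4, 4, 3]])
data = np.arange(49).reshape(7, 7)  # made-up data, for illustration only

window = data[2:5, 2:5]  # 3 x 3 window with upper-left corner (2, 2)

conv = np.sum(window * kernel[::-1, ::-1])  # convolution: kernel flipped
corr = np.sum(window * kernel)              # cross-correlation: no flip

print(conv, corr)
```

If the kernel equals its own flip, both sums coincide, which is why the distinction is easy to overlook.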

EDIT 2:

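For large M and a target index close to the boundary of data, a ValueError: operands could not be broadcast together with shapes ... might be thrown, because the slice of data is truncated at the edge and no longer matches the shape of the kernel. Padding the matrix data with zeros prevents this (although it increases the memory requirement). Note that after padding with pad_width=M, every index into data is shifted by M. I.e.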

data   = np.pad(data, pad_width=M, mode='constant')
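If the extra memory for the padded copy is a concern, one possible alternative is to clip the window to the valid range and slice the flipped kernel to match. This is only a sketch: conv_at_clipped is a hypothetical helper, it uses the upper-left-corner convention, and it treats everything outside data as zero, which should give the same result as zero padding.

```python
import numpy as np

def conv_at_clipped(data, kernel, i, j):
    """Convolution value for the window whose upper-left corner is (i, j).

    Parts of the window outside data count as zeros, so no padded
    copy of data is allocated.
    """
    m_rows, m_cols = kernel.shape
    flipped = kernel[::-1, ::-1]
    # Clip the window to the valid index range of data.
    r0, r1 = max(i, 0), min(i + m_rows, data.shape[0])
    c0, c1 = max(j, 0), min(j + m_cols, data.shape[1])
    if r0 >= r1 or c0 >= c1:
        return 0.0  # window lies entirely outside data
    window = data[r0:r1, c0:c1]
    # Take the part of the flipped kernel that overlaps the clipped window.
    kernel_part = flipped[r0 - i:r1 - i, c0 - j:c1 - j]
    return float(np.sum(window * kernel_part))

rng = np.random.default_rng(0)
data = rng.random((20, 20))
kernel = rng.random((5, 5))
M = kernel.shape[0]

# Reference: zero-pad by M on every side, then shift the indices by M.
padded = np.pad(data, pad_width=M, mode='constant')
i, j = 18, -2  # window overhangs the bottom and left edges
reference = np.sum(padded[i + M:i + 2 * M, j + M:j + 2 * M] * kernel[::-1, ::-1])
print(np.isclose(conv_at_clipped(data, kernel, i, j), reference))
```

The clipped version only touches at most M x M entries of data, so its cost does not depend on N.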