Inspired by François Chollet's book "Deep Learning with Python" (1st edition), I'm trying to generate a picture that maximizes a prediction of a VGG16 model.
The original procedure for intermediate layers is described here (from cell 12 on). Essentially, it amounts to gradient ascent on the input image:
import keras, matplotlib.pyplot as plt, numpy as np
from keras import backend as K, models
from keras.applications.vgg16 import decode_predictions, preprocess_input, VGG16
from keras.models import load_model
from keras.preprocessing import image
model = VGG16(weights='imagenet')
layer_name = 'block3_conv1'
filter_index = 0
layer_output = model.get_layer(layer_name).output
loss = K.mean(layer_output[:, :, :, filter_index])  # mean activation of the chosen filter
grads = K.gradients(loss, model.input)[0]  # gradient of the loss w.r.t. the input image
grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)  # normalization trick from the book
iterate = K.function([model.input], [loss, grads])
loss_value, grads_value = iterate([np.zeros((1, 150, 150, 3))])
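For context, iterate is then run in a small gradient-ascent loop on the input image, roughly as in the book; a minimal sketch (the starting image, step size and iteration count are illustrative choices):
input_img_data = np.random.random((1, 150, 150, 3)) * 20 + 128.  # start from a gray image with some noise
step = 1.  # gradient ascent step size
for _ in range(40):
    loss_value, grads_value = iterate([input_img_data])
    input_img_data += grads_value * step  # nudge the image to increase the filter activation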
To reproduce this for the final prediction, I considered that the final layer outputs a 1000-dimensional vector (corresponding to the 1000 ImageNet classes in the VGG16 case), of which only one index needs to be maximized, say 285 for "cat".
Accordingly, I have modified the code slightly:
layer_pred_name = 'predictions'
pred_index = 285
layer_pred_output = model.get_layer(layer_pred_name).output
loss_pred = K.mean(layer_pred_output[:, pred_index])  # mean activation of output unit 285 ("cat")
grads_pred = K.gradients(loss_pred, model.input)[0]
grads_pred /= (K.sqrt(K.mean(K.square(grads_pred))) + 1e-5)
iterate_pred = K.function([model.input], [loss_pred, grads_pred])
loss_pred_value, grads_pred_value = iterate_pred([np.zeros((1, 150, 150, 3))])
However, I then unfortunately get the following error:
InvalidArgumentError: Matrix size-incompatible: In[0]: [1,8192], In[1]: [25088,4096]
[[{{node fc1/MatMul}} = MatMul[T=DT_FLOAT, _class=["loc:@gradients_1/fc1/MatMul_grad/MatMul"], transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](flatten/Reshape, fc1/kernel/read)]]
Actually, the dimensions seem to fit, which is why I can't understand the error. Any ideas on how to fix this would be appreciated.
CodePudding user response:
In the end I found a workaround by writing my own random-search routine that minimizes the difference between the model's prediction and a given target prediction:
def prediction_leastquares(input1, input2):
    # Euclidean (L2) distance between two prediction vectors
    leastsquare = 0
    for idx in range(len(input1)):
        leastsquare = leastsquare + (input1[idx] - input2[idx])**2
    leastsquare = leastsquare**(1/2)
    return leastsquare
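As an aside, this just computes the Euclidean distance between the two vectors, so it could equally be written as a NumPy one-liner:
def prediction_leastquares(input1, input2):
    # Euclidean (L2) distance, vectorized with NumPy
    return np.linalg.norm(np.asarray(input1) - np.asarray(input2))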
# Target prediction: a one-hot vector for class 285 ("cat")
opt_pred = np.zeros(1000)
opt_pred[285] = 1

# x is assumed to be the preprocessed input image batch from earlier, shape (1, height, width, 3)
x2 = np.zeros(x.shape) + 100  # start from a constant image
x2 = np.array(x2)
predsdiff2 = 2  # initial best distance; 2 is an upper bound here

for i in range(10000):
    preds2 = model.predict(x2)
    # Propose a candidate by adding Gaussian noise to the current best image
    x1 = x2.copy()
    x1 = x1 + np.random.normal(loc=0.0, scale=1, size=[1, x1.shape[1], x1.shape[2], 3])
    preds1 = model.predict(x1)
    predsdiff1 = prediction_leastquares(preds1[0], opt_pred)
    # Keep the candidate if it gets closer to the target prediction
    if (predsdiff1 < predsdiff2):
        predsdiff2 = predsdiff1
        x2 = x1.copy()
The final output is a random-looking image that is classified as "cat" with very high confidence; in effect, a hands-on adversarial attack.
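To sanity-check the result, the optimized image can simply be fed back through the model; a minimal sketch using the decode_predictions helper imported above:
preds = model.predict(x2)
print('Confidence for class 285:', preds[0][285])
print(decode_predictions(preds, top=3)[0])  # top-3 ImageNet labels for the generated image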