I've recently been studying the Deep Residual Shrinkage Network (the original paper's authors published code in both Keras and TFLearn versions), and I wanted to adapt the Keras version to run on CIFAR-10. It turns out to overfit badly: training-set accuracy is very good, but val accuracy is only 60-something percent, even after trying Dropout and regularization. I really can't find the reason. Any advice or comments would be greatly appreciated! Here is the link to the original paper's write-up:
https://blog.csdn.net/Jordanisxu/article/details/105007339
Here is my code. Thanks in advance, everyone!
from __future__ import print_function
import keras
import numpy as np
from keras.datasets import cifar10
from keras.layers import Dense, Conv2D, BatchNormalization, Activation
from keras.layers import AveragePooling2D, Input, GlobalAveragePooling2D
from keras.optimizers import Adam
from keras.regularizers import l2
from keras import backend as K
from keras.models import Model
from keras.layers.core import Lambda
from keras.callbacks import LearningRateScheduler

K.set_learning_phase(1)
# Input image dimensions
img_rows, img_cols = 32, 32

# The data, split between train and test sets
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 3, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 3, img_rows, img_cols)
    input_shape = (3, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 3)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 3)
    input_shape = (img_rows, img_cols, 3)
# Noised data
x_train = x_train.astype('float32') / 255. + 0.1 * np.random.random([x_train.shape[0], img_rows, img_cols, 3])
x_test = x_test.astype('float32') / 255. + 0.1 * np.random.random([x_test.shape[0], img_rows, img_cols, 3])
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# Convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)
def scheduler(epoch):
    if epoch % 50 == 0 and epoch != 0:
        lr = K.get_value(model.optimizer.lr)
        K.set_value(model.optimizer.lr, lr * 0.1)
        print("lr changed to {}".format(lr * 0.1))
    return K.get_value(model.optimizer.lr)
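
# Side note on the scheduler: the intent is to cut the learning rate by 10x at
# epochs 50, 100, 150, and so on. A quick standalone sanity check of that
# schedule, assuming Adam's default starting rate of 1e-3; expected_lr is just
# an illustration and is not used by the model:
def expected_lr(epoch, base_lr=1e-3):
    # Learning rate after the 10x drops at epochs 50, 100, 150, ...
    return base_lr * (0.1 ** (epoch // 50))
# expected_lr(49) -> 0.001, expected_lr(50) -> 0.0001, expected_lr(100) -> 1e-05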

def abs_backend(inputs):
    return K.abs(inputs)

def expand_dim_backend(inputs):
    return K.expand_dims(K.expand_dims(inputs, 1), 1)

def sign_backend(inputs):
    return K.sign(inputs)

def pad_backend(inputs, in_channels, out_channels):
    # Zero-pad along the channel axis by treating channels as a third
    # spatial dimension of a temporary 5-D tensor
    pad_dim = (out_channels - in_channels) // 2
    inputs = K.expand_dims(inputs, -1)
    inputs = K.spatial_3d_padding(inputs, ((0, 0), (0, 0), (pad_dim, pad_dim)), 'channels_last')
    return K.squeeze(inputs, -1)
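
# For anyone skimming: abs_backend, sign_backend, and the subtract/maximum
# layers inside the block below together implement soft thresholding,
# y = sign(x) * max(|x| - tau, 0), the core operation of the shrinkage block.
# A minimal NumPy illustration (standalone demo, not part of the model;
# reuses the np imported at the top):
def soft_threshold_demo(x, tau):
    # Values with |x| <= tau are zeroed; larger magnitudes shrink toward zero by tau
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

print(soft_threshold_demo(np.array([-1.5, -0.2, 0.1, 2.0]), 0.5))
# -> [-1.  -0.   0.   1.5]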

# Residual Shrinkage Block
def residual_shrinkage_block(incoming, nb_blocks, out_channels, downsample=False,
                             downsample_strides=2):
    residual = incoming
    in_channels = incoming.get_shape().as_list()[-1]

    for i in range(nb_blocks):

        identity = residual

        if not downsample:
            downsample_strides = 1

        residual = BatchNormalization()(residual)
        residual = Activation('relu')(residual)
        residual = Conv2D(out_channels, 3, strides=(downsample_strides, downsample_strides),
                          padding='same', kernel_initializer='he_normal',
                          kernel_regularizer=l2(1e-4))(residual)

        residual = BatchNormalization()(residual)
        residual = Activation('relu')(residual)
        residual = Conv2D(out_channels, 3, padding='same', kernel_initializer='he_normal',
                          kernel_regularizer=l2(1e-4))(residual)

        # Calculate global means
        residual_abs = Lambda(abs_backend)(residual)
        abs_mean = GlobalAveragePooling2D()(residual_abs)

        # Calculate scaling coefficients
        scales = Dense(out_channels, activation=None, kernel_initializer='he_normal',
                       kernel_regularizer=l2(1e-4))(abs_mean)
        scales = BatchNormalization()(scales)
        scales = Activation('relu')(scales)
        scales = Dense(out_channels, activation='sigmoid', kernel_regularizer=l2(1e-4))(scales)
        scales = Lambda(expand_dim_backend)(scales)

        # Calculate thresholds: per-channel tau = mean(|x|) * sigmoid(scales)
        thres = keras.layers.multiply([abs_mean, scales])

        # Soft thresholding
        sub = keras.layers.subtract([residual_abs, thres])
        zeros = keras.layers.subtract([sub, sub])
        n_sub = keras.layers.maximum([sub, zeros])
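My paste got cut off at the end. The remainder finishes the loop body and then builds and trains the model; this part follows the original author's published Keras example, so the network width (8 channels, 1 block), the optimizer (Adam with the scheduler above), and the batch size/epoch count below are that example's values and placeholders rather than exactly what I last ran:

        # Soft thresholding output: sign(residual) * max(|residual| - thres, 0)
        residual = keras.layers.multiply([Lambda(sign_backend)(residual), n_sub])

        # Downsample the identity branch to match the spatial size
        if downsample_strides > 1:
            identity = AveragePooling2D(pool_size=(1, 1), strides=(2, 2))(identity)

        # Zero-pad the identity branch to match the channel count
        if in_channels != out_channels:
            identity = Lambda(pad_backend, arguments={'in_channels': in_channels,
                                                      'out_channels': out_channels})(identity)

        residual = keras.layers.add([residual, identity])

    return residual

# Define and train the model
inputs = Input(shape=input_shape)
net = Conv2D(8, 3, padding='same', kernel_initializer='he_normal',
             kernel_regularizer=l2(1e-4))(inputs)
net = residual_shrinkage_block(net, 1, 8, downsample=True)
net = BatchNormalization()(net)
net = Activation('relu')(net)
net = GlobalAveragePooling2D()(net)
outputs = Dense(10, activation='softmax', kernel_initializer='he_normal',
                kernel_regularizer=l2(1e-4))(net)
model = Model(inputs=inputs, outputs=outputs)
model.compile(loss='categorical_crossentropy', optimizer=Adam(), metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=100, epochs=200, verbose=1,
          validation_data=(x_test, y_test),
          callbacks=[LearningRateScheduler(scheduler)])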