Neural network: predictions are just the same prediction despite changes in the model parameters

Time:01-05

I'd like to fit a neural network using brulee, but no matter how I change the model parameters (I have tried changing all of them), the predictions always come out as almost the same value. In my case:

# Packages used below
library(brulee)
library(dplyr)    # %>% and bind_cols()
library(ggplot2)

# Open the data set
data_train_sub <- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/cc_test_ds.csv")

# Model parameters
hidden_units <-c(4)
epochs <-c(50)
dropout <-c(0.01)
learn_rate <- c(0.01)
activation <- c("relu")
penalty <- c(0.01)
validation <-c(0.80)

# Training data set
data_train <- data_train_sub[1:1250,]

# Validation data set
data_test <-  data_train_sub[1251:1500,]

# Model fitting
fit <- brulee_mlp(x = as.matrix(data_train[, 2:ncol(data_train)]),
               y = data_train$cc,
               hidden_units = hidden_units,
               epochs = epochs, dropout = dropout, learn_rate = learn_rate, activation = activation,
               penalty = penalty,validation=validation)
# Plot predicted vs. observed values
predict(fit, data_test) %>%
   bind_cols(data_test) %>%
   ggplot(aes(x = .pred, y = cc)) +
   geom_abline(col = "green") +
   geom_point(alpha = .3) +
   lims(x = c(0, 1.0), y = c(0, 1.0)) +
   coord_fixed(ratio = 1)


This sounds strange to me. I would appreciate any help.

Thanks in advance!

CodePudding user response:

The main issues were the two extreme outliers in the training data and the fact that the predictors were not standardized to a common scale.
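To illustrate the standardization point, here is a minimal base R sketch on synthetic data (the column names `x1` and `x2` and all values are made up for illustration, not taken from the question's data set). It estimates the centering and scaling constants on the training rows only and reuses them for new rows, which is roughly what a normalization step in a recipe is meant to do.

```r
# Synthetic predictors on very different scales (illustrative data only)
set.seed(123)
train <- data.frame(x1 = rnorm(100, mean = 1000, sd = 250),
                    x2 = rnorm(100, mean = 0.5,  sd = 0.1))
new   <- data.frame(x1 = rnorm(20,  mean = 1000, sd = 250),
                    x2 = rnorm(20,  mean = 0.5,  sd = 0.1))

# Estimate centering and scaling constants from the training set only
train_means <- colMeans(train)
train_sds   <- apply(train, 2, sd)

# Apply the same constants to both sets so they share one scale
train_std <- scale(train, center = train_means, scale = train_sds)
new_std   <- scale(new,   center = train_means, scale = train_sds)

# After standardizing, each training column has mean ~0 and sd 1
colMeans(train_std)
apply(train_std, 2, sd)
```

Without this step, a predictor measured in the thousands can dominate the gradient updates, which is one plausible reason a small network ends up predicting nearly the same value for every row.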

Although the model still doesn't fit well, here is a modified, more complex version that at least gives varying predicted values. I also added PCA, which helps a little (you could leave that step out of the recipe).
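For readers unfamiliar with the PCA step, the sketch below shows the general idea with base R's `prcomp()` on synthetic data. This is a stand-in to show what normalizing followed by PCA does to correlated predictors; the columns `a`, `b`, and `c` are hypothetical and unrelated to the question's data.

```r
# Synthetic data: columns a and b are strongly correlated, c is independent
set.seed(42)
n <- 200
z <- rnorm(n)
X <- data.frame(a = z + rnorm(n, sd = 0.1),
                b = 100 * z + rnorm(n, sd = 10),
                c = rnorm(n))

# Center and scale the columns, then rotate onto principal components
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Component scores: these would replace the original predictors
scores <- predict(pca, newdata = X)

# Share of total variance captured by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained, 3)
```

The resulting scores are uncorrelated with each other, which can make optimization a little easier for the network; as noted, the gain here was small.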

library(tidymodels)
library(brulee)

tidymodels_prefer()
theme_set(theme_bw())
options(pillar.advice = FALSE, pillar.min_title_chars = Inf)

data_train_sub <- read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/cc_test_ds.csv")

# Model parameters
hidden_units <-c(50) # more hidden units
epochs <-c(500) # more iterations
dropout <-c(0) # since we are using penalization, no dropout
learn_rate <- c(0.01)
activation <- c("relu")
penalty <- c(0.01)
validation <-c(0.20) # hold out 20%

# Training data set
data_train <- data_train_sub[1:1250,]

# There are two extreme outliers: 
data_train_2 <- data_train %>% slice(-c(64, 162))

# Validation data set
data_test <-  data_train_sub[1251:1500,]

rec <- 
  recipe(cc ~ ., data = data_train_2) %>% 
  step_normalize(all_predictors()) %>% 
  step_pca(all_predictors()) 

set.seed(1)
# Model fitting
fit <- brulee_mlp(rec, data = data_train_2,
                  hidden_units = hidden_units,
                  epochs = epochs, dropout = dropout, learn_rate = learn_rate, activation = activation,
                  penalty = penalty,validation=validation)

# check convergence
autoplot(fit)


# Plot predicted vs. observed values
predict(fit, data_test) %>%
  bind_cols(data_test) %>%
  ggplot(aes(x = .pred, y = cc)) +
  geom_abline(col = "green") +
  geom_point(alpha = .3) +
  lims(x = c(0, 1.0), y = c(0, 1.0)) +
  coord_fixed(ratio = 1)

Created on 2023-01-04 by the reprex package (v2.0.1)
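The code above drops rows 64 and 162 as outliers without showing how they were found. One possible way to flag such rows, sketched here on synthetic data, is a robust z-score based on the median and MAD. The helper `flag_outliers()` and its cutoff of 5 are hypothetical choices for illustration, not the method used in the answer.

```r
# Hypothetical helper: indices whose robust z-score exceeds a cutoff
flag_outliers <- function(x, cutoff = 5) {
  robust_z <- abs(x - median(x)) / mad(x)
  which(robust_z > cutoff)
}

# Synthetic outcome in [0, 1] with two injected extreme values
set.seed(7)
y <- runif(100, min = 0, max = 1)
y[c(10, 75)] <- c(25, 40)

flag_outliers(y)  # flags positions 10 and 75
```

Any rows flagged this way are worth inspecting by hand before removing them, since a large value can be a legitimate observation rather than an error.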
