I am working with the R programming language.
As a learning exercise, I generated a dataset of 20 random points from a normal distribution, wrote down the log-likelihood function corresponding to these 20 points, and then tried to optimize this function to recover the mean (mu) and the standard deviation (sigma).
First, I generated the random data:
y <- rnorm(20,5,5)
Then, I defined the log-likelihood function:
my_function <- function(x) {
    # log-likelihood of N(mu = x[1], sigma = x[2]) evaluated on y
    n = 20
    a = -n/2 * log(2 * pi)
    b = -n/2 * log(x[2]^2)
    c = -1/(2 * x[2]^2)
    d = sum((y - x[1])^2)
    return(a + b + c * d)
}
Then, I tried to optimize this function in three different ways, but all of them failed:
Method 1:
# load all libraries
library(pracma)
library(stats)
library(GA)
steep_descent(c(1, 1), my_function, maxiter = 10000)
$xmin
[1] NA
$fmin
[1] NA
$niter
[1] 10001
Warning message:
In steep_descent(c(1, 1), my_function, maxiter = 10000) :
Maximum number of iterations reached -- not converged.
Method 2:
optim(c(1,1), my_function)
$par
[1] -8.487500e+00 1.776357e-16
$value
[1] -5.559631e+34
$counts
function gradient
501 NA
$convergence
[1] 1
$message
NULL
Method 3:
# I redefined the function for the GA library
my_function <- function(x_1, x_2) {
    n = 20
    a = -n/2 * log(2 * pi)
    b = -n/2 * log(x_2^2)
    c = -1/(2 * x_2^2)
    d = sum((y - x_1)^2)
    return(a + b + c * d)
}
GA <- ga(type = "real-valued",
fitness = function(x) my_function(x[1], x[2]),
lower = c(-20,20), upper = c(-20,20),
popSize = 50, maxiter = 1000, run = 100)
GA@solution
x1 x2
[1,] -20 20
In all 3 methods, the correct answer is never found; in fact, none of the 3 methods finds anything close to it.
As per statistical theory, the correct answer (based on the randomly generated data for this question) should be:
> mu = mean(y)
> sigma = sd(y)
> mu
[1] 3.937513
> sigma
[1] 4.707227
Can someone please show me what I am doing wrong and how I can fix it? (i.e., have the optimization methods return answers that are closer to the correct ones)
Thanks!
Answer:
Both optim and steep_descent perform minimization (you want maximization). In GA your parameters are effectively fixed: you passed lower = c(-20, 20) and upper = c(-20, 20), so each parameter has lower = upper and the search space collapses to the single point (-20, 20).
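Before rewriting anything, note that the quickest fix for Methods 1 and 2 is to hand the minimizers the negative log-likelihood. A minimal sketch, assuming your original my_function and y are still in scope (the name neg_loglik is my own):
neg_loglik <- function(x) -my_function(x)   # minimizing this maximizes the likelihood
optim(c(1, 1), neg_loglik)$par              # should land near mean(y) and (up to sign) sd(y)
To adapt your code more thoroughly, rewrite the log-likelihood so the data are passed in explicitly: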
my_function <- function(param, vec) {
    # log-likelihood of N(mu = param[[1]], sigma^2 = param[[2]]), up to the
    # constant -n/2 * log(2 * pi); note that param[[2]] is the *variance*
    -length(vec)/2 * log(param[[2]]) - 1/(2 * param[[2]]) * sum((vec - param[[1]])^2)
}
optim
res1 <- optim(
method = "L-BFGS-B",
par = c(0, 1),
fn = my_function,
vec = y,
lower = c(-20, 0.01),
upper = c(20, 40),
control = list(fnscale = -1)
)
The negative value for fnscale tells optim to perform maximization.
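Since the rewritten function is parameterized in the variance, the second element of the solution is sigma^2, not sigma; take the square root to recover sigma:
res1$par[1]         # estimate of mu
sqrt(res1$par[2])   # estimate of sigma (res1$par[2] is the variance)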
GA
res2 <- GA::ga(
type = "real-valued",
popSize = 100,
fitness = my_function,
lower = c(-20, 0),
upper = c(20, 40),
vec = y,
maxiter = 1000,
run = 100,
optim = TRUE
)
Note that ga() maximizes the fitness by default, so no sign flip is needed here. It's also good to set optim = TRUE, which enables a local search step (the package supports it only for real-valued problems).
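To inspect the run afterwards, the GA package provides summary and plot methods for the fitted object, for example:
summary(res2)   # best fitness value and solution(s) found
plot(res2)      # best and mean fitness by generation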
DEoptim
You can also use differential evolution (it also performs minimization, hence the negation below):
res3 <- DEoptim::DEoptim(
    fn = function(param, vec) -my_function(param, vec),
    lower = c(-20, 0),
    upper = c(20, 40),
    vec = y,
    control = list(NP = 100, itermax = 1000)
)
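Because DEoptim minimizes the negated function, the reported best value is the negated log-likelihood; flip the sign back if you want the log-likelihood at the optimum:
res3$optim$bestmem    # parameter estimates (mu, sigma^2)
-res3$optim$bestval   # maximized log-likelihood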
Results:
> res1$par
[1] 1.935102 18.683264
> res2@solution[1,]
x1 x2
1.935101 18.683222
> res3$optim$bestmem
par1 par2
1.935101 18.683252
(To reproduce these exact numbers you need to use the same random seed when generating y.)
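As a final sanity check, you can compare any of these solutions against the closed-form MLEs. Note that the MLE of the variance divides by n, so its square root differs slightly from sd(y), which divides by n - 1. A sketch, reusing the y generated above:
mu_hat     <- mean(y)                           # MLE of mu
sigma2_hat <- sum((y - mu_hat)^2) / length(y)   # MLE of sigma^2 (divides by n)
c(mu_hat, sigma2_hat)                           # compare with res1$par
sqrt(sigma2_hat)                                # MLE of sigma; close to, but not exactly, sd(y)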