I am working with the R programming language. I am trying to optimize a function that can accept numerical and factor inputs.
For the optimization, I use the GA library.
My references: the demo, the actual library, and the specific function I'm using.
Suppose I have a function that looks like this:
my_function <- function(r1, r2) {
  # define function here, e.g.:
  # this "select" can be done using "dplyr" or SQL
  part1 <- SELECT * FROM my_data WHERE (col_1 IN r1) AND (col_2 > r2)
  part2 <- mean(part1$col_3)
}
In this example:
- r1 can take any group of values of a, b, c, d (factor variable), e.g. r1 = a, r1 = a,d, r1 = b,c,a, r1 = c, r1 = a,b,c,d, etc.
- r2 can take a single value between 1 and 100 (numeric variable)
- my_data is a dataset that has 3 columns: col_1 (factor, can only take values a, b, c, d), col_2 (numeric), col_3 (numeric)
- my_data will be "subsetted" according to r1 and r2
- the mean of col_3 is the value that my_function will return given a choice of r1 and r2
- the mean of col_3 will be the value that I am trying to optimize for a choice of r1 and r2
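To make the description concrete, a made-up my_data with these three columns could be generated as sketched below. The seed, the row count n, and the distributions used for col_2 and col_3 are arbitrary assumptions for illustration only, not the real data.

```r
# Hypothetical example data matching the description above.
set.seed(123)  # arbitrary seed, only so the example is reproducible
n <- 200       # arbitrary number of rows

my_data <- data.frame(
  # factor column restricted to the four allowed levels
  col_1 = factor(sample(c("a", "b", "c", "d"), n, replace = TRUE),
                 levels = c("a", "b", "c", "d")),
  # numeric column; the 1..100 range is an assumption mirroring the range of r2
  col_2 = runif(n, min = 1, max = 100),
  # numeric column whose subset mean is the quantity of interest
  col_3 = rnorm(n)
)

str(my_data)
```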
Problem: Currently, I am trying to optimize my_function using the ga function in R:
library(GA)
GA <- ga(type = "real-valued",
         fitness = function(x) my_function(x[1], x[2]),
         lower = c(c("a", "b", "c", "d"), 1), upper = c(c("a", "b", "c", "d"), 100),
         popSize = 50, maxiter = 1000, run = 100)
But I am not sure how to set this up correctly.
I am not sure how to correctly define my_function, and I am not sure how to correctly define GA.
CodePudding user response:
I think you are looking for something like this:
library("dplyr")
df <- data.frame(a = rep(letters[1:3], each = 2),
                 b = rep(c(1, 9), 3),
                 c = 1:6)
df
#>   a b c
#> 1 a 1 1
#> 2 a 9 2
#> 3 b 1 3
#> 4 b 9 4
#> 5 c 1 5
#> 6 c 9 6
my_subset_mean <- function(r1, r2) { ## Assumes an object `df` with cols a|b|c
  subset <- df %>% filter(a %in% r1, b > r2)
  return(mean(subset$c))
}
my_subset_mean(r1 = c("a"), r2 = 5) ## ~mean(2)
#> [1] 2
my_subset_mean(r1 = c("a", "b"), r2 = 0) ## ~mean(1:4)
#> [1] 2.5
my_subset_mean(r1 = c("a", "b"), r2 = 10) ## ~mean of df with 0 rows
#> [1] NaN
Created on 2021-09-25 by the reprex package (v2.0.0)
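For the ga() part of the question, one possible way to handle the mixed input is sketched below. It is an assumption on my side, not something the GA package prescribes. The vector c(c("a", "b", "c", "d"), 1) is coerced to character, while type = "real-valued" needs numeric bounds, so the sketch encodes r1 as one 0..1 inclusion flag per group plus one extra gene for r2. The helper names groups, decode and fitness_fn are hypothetical, base R subsetting is used instead of dplyr to keep the example dependency-free, and the 1..100 bounds for r2 are taken from the question. Empty subsets get a finite penalty because their mean is NaN.

```r
# Toy data with the same shape as `df` above: a (group), b (numeric), c (numeric).
df <- data.frame(a = rep(letters[1:3], each = 2),
                 b = rep(c(1, 9), 3),
                 c = 1:6)

# Mixing letters and numbers in c() yields a character vector,
# which cannot serve as numeric bounds:
is.character(c(c("a", "b", "c", "d"), 1))

# Candidate values for r1 (here "a", "b", "c").
groups <- sort(unique(as.character(df$a)))

# Decode a numeric chromosome: the first length(groups) genes are inclusion
# flags (>= 0.5 means "include this group"), the last gene is r2.
decode <- function(x) {
  k <- length(groups)
  list(r1 = groups[x[seq_len(k)] >= 0.5], r2 = x[k + 1])
}

# Fitness = mean of c over the selected rows.
# An empty selection returns a finite value below any achievable mean.
fitness_fn <- function(x) {
  p <- decode(x)
  rows <- df[df$a %in% p$r1 & df$b > p$r2, , drop = FALSE]
  if (nrow(rows) == 0) return(min(df$c) - 1)
  mean(rows$c)
}

# Run the search only if the GA package is installed.
if (requireNamespace("GA", quietly = TRUE)) {
  k <- length(groups)
  res <- GA::ga(type = "real-valued",
                fitness = fitness_fn,
                lower = c(rep(0, k), 1),    # flags in [0, 1], r2 from 1 ...
                upper = c(rep(1, k), 100),  # ... to 100, as in the question
                popSize = 50, maxiter = 200, run = 50,
                monitor = FALSE, seed = 1)
  print(decode(res@solution[1, ]))  # best r1 / r2 found (first row if tied)
  print(res@fitnessValue)           # corresponding mean of c
}
```

ga() maximizes the fitness, so this searches for the r1/r2 combination with the largest subset mean; negating the return value of fitness_fn would search for the smallest instead. The 0.5 threshold and the penalty value are design choices you may want to adjust.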