I have a dataframe which looks like this example (just much larger):
Name <- c('Peter','Peter','Peter', 'Ben','Ben','Ben','Mary', 'Mary', 'Mary')
var1 <- c(0.4, 0.6, 0.7, 0.3, 0.9, 0.2, 0.4, 0.6, 0.7)
var2 <- c(0.5, 0.4, 0.2, 0.5, 0.4, 0.2, 0.1, 0.4, 0.2)
var3 <- c(0.2, 0.6, 0.9, 0.5, 0.5, 0.2, 0.5, 0.5, 0.2)
df <- data.frame(Name, var1, var2, var3)
df
I split my dataframe in order to apply a function to every group.
list_split <- split(df[, 2:4], df$Name)
my_list <- vector("list", length(list_split))
for (i in seq_along(list_split)) {
  # each element is a one-element list holding the per-column summaries
  my_list[[i]] <- list(lapply(list_split[[i]], summary))
}
After that I wrote a function so that, if the mean of the values in 'my_list' is larger than 0.9, the differences of the values in 'list_split' are taken, and otherwise the values are kept unchanged. (Please ignore that the operation does not make any sense; my original function is very different.)
l <- list()
fun <- function(x,y) {ifelse(mean(x) > 0.9,diff(y),y)}
for (j in seq_along(list_split)){
for (i in seq_along(my_list)){
u <- mapply(fun,my_list[[i]][[1]],list_split[[j]], SIMPLIFY = FALSE)
l[[j]] <- u
}
}
I want the function to be applied to all values of the 'var' columns in the dataframes in 'list_split'.
For example, for list_split[["Ben"]] the values are:
  var1 var2 var3
4  0.3  0.5  0.5
5  0.9  0.4  0.5
6  0.2  0.2  0.2
But it is just applied to the first value of every 'var', so that the resulting list for the first element looks like this:
l[[1]]
$var1
[1] 0.3
$var2
[1] 0.5
$var3
[1] 0.5
So how can I apply the function to all values in every 'list_split' element and end up with a list that exactly preserves the structure of 'list_split', that is, a list of dataframes?
Thank you!
CodePudding user response:
We could try
Map(\(x, y) {
x[] <- Map(\(u, v) if(mean(v) > 0.9) c(NA, diff(u)) else u, x, y)
x
}, list_split, lapply(my_list, \(x) do.call("c", x)))
Output:
$Ben
  var1 var2 var3
4  0.3  0.5  0.5
5  0.9  0.4  0.5
6  0.2  0.2  0.2

$Mary
  var1 var2 var3
7  0.4  0.1  0.5
8  0.6  0.4  0.5
9  0.7  0.2  0.2

$Peter
  var1 var2 var3
1  0.4  0.5  0.2
2  0.6  0.4  0.6
3  0.7  0.2  0.9
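For anyone wondering why the original `fun` returned only the first value of each column: `ifelse()` returns a result shaped like its test, and `mean(x) > 0.9` has length 1. A plain `if`/`else`, as used above, returns the whole chosen branch. Here is a minimal sketch of the difference; `y` and `m` are made-up illustration values, not objects from the question.

```r
# Illustration only: `y` stands in for one column of one group.
y <- c(0.3, 0.9, 0.2)
m <- mean(y)  # a single number, so the test below has length 1

# ifelse() is vectorised over the test; with a length-1 test it
# returns a length-1 result, i.e. only the first element of `y`.
res_ifelse <- ifelse(m > 0.9, diff(y), y)

# if/else evaluates the condition once and returns the full branch.
# c(NA, diff(y)) pads the differences back to the original length.
res_if <- if (m > 0.9) c(NA, diff(y)) else y

print(res_ifelse)
print(res_if)
```

Note also that in the question's nested loop `l[[j]]` is overwritten on every pass over `i`, so only the last element of `my_list` ends up being used for each group.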
CodePudding user response:
Here's one approach:
l <- vector("list", length(list_split))
fun <- function(x, y) ifelse(x > 0.9, y - x, y)
for (j in seq_along(list_split)) {
  # start from a copy of the group so column names and row names are kept
  df2 <- list_split[[j]]
  for (i in seq_along(list_split[[j]])) {
    for (h in seq_along(list_split[[j]][[i]])) {
      # element 4 of a summary() result is the mean
      df2[[i]][[h]] <- fun(my_list[[j]][[1]][[i]][[4]], list_split[[j]][[i]][[h]])
    }
  }
  l[[j]] <- df2
}
names(l) <- names(list_split)
l
This gives:
$Ben
  var1 var2 var3
4  0.3  0.5  0.5
5  0.9  0.4  0.5
6  0.2  0.2  0.2

$Mary
  var1 var2 var3
7  0.4  0.1  0.5
8  0.6  0.4  0.5
9  0.7  0.2  0.2

$Peter
  var1 var2 var3
1  0.4  0.5  0.2
2  0.6  0.4  0.6
3  0.7  0.2  0.9
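As a side note, the nested loops can probably be avoided. The sketch below is only one possible variant, not the method from either answer: it recomputes the mean directly instead of reading it from `my_list`, and `transform_group` and its `threshold` argument are hypothetical names introduced here. It also shows that the mean in a `summary()` result can be looked up by name rather than by position 4.

```r
# Rebuild the question's example data so the sketch runs on its own.
Name <- c('Peter', 'Peter', 'Peter', 'Ben', 'Ben', 'Ben', 'Mary', 'Mary', 'Mary')
df <- data.frame(
  Name,
  var1 = c(0.4, 0.6, 0.7, 0.3, 0.9, 0.2, 0.4, 0.6, 0.7),
  var2 = c(0.5, 0.4, 0.2, 0.5, 0.4, 0.2, 0.1, 0.4, 0.2),
  var3 = c(0.2, 0.6, 0.9, 0.5, 0.5, 0.2, 0.5, 0.5, 0.2)
)
list_split <- split(df[, 2:4], df$Name)

# summary() on a numeric vector is named, so the mean can be
# extracted by name instead of relying on it being element 4.
s <- summary(list_split[["Ben"]]$var1)
mean_by_name <- s[["Mean"]]
mean_by_pos <- s[[4]]

# Hypothetical helper: applies the question's placeholder rule to
# every column of one data frame. Assigning into d[] keeps the
# data frame structure, column names and row names.
transform_group <- function(d, threshold = 0.9) {
  d[] <- lapply(d, function(col) {
    if (mean(col) > threshold) c(NA, diff(col)) else col
  })
  d
}

# One data frame per group, with the same names as list_split.
result <- lapply(list_split, transform_group)
result
```

With the example data no column mean exceeds 0.9, so `result` should match `list_split`; lowering `threshold` makes the differencing branch visible.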