I am quite new to R and have run into a problem. I would like to repeat the following code for multiple columns, ideally specified by name, but I can't manage to put it in a loop.
bplist <- df %>%
  group_by(Date, X, Y, Z) %>%
  summarize(bp = quantile(THIS, probs = (1 - var)))
Neither bplist$THIS, nor df$THIS, nor either of them indexed with [x] works.
Later, I would need it to loop through the columns again with the following idea:
XY <- df %>%
  left_join(bp, by = c("A", "B", "C")) %>%
  mutate(CDE = case_when(
    THIS >= bp1 ~ "High",
    THIS <= bp2 ~ "Low",
But I can't seem to address the columns through a variable holding their name (since quantile() only accepts a numeric vector in place of "THIS"), nor by creating a vector, as this tells me that:
"Can't subset columns past the end.
Locations [various numbers] don't exist.
There is only one Column"
Is there any chance to automate this, preferably using a for loop, as I am familiar with those? I don't think the looping itself will be the problem, but rather addressing the columns.
Sincere thanks to anyone taking their time to read and think about this. If I left anything unclear, which I probably have as my knowledge of R is very basic, please ask.
CodePudding user response:
If you need a loop, then I would make a summarizing function and then loop through the variables. I assume you want to use a character vector for the names, so this requires a little magic with dplyr, using !! together with sym():
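library(tidyverse)

# Summarize one column, given its name as a string.
# Assumes `df` and `var` already exist in the calling environment.
sum_fun <- function(THIS) {
  df %>%
    group_by(Date, X, Y, Z) %>%
    # sym() turns the string into a symbol; !! unquotes it inside summarize()
    summarize(bp = quantile(!!sym(THIS), probs = 1 - var))
}

vars <- c("A", "B", "C")
bplist <- list()
for (i in seq_along(vars)) {
  bplist[[i]] <- sum_fun(vars[[i]])
}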
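For the second step in the question (joining the breakpoints back and labelling each row), the same loop idea can probably be extended. Below is a minimal sketch, not a definitive solution: it uses the `.data[[col]]` pronoun as an alternative to `!!sym()`, and the toy data, the single grouping column, the helper name `classify_col`, the column names `bp_high`, `bp_low` and `level`, and the "Mid" fallback are all assumptions made up for illustration.

```r
library(dplyr)

# Toy data; column names and values are invented for this example.
df <- tibble(
  Date = rep(c("2020-01", "2020-02"), each = 4),
  A = c(1, 2, 3, 4, 10, 20, 30, 40),
  B = c(8, 6, 4, 2, 5, 5, 5, 5)
)
var <- 0.25          # hypothetical tail probability
vars <- c("A", "B")  # columns to process, given by name

# Hypothetical helper: compute per-group breakpoints for the column
# named in `col`, join them back, and label every row.
classify_col <- function(data, col, p) {
  bp <- data %>%
    group_by(Date) %>%
    summarize(
      bp_high = quantile(.data[[col]], probs = 1 - p),
      bp_low = quantile(.data[[col]], probs = p),
      .groups = "drop"
    )
  data %>%
    left_join(bp, by = "Date") %>%
    mutate(level = case_when(
      .data[[col]] >= bp_high ~ "High",
      .data[[col]] <= bp_low ~ "Low",
      TRUE ~ "Mid"  # assumed fallback for everything in between
    ))
}

# Plain for loop over the column names; results are stored by name.
results <- list()
for (v in vars) {
  results[[v]] <- classify_col(df, v, var)
}
```

Storing the results by name means you can later pull out one column's table with `results[["A"]]` instead of remembering its position in the list.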