I have a data frame with numerous numeric columns that I want to analyze one by one, so I am doing this in a for loop rather than copying and pasting the same code for each variable. As an example, one of those columns is titled "weight_dry", with values in the 1400 to 1500 range (over 180k values in that range). My question is: how do I refer to this column through a variable so that its values are recognized and used? For example, the first two lines of code below do not filter on the values of "weight_dry", whereas the second code section, which uses the column name explicitly, does. I figure there must be an easy way to do this, but I can't find a good reference.
This doesn't work, even though I have confirmed that, for the first iteration of j, test_par[j] is "weight_dry":
all_data2 <- all_data2 %>%
filter(test_par[j] > 0)
Whereas this works just fine:
all_data2 <- all_data2 %>%
filter(weight_dry > 0)
I tried to use eval(), but it hasn't worked for me so far (maybe I didn't use it correctly).
CodePudding user response:
We can just use the .data pronoun, which is the recommended way to deal with this kind of issue:
all_data2 <- all_data2 %>%
filter(.data[[test_par[j]]] > 0)
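To make this concrete, here is a minimal sketch of the whole loop. The toy data, the second column weight_wet, and the `filtered` variable are assumptions made up for illustration; only all_data2, test_par, and weight_dry come from the question. It also shows why the original attempt did not filter anything: test_par[j] > 0 compares the string itself with 0, so the condition never looks at the column.

```r
library(dplyr)

# Hypothetical toy data standing in for the real all_data2
all_data2 <- tibble(
  weight_dry = c(1450, -1, 1480, 0),
  weight_wet = c(1600, 1610, -5, 1620)
)
test_par <- c("weight_dry", "weight_wet")

# This compares the character string with 0 (R coerces 0 to "0" first),
# giving a single logical value that has nothing to do with the column
test_par[1] > 0

for (j in seq_along(test_par)) {
  # .data[[name]] looks up the column whose name is stored in test_par[j]
  filtered <- all_data2 %>%
    filter(.data[[test_par[j]]] > 0)
  cat(test_par[j], ":", nrow(filtered), "rows kept\n")
}
```

Storing the result in a separate object inside the loop, rather than overwriting all_data2, keeps each column's filter independent of the previous ones; overwrite all_data2 instead if the filters are meant to accumulate.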
Another option is across() with all_of():
all_data2 <- all_data2 %>%
filter(across(all_of(test_par[j]), ~ . > 0))
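As far as I know, newer dplyr releases (1.0.4 and later) provide if_all() and if_any() for exactly this use inside filter(), and steer users away from across() there, so it is worth checking the documentation for your installed version. A minimal sketch, with toy data and a weight_wet column invented for illustration:

```r
library(dplyr)

# Hypothetical toy data standing in for the real all_data2
all_data2 <- tibble(
  weight_dry = c(1450, -1, 1480),
  weight_wet = c(1600, 1610, -5)
)
test_par <- c("weight_dry", "weight_wet")
j <- 1

# Keep rows where the single column named by test_par[j] is positive
one_col <- all_data2 %>%
  filter(if_all(all_of(test_par[j]), ~ .x > 0))

# Keep rows where every column named in test_par is positive
all_cols <- all_data2 %>%
  filter(if_all(all_of(test_par), ~ .x > 0))
```

The second call may remove the need for a loop altogether if the goal is to apply the same condition to several columns at once.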
CodePudding user response:
If test_par is a character vector of variable names, we have to convert to symbol before evaluating.
We can use rlang to convert the name to a symbol with sym() and evaluate it with !!:
library(dplyr)
library(rlang)
all_data2 <- all_data2 %>%
filter(!!sym(test_par[j]) > 0)
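If the same filter is needed in several places, the pattern can be wrapped in a small function. This is only a sketch: keep_positive(), the toy data, and the weight_wet column are hypothetical names introduced here, not part of dplyr or rlang.

```r
library(dplyr)
library(rlang)

# Hypothetical helper: keep rows where the column named col_name is positive
keep_positive <- function(df, col_name) {
  df %>% filter(!!sym(col_name) > 0)
}

# Hypothetical toy data standing in for the real all_data2
all_data2 <- tibble(
  weight_dry = c(1450, -1, 1480),
  weight_wet = c(1600, 1610, -5)
)
test_par <- c("weight_dry", "weight_wet")

# One filtered data frame per column name, stored in a named list
results <- list()
for (j in seq_along(test_par)) {
  results[[test_par[j]]] <- keep_positive(all_data2, test_par[j])
}
```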