I was wondering what the code below does.
data=mydata[,names(mydata) %in% variables$variable]
CodePudding user response:
It selects the columns of 'mydata' whose names also appear in the 'variable' column of 'variables'. It can also be written as
mydata[intersect(names(mydata), variables$variable)]
Or with dplyr (here the selected columns follow the order of variables$variable rather than the order in mydata)
library(dplyr)
mydata %>%
  select(any_of(variables$variable))
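One caveat with the base R forms, sketched below on made-up data (the column names and values are hypothetical, not taken from the question, and 'mydata' is assumed to be a plain data.frame rather than a tibble or data.table): they agree when several columns match, but when only one column matches, the comma form drops to a bare vector unless drop = FALSE is given.

```r
# Hypothetical toy data, for illustration only.
mydata <- data.frame(id = 1:3, age = c(30, 41, 25), height = c(170, 182, 165))
variables <- data.frame(variable = c("height", "id", "weight"),
                        stringsAsFactors = FALSE)

# Two matching columns: both forms give the same data frame.
by_mask <- mydata[, names(mydata) %in% variables$variable]
by_name <- mydata[intersect(names(mydata), variables$variable)]
same_result <- isTRUE(all.equal(by_mask, by_name))

# Only one matching column: the comma form simplifies to a vector.
one <- "id"
dropped  <- mydata[, names(mydata) %in% one]
kept_df  <- mydata[intersect(names(mydata), one)]
kept_df2 <- mydata[, names(mydata) %in% one, drop = FALSE]

print(same_result)
print(is.data.frame(dropped))
print(is.data.frame(kept_df))
print(is.data.frame(kept_df2))
```

If the result has to stay a data frame no matter how many columns match, the intersect() form or drop = FALSE is probably the safer choice.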
CodePudding user response:
names(mydata) %in% variables$variable
returns a logical vector with one TRUE/FALSE value per column of "mydata", indicating whether that column's name appears in the vector variables$variable.
Hence, with data=mydata[,names(mydata) %in% variables$variable]
"data" will have all observations (rows) of "mydata", but only the columns whose names appear in variables$variable. Names in variables$variable that do not exist in "mydata" are simply ignored.
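To see the intermediate step, here is a small sketch on invented data (the columns id, age, height and the extra name "weight" are assumptions made for the example only):

```r
# Hypothetical toy data, for illustration only.
mydata <- data.frame(id = 1:3, age = c(30, 41, 25), height = c(170, 182, 165))
variables <- data.frame(variable = c("height", "id", "weight"),
                        stringsAsFactors = FALSE)

# One logical value per column of mydata, in mydata's column order.
keep <- names(mydata) %in% variables$variable
print(keep)

# Empty row index = keep every row; the mask picks the columns.
data <- mydata[, keep]
print(names(data))
print(dim(data))
```

Note that "weight" is listed in variables$variable but is not a column of mydata, so it has no effect on the result.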