Creating a vector in R including characters and operators

I would like to store a set of values in a vector in R so that I write less code and reduce the chance of mistakes. I want to exclude the observations "obs 1", "obs 4", "obs 9", etc. using the subset() function. Rather than spelling out every condition inside subset(), I would like to define the excluded observations once and pass that to subset().

Example - how I would like it to be:

excluded <- column1!= "obs 1" &  column1!= "obs 4" & column1!= "obs 9"
dataframe <- subset(dataframe, excluded)

Example - What works and what I want to avoid

dataframe <- subset(dataframe, column1!= "obs 1" &  column1!= "obs 4" & column1!= "obs 9")

I have tried c(), list(), a combination of the two, and column1 <- "column1".

Thank you in advance!

Update with data set example.

set.seed(42) 
n <- 12
dataframe <- data.frame(column1 = as.character(factor(paste("obs", 1:n))), rand = rep(LETTERS[1:2], n / 2), x = rnorm(n))
dataframe

# output - first 5 rows:

   column1  rand      x
1    obs 1    A  1.37096
2    obs 2    B -0.56470
3    obs 3    A  0.36313
4    obs 4    B  0.63286
5    obs 5    A  0.40427

CodePudding user response：

# load package
library(data.table)

# set as datatable
setDT(dataframe)

# put exclusion criteria into vector
y <- c("obs 1", "obs 4", "obs 9")

# subset
dataframe[!column1 %in% y]
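
If adding a package is not an option, the same idea can be sketched in base R with bracket indexing. This assumes the example data frame from the question; the name `kept` is only illustrative and not part of the original answer.

```r
# Rebuild the example data from the question (assumption: same seed and n)
set.seed(42)
n <- 12
dataframe <- data.frame(
  column1 = paste("obs", 1:n),
  rand = rep(LETTERS[1:2], n / 2),
  x = rnorm(n),
  stringsAsFactors = FALSE
)

# Values to exclude, stored once in a character vector
y <- c("obs 1", "obs 4", "obs 9")

# Row index is a logical vector; the empty column index keeps all columns
kept <- dataframe[!(dataframe$column1 %in% y), ]
nrow(kept)
```

Unlike the data.table form, the column has to be written as dataframe$column1 here, because base bracket indexing does not look up column names inside the data frame.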

CodePudding user response：

Create a logical vector and then use it to subset. The %in% operator avoids repeating the comparison for each value.

excluded <- !(dataframe$column1 %in% c("obs 1", "obs 4", "obs 9"))
# [1] FALSE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE

subset(dataframe, excluded)
#    column1 rand           x
# 2    obs 2    B -0.56469817
# 3    obs 3    A  0.36312841
# 5    obs 5    A  0.40426832
# 6    obs 6    B -0.10612452
# 7    obs 7    A  1.51152200
# 8    obs 8    B -0.09465904
# 10  obs 10    B -0.06271410
# 11  obs 11    A  1.30486965
# 12  obs 12    B  2.28664539
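
The attempt in the question most likely fails because column1 is looked up in the global environment when excluded is created, whereas inside subset() it is looked up in the data frame. One way around this is to store only the values in a vector and keep the condition inside subset(). The sketch below assumes the example data from the question; `drop_obs` and `result` are illustrative names.

```r
# Rebuild the example data from the question (assumption: same seed and n)
set.seed(42)
n <- 12
dataframe <- data.frame(
  column1 = paste("obs", 1:n),
  rand = rep(LETTERS[1:2], n / 2),
  x = rnorm(n),
  stringsAsFactors = FALSE
)

# Only the values to exclude live in the vector, not the whole condition
drop_obs <- c("obs 1", "obs 4", "obs 9")

# column1 is resolved inside the data frame;
# drop_obs is found in the calling environment
result <- subset(dataframe, !(column1 %in% drop_obs))
result$column1
```

Adding or removing an observation then only means editing drop_obs. This relies on the data frame having no column that is itself called drop_obs, since a column with that name would take precedence.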