Eliminating partially overlapping parts of 2 vectors in R

Is it possible to drop the parts of the elements in the character vector n1 that partially overlap with the variables in the formula f1?

For example, in n1, "study_typecompare" and "study_typecontrol" partially overlap with study_type in f1.

In desired_output, I therefore want to drop the "study_type" part of those two elements. The other elements of n1 (e.g. time_wk_whn) fully match a variable in f1, so they should stay unchanged.

Is it possible to obtain my desired_output in base R or the tidyverse?

f1 <- gi ~ 0 + study_type + time_wk_whn + time_wk_btw + items_whn + 
  items_btw + training_hr_whn + training_hr_btw

n1 <- c("study_typecompare","study_typecontrol","time_wk_whn",      
        "time_wk_btw","items_whn","items_btw","training_hr_whn",
        "training_hr_btw") 

desired_output <- c("compare","control", "time_wk_whn",      
                    "time_wk_btw","items_whn","items_btw",        
                    "training_hr_whn","training_hr_btw")

CodePudding user response：

We can write a function that takes the formula and the vector ('fmla' and 'vec'). It extracts the variable names from 'fmla' (all.vars), finds the elements of 'vec' that are not among those variables (setdiff), builds a regex pattern by pasting the variable names together with "|", removes the first match of that pattern from those elements with sub, and returns the updated vector.

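fun1 <- function(fmla, vec) {

    # variable names used in the formula (includes the response, here "gi")
    v1 <- all.vars(fmla)
    # elements of 'vec' that are not exact matches of a formula variable
    v2 <- setdiff(vec, v1)
    # remove the first occurrence of any formula variable from those elements
    v3 <- sub(paste(v1, collapse = "|"), "", v2)
    # write the shortened names back; assumes the partial matches in 'vec' are unique
    vec[vec %in% v2] <- v3
    vec

}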

Checking:

> identical(fun1(f1, n1), desired_output)
[1] TRUE
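
One thing to be aware of: the pattern passed to sub above is not anchored, so it removes the first occurrence of a variable name anywhere in an element, not only at the start. If only a leading match should be dropped, a base R variant along the following lines might be safer. This is a minimal sketch, not part of the original answer; the name strip_prefix is hypothetical, and it assumes the overlap is always a prefix, as in the example data.

```r
# Hypothetical variant of fun1: strip a formula variable only when it is a
# prefix of an element, and leave exact matches untouched.
strip_prefix <- function(fmla, vec) {
  vars <- all.vars(fmla)
  # try longer names first, so a longer variable wins over a shorter one
  # that happens to be a prefix of it
  vars <- vars[order(nchar(vars), decreasing = TRUE)]
  out  <- vec
  todo <- !(vec %in% vars)            # exact matches are never modified
  for (v in vars) {
    hit <- todo & startsWith(vec, v)  # elements that begin with this variable
    out[hit] <- substring(vec[hit], nchar(v) + 1L)
    todo <- todo & !hit               # strip at most one prefix per element
  }
  out
}

f1 <- gi ~ 0 + study_type + time_wk_whn + time_wk_btw + items_whn +
  items_btw + training_hr_whn + training_hr_btw

n1 <- c("study_typecompare", "study_typecontrol", "time_wk_whn",
        "time_wk_btw", "items_whn", "items_btw", "training_hr_whn",
        "training_hr_btw")

strip_prefix(f1, n1)
```

On the example data this should give the same result as fun1. The two differ only when a variable name (for instance the short response name "gi") appears in the middle of an element: fun1 would cut it out, while this sketch leaves such an element as it is.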