Follow-up: Eliminating partially overlapping parts of 2 vectors in R

Time:12-25

I want to drop the parts of the elements in the character vector n1 that partially (not fully) overlap with the variables in the formula f1.

For example, in n1, we see "study_typecompare" & "study_typecontrol" partially overlap with study_type in f1.

Thus, in the desired_output, we want to drop the "study_type" part of them. Because the other elements in n1 either fully overlap with an element in f1 (e.g. factor(v_gi)) or don't exist in f1 (e.g. intrcpt), we leave them unchanged.

Is obtaining my desired_output (below) possible in base R or the tidyverse?

I have tried the following, but it erroneously drops v_gi from within factor(v_gi):

f1 <- gi ~ factor(v_gi) + study_type
n1 <- c("intrcpt", "factor(v_gi)", "study_typecompare", "study_typecontrol")


fun1 <- function(fmla, vec) {
  
  v1 <- all.vars(fmla)
  v2 <- setdiff(vec, v1)
  v3 <- sub(paste(v1, collapse = "|"), "", v2)
  vec[vec %in% v2] <- v3
  vec
}
# EXAMPLE OF USE:
fun1(f1, n1)
# Current Output:
[1] "intrcpt"  "factor()" "compare"  "control"  ## Notice `factor()` has erroneously lost its `v_gi`

desired_output = c("intrcpt","factor(v_gi)","compare","control") 

CodePudding user response:

After the setdiff line, paste a ^ as a prefix onto each variable name, so that sub only matches at the beginning of each string.
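To see why the anchor matters, here is a small check on single strings. The pattern variables `unanchored` and `anchored` are names made up for this illustration; they are assumed to mirror what `paste(v1, collapse = "|")` produces for the formula in the question, without and with the `^` prefix.

```r
# Variable names as returned by all.vars() for the question's formula
v1 <- c("gi", "v_gi", "study_type")

# Pattern without anchors: can match anywhere inside a string
unanchored <- paste(v1, collapse = "|")
# Pattern with anchors: each alternative must match at the start
anchored <- paste(paste0("^", v1), collapse = "|")

sub(unanchored, "", "factor(v_gi)")      # "v_gi" is found mid-string and removed
sub(anchored, "", "factor(v_gi)")        # no match at the start, string unchanged
sub(anchored, "", "study_typecompare")   # leading "study_type" is removed
```

Each alternative needs its own `^`, because in `^a|b` the anchor applies only to the first alternative.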

fun1 <- function(fmla, vec) {
  v1 <- all.vars(fmla)
  v2 <- setdiff(vec, v1)
  v1 <- paste0('^', v1)
  v3 <- sub(paste(v1, collapse = "|"), "", v2)
  vec[vec %in% v2] <- v3
  vec
}

fun1(f1, n1)
# [1] "intrcpt"      "factor(v_gi)" "compare"      "control"    
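
If the variable names could contain regex metacharacters (for example a dot), a fixed-string variant may be safer. The following is only a sketch, not part of the answer above: `strip_prefix` is a hypothetical helper name, and it assumes that at most one leading variable name should be removed per element, which is also what a single `sub` call does.

```r
# Hypothetical helper: remove a leading variable name using fixed-string
# matching (startsWith) instead of a regular expression.
strip_prefix <- function(fmla, vec) {
  out <- vec
  done <- rep(FALSE, length(vec))   # elements that were already stripped once
  for (v in all.vars(fmla)) {
    # Elements that start with v but are not exactly v (partial overlap only)
    hit <- !done & startsWith(vec, v) & vec != v
    out[hit] <- substring(vec[hit], nchar(v) + 1L)
    done <- done | hit
  }
  out
}

f1 <- gi ~ factor(v_gi) + study_type
n1 <- c("intrcpt", "factor(v_gi)", "study_typecompare", "study_typecontrol")
strip_prefix(f1, n1)
```

`startsWith` is available in base R from version 3.3.0 onward.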