Dropping partially overlapping parts of 2 vectors in R

Is it possible to drop the parts of the elements in the character vector n1 that partially overlap with variables in the formula f1?

For example, in n1, "chyes" and "bmi:chyes" partially overlap with ch in f1.

Thus, in desired_output, we want to drop the "ch" part of those elements. Because the other elements in n1 either fully match a variable in f1 (e.g. bmi) or don't appear in f1 at all (e.g. intrcpt), we leave them unchanged.

I have tried the following solution, but can't get my desired output.

Is obtaining my desired_output possible in base R or the tidyverse?

f1 <- yi~ bmi*ch

n1 <- c("intrcpt","bmi","chyes","bmi:chyes")

desired_output <- c("intrcpt","bmi","yes","bmi:yes")

### Current unsuccessful solution:
foo <- function(fmla, vec) {
  
  v1 <- all.vars(fmla)
  v2 <- setdiff(vec, v1)
  v1 <- paste0('^', v1)
  v3 <- sub(paste(v1, collapse = "|"), "", v2)
  vec[vec %in% v2] <- v3
  vec 
}
### EXAMPLE OF USE:
foo(f1, n1)
# "intrcpt"   "bmi"       "chyes"     "bmi:chyes"

CodePudding user response：

This function does what you want, but I agree with @Onyambu that it is worth considering whether your underlying problem actually necessitates string manipulation.

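f <- function(fm, nm) {
  # Variable names as written in the formula; [-1L] drops the leading "list"
  vars <- vapply(attr(terms(fm), "variables"), deparse, "")[-1L]
  # Escape parentheses so names such as factor(ch) are matched literally
  subpat <- paste0(gsub("([()])", "\\\\\\1", vars), collapse = "|")
  # Split on ":" and strip a variable-name prefix only when text follows it
  l <- rapply(strsplit(nm, ":"), sub, how = "list",
              pattern = sprintf("^(%s)(.+)$", subpat), replacement = "\\2")
  # Rejoin the components of each interaction term
  vapply(l, paste0, "", collapse = ":")
}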

fm1 <- yi ~ bmi * ch
nm1 <- c("intrcpt", "bmi", "chyes", "bmi:chyes")
f(fm1, nm1)

[1] "intrcpt" "bmi"     "yes"     "bmi:yes"

fm2 <- yi ~ bmi * factor(ch)
nm2 <- c("intrcpt", "bmi", "factor(ch)yes", "bmi:factor(ch)yes")
f(fm2, nm2)

[1] "intrcpt" "bmi"     "yes"     "bmi:yes"
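
To see what each step contributes, here is a rough step-by-step sketch of the same logic applied to fm2. It swaps the rapply call for an explicit lapply purely for readability; the names parts, pat, stripped, and out are illustrative and not part of the function above.

```r
fm2 <- yi ~ bmi * factor(ch)
nm2 <- c("intrcpt", "bmi", "factor(ch)yes", "bmi:factor(ch)yes")

# 1. Variable names as they appear in the formula; [-1L] drops the leading "list".
vars <- vapply(attr(terms(fm2), "variables"), deparse, "")[-1L]

# 2. Escape parentheses so "factor(ch)" is matched literally, then join with "|".
subpat <- paste0(gsub("([()])", "\\\\\\1", vars), collapse = "|")

# 3. Split interaction terms on ":" so each component is handled on its own.
parts <- strsplit(nm2, ":")

# 4. Strip a variable-name prefix only when at least one character follows it.
pat <- sprintf("^(%s)(.+)$", subpat)
stripped <- lapply(parts, function(p) sub(pat, "\\2", p))

# 5. Rejoin the components of each term with ":".
out <- vapply(stripped, paste0, "", collapse = ":")
print(out)
```

The (.+) group in step 4 is what leaves "bmi" alone: a name that matches a variable exactly has nothing left over for (.+) to capture, so the pattern does not match and sub returns the string unchanged.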