I want to drop the parts of the elements in the n1 character vector that partially (not fully) overlap with variables in the f1 formula.
For example, in n1 we see that "study_typecompare" and "study_typecontrol" partially overlap with study_type in f1. Thus, in the desired_output, we want to drop the "study_type" part of them. Because the other elements of n1 either fully overlap with an element in f1 (e.g. factor(v_gi)) or don't exist in f1 (e.g. intrcpt), we leave them unchanged.
Is obtaining my desired_output (below) possible in base R or the tidyverse?
I have tried the following, but it erroneously drops v_gi from within factor(v_gi):
f1 <- gi ~ factor(v_gi) + study_type
n1 <- c("intrcpt","factor(v_gi)","study_typecompare","study_typecontrol")
fun1 <- function(fmla, vec) {
  v1 <- all.vars(fmla)
  v2 <- setdiff(vec, v1)
  v3 <- sub(paste(v1, collapse = "|"), "", v2)
  vec[vec %in% v2] <- v3
  vec
}
# EXAMPLE OF USE:
fun1(f1, n1)
# Current output:
# [1] "intrcpt" "factor()" "compare" "control"   ## Notice `factor()` has erroneously lost its `v_gi`
desired_output = c("intrcpt","factor(v_gi)","compare","control")
CodePudding user response:
After the setdiff line, paste a ^ as a prefix onto each variable name, so that the pattern passed to sub only matches at the beginning of the strings.
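fun1 <- function(fmla, vec) {
  v1 <- all.vars(fmla)
  # elements that are not exactly equal to a variable name
  v2 <- setdiff(vec, v1)
  # anchor each variable name to the start of the string
  v1 <- paste0('^', v1)
  v3 <- sub(paste(v1, collapse = "|"), "", v2)
  vec[vec %in% v2] <- v3
  vec
}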
fun1(f1, n1)
# [1] "intrcpt" "factor(v_gi)" "compare" "control"
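One caveat with the regex approach: variable names are used as patterns, so a name containing a regex metacharacter (for example the dot in a name like study.type) could match more than intended. A regex-free variant is sketched below as an alternative, not as part of the answer above. The helper name strip_var_prefix is hypothetical; it only relies on base R's all.vars, startsWith and substring, and assumes that only a leading variable name should be removed.

```r
# Hypothetical helper (not from the original answer): strip a leading
# variable name from each element without using regular expressions.
strip_var_prefix <- function(fmla, vec) {
  vars <- all.vars(fmla)
  # Try longer names first, so a longer variable name wins over a
  # shorter one that is a prefix of it
  vars <- vars[order(nchar(vars), decreasing = TRUE)]
  for (i in seq_along(vec)) {
    # Full overlap with a variable name: leave the element unchanged
    if (vec[i] %in% vars) next
    for (v in vars) {
      if (startsWith(vec[i], v)) {
        # Keep only what follows the variable name
        vec[i] <- substring(vec[i], nchar(v) + 1L)
        break
      }
    }
  }
  vec
}

f1 <- gi ~ factor(v_gi) + study_type
n1 <- c("intrcpt", "factor(v_gi)", "study_typecompare", "study_typecontrol")
strip_var_prefix(f1, n1)
# [1] "intrcpt" "factor(v_gi)" "compare" "control"
```

Because startsWith compares literal strings, no escaping of the variable names is needed, at the cost of an explicit loop instead of a single vectorised sub call.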