I'm trying to write a function that takes a numeric vector as input, and returns the indexes of a shorter version of the input vector, according to some rules:
(a) if all elements are identical, return just the index of the first element; i.e., return
1
; else:if NOT all elements identical, then test for whether
special_treatment_value
is among them:- (b) if
special_treatment_value
is there, return the input vector's indexes except for the indexes of elements wherespecial_treatment_value
appeared; else: - (c) if
special_treatment_value
is not there, return the indexes of the input vector as-is, i.e.,1:length(x)
.
- (b) if
The problem: if we ended up in route (b), we might encounter a situation in which all vector elements are now the same. In such case, we would like to iterate through (a) again to minimize to just the first element.
Example
Let's say that I want to pass the following vectors through my function:
my_vec_1 <- c(1, 2, 1, 2, 3)
my_vec_2 <- c(4, 4, 4)
my_vec_3 <- c(1, 2, 1, 4, 1)
my_vec_4 <- c(3, 3, 3, 4)
and that:
special_treatment_value <- 4
According to my rules, the function should return the outputs:
- for
my_vec_1
: it fits route (c) and thus the output should be1:5
(indexes of all) - for
my_vec_2
: it fits route (a) and thus the output should be1
(index of first) - for
my_vec_3
: it fits route (b). output should be1 2 3 5
(indexes of all except for special value's) my_vec_4
demonstrates the problem. My desired output is1
because first we go through route (b) then I want to pass through (a). But right now it doesn't happen and my function (see below) returns1 2 3
(indexes of all except for special value's).
my current attempt
get_indexes <- function(x, special_val) {
if (var(x) == 0) { # route (a)
output_idx <- 1
return(output_idx)
}
idx_entire_length <- 1:length(x)
if (any(x == special_val)) { # route (b)
idx_to_remove <- which(x == special_val)
output_idx <- idx_entire_length[-idx_to_remove]
return(output_idx)
}
# else
output_idx <- idx_entire_length # route (c)
return(output_idx)
}
get_indexes(my_vec_1, 4)
#> [1] 1 2 3 4 5
get_indexes(my_vec_2, 4)
#> [1] 1
get_indexes(my_vec_3, 4)
#> [1] 1 2 3 5
get_indexes(my_vec_4, 4)
#> [1] 1 2 3
I guess there should be some repeat
block or while
loop, but I can't figure out how to implement it correctly (and efficiently).
CodePudding user response:
You could repeat the condition for going through (a) inside condition (b), for example:
f <- function(x, treatment){
if(var(x) == 0) 1 else {
if(treatment %in% x) {
x[-which(x == treatment)] |>
(\(.) if(var(.) == 0) 1 else (1:length(x))[-which(x == treatment)])()
} else {
1:length(x)
}
}
}
lapply(list(v1, v2, v3, v4), f, 4)
[[1]]
[1] 1 2 3 4 5
[[2]]
[1] 1
[[3]]
[1] 1 2 3 5
[[4]]
[1] 1
CodePudding user response:
You can try
foo <- function(x, y){
tmp <- which(x != 4)
if(dplyr::n_distinct(x[x!=4])<=1){
tmp <- 1
}
return(tmp)
}
Instead n_distinct()
you can use length(unique())
Result:
lapply(list(my_vec_1, my_vec_2, my_vec_3, my_vec_4), foo, 4)
[[1]]
[1] 1 2 3 4 5
[[2]]
[1] 1
[[3]]
[1] 1 2 3 5
[[4]]
[1] 1