How to write a function that returns indexes of input vector according to some rules, using repeat,-CodePudding

I'm trying to write a function that takes a numeric vector as input, and returns the indexes of a shorter version of the input vector, according to some rules:

(a) if all elements are identical, return just the index of the first element; i.e., return 1; else:

if NOT all elements identical, then test for whether special_treatment_value is among them:
- (b) if special_treatment_value is there, return the input vector's indexes except for the indexes of elements where special_treatment_value appeared; else:
- (c) if special_treatment_value is not there, return the indexes of the input vector as-is, i.e., 1:length(x).

The problem: if we ended up in route (b), we might encounter a situation in which all vector elements are now the same. In such case, we would like to iterate through (a) again to minimize to just the first element.

Example

Let's say that I want to pass the following vectors through my function:

my_vec_1 <- c(1, 2, 1, 2, 3)
my_vec_2 <- c(4, 4, 4)
my_vec_3 <- c(1, 2, 1, 4, 1)
my_vec_4 <- c(3, 3, 3, 4)

and that:

special_treatment_value <- 4

According to my rules, the function should return the outputs:

for my_vec_1: it fits route (c) and thus the output should be 1:5 (indexes of all)
for my_vec_2: it fits route (a) and thus the output should be 1 (index of first)
for my_vec_3: it fits route (b). output should be 1 2 3 5 (indexes of all except for special value's)
my_vec_4 demonstrates the problem. My desired output is 1 because first we go through route (b) then I want to pass through (a). But right now it doesn't happen and my function (see below) returns 1 2 3 (indexes of all except for special value's).

my current attempt

get_indexes <- function(x, special_val) {
  if (var(x) == 0) { # route (a)
    output_idx <- 1
    return(output_idx)
  }
  
  idx_entire_length <- 1:length(x)
  
  if (any(x == special_val)) { # route (b)
    idx_to_remove <- which(x == special_val)
    output_idx    <- idx_entire_length[-idx_to_remove]
    return(output_idx)
  }
  
  # else
  output_idx <- idx_entire_length # route (c)
  return(output_idx)
}

get_indexes(my_vec_1, 4)
#> [1] 1 2 3 4 5
get_indexes(my_vec_2, 4)
#> [1] 1
get_indexes(my_vec_3, 4)
#> [1] 1 2 3 5
get_indexes(my_vec_4, 4)
#> [1] 1 2 3

I guess there should be some repeat block or while loop, but I can't figure out how to implement it correctly (and efficiently).

CodePudding user response：

You could repeat the condition for going through (a) inside condition (b), for example:

f <- function(x, treatment){
  if(var(x) == 0) 1 else {
    if(treatment %in% x) {
      x[-which(x == treatment)] |>
        (\(.) if(var(.) == 0) 1 else (1:length(x))[-which(x == treatment)])()
    } else {
      1:length(x)
    }
  }
}

lapply(list(v1, v2, v3, v4), f, 4)

[[1]]
[1] 1 2 3 4 5

[[2]]
[1] 1

[[3]]
[1] 1 2 3 5

[[4]]
[1] 1

CodePudding user response：

You can try

foo <- function(x, y){
 
  tmp <- which(x != 4) 
  
 if(dplyr::n_distinct(x[x!=4])<=1){
   tmp <- 1
 }
  
 return(tmp) 
}

Instead n_distinct() you can use length(unique())

Result:

lapply(list(my_vec_1, my_vec_2, my_vec_3, my_vec_4), foo, 4)
[[1]]
[1] 1 2 3 4 5

[[2]]
[1] 1

[[3]]
[1] 1 2 3 5

[[4]]
[1] 1