R - Carry forward 1 until another 1 reached-CodePudding

I have a vector of zero and ones v <- c(0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0). What I want is a function func do is to carry forward 1 until the next 1 is reached starting from the left, and then stop. If there is a new 1 after that, then the process should begin repeat. If a subsequent 1 is not found, 1 will be carried forward until the end of the vector. So here:

v is c(0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0)

func(v) should give c(0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0)

Any help appreciated.

CodePudding user response：

This sounds like a run-length encoding problem:

func <- function(z) {
  r <- rle(z)
  ind <- which(r$values == 0)
  r$values[ ind[seq_along(ind) %% 2 == !(r$values[1] == 0)] ] <- 1
  inverse.rle(r)
}
func(vec)
#  [1] 0 1 1 1 1 1 0 0 1 1 1 0

Walk-through:

First, a run-length encoding gives each unique value and the consecutive repeats for it, so
```
rle(v)
# Run Length Encoding
#   lengths: int [1:9] 1 1 3 1 2 1 1 1 1
#   values : num [1:9] 0 1 0 1 0 1 0 1 0
```
Interpretation: the first value is a 0 and it has 1 instance; the second is a 1 (with 1); the third is a 0 with 3 instances; .etc.
From there, knowing that we can call inverse.rle(r) to regenerate the original v, we need to replace alternating 0s with 1: to know which we need to replace depends on the first value: if the first is a 0, then we need to replace the even instances of 0 within r$values.

seq_along(ind) %% 2 determines if a specific ind (indices of 0) is an even or an odd index; !(r$values[1] == 0) gives us the correct odd/even component we want.
Once we know which (evens or odds) need to shift from 0 to 1, we just reassign it, and see
```
r
# Run Length Encoding
#   lengths: int [1:9] 1 1 3 1 2 1 1 1 1
#   values : num [1:9] 0 1 1 1 0 1 1 1 0
```
It may be clear from here that if we invert it back into a vector, we should get one 0, one 1, three 1, one 1, two 0, etc, exactly what we want.

CodePudding user response：

def fill_holes_with_ones(arr): 
    found, idx = False, -1 
     
    for i, el in enumerate(arr): 
        if found and el == 1: 
            for j in range(idx, i): 
                arr[j] = 1 
            found = False 
        elif el == 1: 
            idx = i 
            found = True 
    return arr

The above code is in python but since this is primarily an algorithmic question and python is easy enough to read I think you can translate it with relative ease.

This algorithm runs in O(n) time complexity (maximum 2 pass) and O(1) space.

The idea is to have a flag to indicate if you have already found an opening 1 or not. Also have an index that keeps track of the last found opening 1. If you have already found an opening 1 and see another one, set all elements between the index that you previously captured and the current index to 1 and set the flag to False to invalidate the previous opening. Otherwise, if you find a 1 it only means that you have found an opening 1, set the flag to true and capture the current index for potential later use.

CodePudding user response：

A solution based on converting the numeric vector to string and back to numeric vector:

library(tidyverse)

v <- c(0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0)
z <- paste(v,collapse = "")

z %>% 
  str_extract_all("(10 1)") %>% 
  unlist() %>% 
  reduce(
    function(x,y) str_replace(
      x,
      paste0("(?<!1)",y), 
      paste(rep("1", nchar(y)), collapse = "")),
    .init = z) %>% 
  str_split("") %>% 
  unlist() %>% 
  as.numeric()
#>  [1] 0 1 1 1 1 1 0 1 1 1 0