R: algorithm to replace 0 with the surrounding numbers

I am trying to write an efficient algorithm in R that replaces runs of 0 with the surrounding number when the numbers on both sides are the same. Here is a reproducible example of my data:

library(dplyr)  # provides %>% and mutate()

ID <- c("FR01", "FR02", "FR03", "FR04")
String <- c("0000001000100100100100220002000200020011", "0222000000001000010101110020020002002022", "0000000000001000010101110020020002002022", "2002220002200202010002222222222222222222")
df <- data.frame(ID, String)

# Expected result:
result <- df %>% mutate(String = c("1111111111111111111100222222222222220011", "2222000000001111111111110022222222222222", "1111111111111111111111110022222222222222", "2222222222222222010002222222222222222222"))

Id	String
FR01	0000001000100100100100220002000200020011
FR02	0222000000001000010101110020020002002022
FR03	0000000000001000010101110020020002002022
FR04	2002220002200202010002222222222222222222

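Replacement rules, applied in both directions:

If the adjacent character is 0, look at the next one, so that a run of zeros is treated as one block.
If the numbers on both sides of the run are the same, replace the zeros with that number.
If the numbers on both sides differ, keep the zeros. The exception is a run at the start or end of the string, where only one adjacent number is needed.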

Results needed

Id	String
FR01	1111111111111111111100222222222222220011
FR02	2222000000001111111111110022222222222222
FR03	1111111111111111111111110022222222222222
FR04	2222222222222222010002222222222222222222

Does anyone know how to build an efficient algorithm to transform these strings?

Thank you for your help.

CodePudding user response：

Here is something quick:

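foo = \(x) {
  y  = unlist(strsplit(x, ""))        # one element per character
  ny = length(y)
  z  = gregexpr("0+", x)[[1L]]        # start position of every run of zeros
  # nothing to do without zeros; an all-zero string has no neighbour to copy
  if (z[1L] == -1L || all(y == "0")) return(x)
  for (i in seq_along(z)) {
    ml = attr(z, "match.length")[i]   # length of the i-th run
    # run at the start: copy the number after it; run at the end: copy the
    # number before it; interior run: fill only if both neighbours match
    if      (z[i] == 1L)               y[1L:ml]          = y[ml + 1L]
    else if (z[i] + ml > ny)           y[(ny-ml+1L):ny]  = y[ny-ml]
    else if (y[z[i]-1L] == y[z[i]+ml]) y[z[i]:(z[i]+ml)] = y[z[i]+ml]
  }
  paste(y, collapse = "")
}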
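
To see what the loop works from, here is a small illustration of what gregexpr() returns for a run-of-zeros pattern. The input string is made up for this illustration and is not taken from the question.

```r
x <- "0010002"                    # made-up input: zeros at 1-2 and 4-6
z <- gregexpr("0+", x)[[1L]]      # one match per run of consecutive zeros

starts  <- as.integer(z)          # where each run begins
lengths <- attr(z, "match.length") # how long each run is

starts   # 1 4
lengths  # 2 3
```

Each start/length pair describes one block of zeros, so the characters just before and just after the block are the neighbours that decide whether it gets filled.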

Example

df = data.frame(
  ID     = c("FR01", "FR02", "FR03"),
  String = c(
    "0000001000100100100100220002000200020010", 
    "0222000000001000010101110020020002002022", 
    "0000000000001000010101110020020002002022"
  )
)

df$result = sapply(df$String, foo)

#     ID                                   String                                   result
# 1 FR01 0000001000100100100100220002000200020010 1111111111111111111100222222222222220011
# 2 FR02 0222000000001000010101110020020002002022 2222000000001111111111110022222222222222
# 3 FR03 0000000000001000010101110020020002002022 1111111111111111111111110022222222222222
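
As a cross-check, the same rules can also be sketched with run-length encoding instead of a regex. This is not the approach used above, only an alternative under the same reading of the rules. The name fill_zeros is hypothetical, and the input is the four strings from the question.

```r
# Hypothetical helper (not from the answer above): fill runs of "0" via rle().
fill_zeros <- function(x) {
  r <- rle(strsplit(x, "", fixed = TRUE)[[1L]])  # runs of identical characters
  n <- length(r$values)
  if (n < 2L) return(x)                 # a single run has no neighbour to copy

  left  <- c(NA, r$values[-n])          # value of the run to the left
  right <- c(r$values[-1L], NA)         # value of the run to the right
  zero  <- r$values == "0"

  # interior zero runs are filled only when both neighbours agree
  same <- zero & !is.na(left) & !is.na(right) & left == right
  r$values[same] <- left[same]

  # runs at either end need only their single neighbour
  if (zero[1L]) r$values[1L] <- right[1L]
  if (zero[n])  r$values[n]  <- left[n]

  paste(inverse.rle(r), collapse = "")
}

String <- c(
  "0000001000100100100100220002000200020011",
  "0222000000001000010101110020020002002022",
  "0000000000001000010101110020020002002022",
  "2002220002200202010002222222222222222222"
)
filled <- vapply(String, fill_zeros, character(1L), USE.NAMES = FALSE)
```

Because rle() merges consecutive equal characters, the neighbours of a zero run are always non-zero, which keeps the comparison to one vectorised step per string. On these four inputs, filled should match the "Results needed" table in the question, including the FR04 row that starts with a 2.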