Na fill after a specific value

Time:10-27

I would like to fill the NAs that come after a 1 (with that 1), but keep the NAs that come after a -1. Is there a simple solution for this?

Old New
1 1
NA 1
NA 1
NA 1
-1 -1
NA NA
NA NA
1 1
NA 1
NA 1
#example data
dat <- read.table(text = "
Old New
1   1
NA  1
NA  1
NA  1
-1  -1
NA  NA
NA  NA
1   1
NA  1
NA  1", header = TRUE)

CodePudding user response:

You can use a loop

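x <- c(1, NA, NA, NA, -1, NA, NA, 1, NA, NA)
for (i in seq_along(x)[-1]) {
  # Carry a 1 forward into an NA; a filled value is seen again on the next iteration
  if (!is.na(x[i - 1]) && x[i - 1] == 1 && is.na(x[i])) x[i] <- 1
}
x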
# [1]  1  1  1  1 -1 NA NA  1  1  1

CodePudding user response:

Here's an approach using rle, give or take a hack.

r <- rle(ifelse(is.na(dat$Old), -Inf, dat$Old))
r$values[is.infinite(r$values)] <- NA_integer_
r
# Run Length Encoding
#   lengths: int [1:6] 1 3 1 2 1 2
#   values : num [1:6] 1 NA -1 NA 1 NA

ind <- is.na(r$values[-1]) & r$values[-length(r$values)] == 1
ind
# [1]  TRUE FALSE FALSE FALSE  TRUE
r$values[c(FALSE, ind)] <- r$values[c(ind, FALSE)]
inverse.rle(r)
#  [1]  1  1  1  1 -1 NA NA  1  1  1

Notes:

  • rle treats all missing values (i.e., NA) as unequal, so each NA becomes its own run, which defeats our logic here; I work around this by first converting NA to -Inf (somewhat arbitrary; I assume it is highly unlikely to appear in real data), running rle, then converting back to NA;
  • is.na(r$values[-1]) & r$values[-length(r$values)] == 1 determines whether a run is NA and the preceding run's value is 1;
  • we use that vector (as ind) to determine which values to replace (c(FALSE, ind)) and which values to replace them with (c(ind, FALSE));
  • inverse.rle does what it should: it regenerates the vector, but now with the NA runs that follow a 1 changed to 1, and no other changes.
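To make the first note concrete, here is a small sketch showing how rle splits consecutive NAs into separate runs, followed by the same steps wrapped in a function. The function name fill_after_one and its sentinel argument are my own naming for illustration, not something from base R, and the sketch assumes the sentinel value never occurs in the data.

```r
# Each NA is its own run, because NA never compares equal to NA
rle(c(NA, NA, NA))$lengths
# [1] 1 1 1

# Hypothetical helper (name and 'sentinel' argument are illustrative only).
# Assumes 'sentinel' does not appear in x; a real -Inf would be turned into NA.
fill_after_one <- function(x, sentinel = -Inf) {
  # Swap NA for the sentinel so consecutive NAs collapse into one run
  r <- rle(ifelse(is.na(x), sentinel, x))
  r$values[r$values == sentinel] <- NA
  n <- length(r$values)
  if (n < 2) return(x)  # nothing can follow anything
  # TRUE where an NA run directly follows a run whose value is 1
  ind <- is.na(r$values[-1]) & r$values[-n] == 1
  r$values[c(FALSE, ind)] <- r$values[c(ind, FALSE)]
  inverse.rle(r)
}

x <- c(1, NA, NA, NA, -1, NA, NA, 1, NA, NA)
fill_after_one(x)
# [1]  1  1  1  1 -1 NA NA  1  1  1
```

Leading NAs and NAs after -1 are left alone, since only NA runs whose predecessor is 1 are touched.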

If the logic is instead "fill NA unless the previous value is -1" (in case there are also non-1 values that should be carried forward), change the ind calculation from == 1 to != -1.
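As a rough illustration of that variant, here is the same rle approach with != -1 on a made-up vector that also contains a 2 (the vector is invented for this example and is not from the question):

```r
# Made-up input: a 2 that should be carried forward, and a -1 that should not
x <- c(2, NA, NA, -1, NA, 1, NA)

r <- rle(ifelse(is.na(x), -Inf, x))
r$values[is.infinite(r$values)] <- NA
n <- length(r$values)

# TRUE where an NA run follows any run whose value is not -1
ind <- is.na(r$values[-1]) & r$values[-n] != -1
# Each selected NA run takes the value of the run just before it
r$values[c(FALSE, ind)] <- r$values[c(ind, FALSE)]

filled <- inverse.rle(r)
filled
# [1]  2  2  2 -1 NA  1  1
```

Because the replacement values come from the preceding run, the NAs after 2 become 2 rather than 1.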

CodePudding user response:

With cumsum:

dat$Old[as.logical(cumsum(replace(dat$Old, is.na(dat$Old), 0)))] <- 1
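To show why this works, here is a step-by-step sketch on a plain vector. The intermediate name state is mine. As far as I can tell the trick relies on 1 and -1 strictly alternating, starting with 1, so that the running sum is non-zero exactly on the stretches to fill; the second vector is a made-up counterexample.

```r
x <- c(1, NA, NA, NA, -1, NA, NA, 1, NA, NA)

# Treat NA as 0, then take the running sum: each 1 switches it on, each -1 switches it off
state <- cumsum(replace(x, is.na(x), 0))
state
# [1] 1 1 1 1 0 0 0 1 1 1

# Non-zero positions are set to 1
x[as.logical(state)] <- 1
x
# [1]  1  1  1  1 -1 NA NA  1  1  1

# Made-up counterexample: two 1s before the -1 keep the sum above zero,
# so the -1 and the NA after it are overwritten as well
y <- c(1, NA, 1, -1, NA)
y[as.logical(cumsum(replace(y, is.na(y), 0)))] <- 1
y
# [1] 1 1 1 1 1
```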

CodePudding user response:

Using data.table:

library(data.table)

setDT(dat)[, x := fifelse(is.na(Old) & head(Old, 1) == 1, head(Old, 1), Old), 
          by = cumsum(!is.na(Old)) ]

dat
#     Old New  x
#  1:   1   1  1
#  2:  NA   1  1
#  3:  NA   1  1
#  4:  NA   1  1
#  5:  -1  -1 -1
#  6:  NA  NA NA
#  7:  NA  NA NA
#  8:   1   1  1
#  9:  NA   1  1
# 10:  NA   1  1

CodePudding user response:

You could do this with fill and ifelse:

library(tidyverse)
dat <- structure(list(Old = c(1L, NA, NA, NA, -1L, NA, NA, 1L, NA, NA
)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
)) 

dat %>% 
  mutate(New = Old) %>% 
  fill(New) %>% 
  mutate(New = ifelse(New == -1, Old, New)) %>% 
  select(Old, New)

Result:

# A tibble: 10 x 2
     Old   New
   <int> <int>
 1     1     1
 2    NA     1
 3    NA     1
 4    NA     1
 5    -1    -1
 6    NA    NA
 7    NA    NA
 8     1     1
 9    NA     1
10    NA     1

I think this SO question could also be helpful.
