I would like to use NA.fill to fill the NAs after a 1, but keep the NAs after a -1. Is there a simple solution for this? Old is the input and New is the desired result:
Old | New |
---|---|
1 | 1 |
NA | 1 |
NA | 1 |
NA | 1 |
-1 | -1 |
NA | NA |
NA | NA |
1 | 1 |
NA | 1 |
NA | 1 |
#example data
dat <- read.table(text = "
Old New
1 1
NA 1
NA 1
NA 1
-1 -1
NA NA
NA NA
1 1
NA 1
NA 1", header = TRUE)
CodePudding user response:
You can use a loop:
x = c(1,NA,NA,NA,-1,NA,NA,1,NA,NA)
for (i in seq_along(x)[-1]) {
  # fill an NA only when the value just before it is (already) 1
  if (!is.na(x[i-1]) && x[i-1] == 1 && is.na(x[i])) x[i] = 1
}
# [1] 1 1 1 1 -1 NA NA 1 1 1
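If you need this more than once, the loop can be wrapped in a small function. This is only a sketch of the same idea; the name fill_after_one is made up for illustration, and it assumes a plain numeric vector.

```r
# Hypothetical helper (name is illustrative): forward-fill NAs that follow a 1.
fill_after_one <- function(x) {
  for (i in seq_along(x)[-1]) {
    prev <- x[i - 1]
    # prev may itself have been filled in the previous iteration,
    # which is what carries the 1 across a whole run of NAs
    if (is.na(x[i]) && !is.na(prev) && prev == 1) x[i] <- 1
  }
  x
}

fill_after_one(c(NA, 1, NA, -1, NA, 1, NA))
# [1] NA  1  1 -1 NA  1  1
```

A leading NA is left alone because there is no preceding value to copy from.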
CodePudding user response:
Here's an approach using rle, give or take a hack.
r <- rle(ifelse(is.na(dat$Old), -Inf, dat$Old))
r$values[is.infinite(r$values)] <- NA_integer_
r
# Run Length Encoding
# lengths: int [1:6] 1 3 1 2 1 2
# values : num [1:6] 1 NA -1 NA 1 NA
ind <- is.na(r$values[-1]) & r$values[-length(r$values)] == 1
ind
# [1] TRUE FALSE FALSE FALSE TRUE
r$values[c(FALSE, ind)] <- r$values[c(ind, FALSE)]
inverse.rle(r)
# [1] 1 1 1 1 -1 NA NA 1 1 1
Notes:
- rle treats all missing values (i.e., NA) as unequal, which defeats our logic here; I work around this by first converting NA to -Inf (somewhat arbitrary, but I assume it is highly unlikely to appear in real data), running rle, then converting back to NA;
- is.na(r$values[-1]) & r$values[-length(r$values)] == 1 determines if one run is NA and the preceding run's value is 1;
- we use that result (as ind) to determine which values to replace (c(FALSE, ind)) and which values to replace them with (c(ind, FALSE));
- inverse.rle does what it should: regenerates the vector, but now with the NA values that follow a 1 changed to 1, and no other changes.
If the logic is instead "fill NA unless the previous value is -1" (in case there are also non-1 values that should be filled), change the ind calculation from == 1 to != -1.
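As a rough sketch of that != -1 variant, here is the same rle approach wrapped in a function and run on a vector that also contains a 2. The function name fill_unless_minus_one and the test vector are made up for illustration, and the -Inf placeholder still assumes that infinite values never occur in the real data.

```r
# Hypothetical helper (name is illustrative): fill each NA run with the
# preceding value unless that value is -1.
fill_unless_minus_one <- function(x) {
  # rle() treats every NA as its own run, so mark NAs with -Inf first
  # (assumes the real data contains no infinite values)
  r <- rle(ifelse(is.na(x), -Inf, x))
  r$values[is.infinite(r$values)] <- NA
  n <- length(r$values)
  # TRUE where a run is NA and the run before it is anything but -1
  ind <- is.na(r$values[-1]) & r$values[-n] != -1
  r$values[c(FALSE, ind)] <- r$values[c(ind, FALSE)]
  inverse.rle(r)
}

x <- c(1, NA, NA, 2, NA, -1, NA, NA, 1, NA)
fill_unless_minus_one(x)
# [1]  1  1  1  2  2 -1 NA NA  1  1
```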
CodePudding user response:
With cumsum:
dat$Old[as.logical(cumsum(replace(dat$Old, is.na(dat$Old), 0)))] <- 1
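Note that the one-liner sets every position where the running sum is non-zero to 1, so it relies on 1 and -1 strictly alternating, as they do in the example. If that cannot be guaranteed, a different vectorized base R idea is to look up the most recent non-NA value with cummax and fill only where that value is 1. This is a sketch of that swapped-in technique, not the cumsum method itself; the variable names are illustrative.

```r
x <- c(1, NA, NA, NA, -1, NA, NA, 1, NA, NA)

# position of the most recent non-NA value at each index (0 if none seen yet)
idx <- cummax(seq_along(x) * !is.na(x))
# value at that position; the prepended NA covers idx == 0
last_seen <- c(NA, x)[idx + 1]

# fill only the NAs whose most recent non-NA value is 1
fill <- is.na(x) & !is.na(last_seen) & last_seen == 1
x[fill] <- 1
x
# [1]  1  1  1  1 -1 NA NA  1  1  1
```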
CodePudding user response:
Using data.table:
library(data.table)
# group each non-NA value with the NAs that follow it; fill only if the group starts with 1
setDT(dat)[, x := fifelse(is.na(Old) & head(Old, 1) == 1, head(Old, 1), Old),
           by = cumsum(!is.na(Old))]
dat
# Old New x
# 1: 1 1 1
# 2: NA 1 1
# 3: NA 1 1
# 4: NA 1 1
# 5: -1 -1 -1
# 6: NA NA NA
# 7: NA NA NA
# 8: 1 1 1
# 9: NA 1 1
# 10: NA 1 1
CodePudding user response:
You could do this with fill and ifelse:
library(tidyverse)
dat <- structure(list(Old = c(1L, NA, NA, NA, -1L, NA, NA, 1L, NA, NA
)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))
dat %>%
mutate(New = Old) %>%
fill(New) %>%
mutate(New = ifelse(New == -1, Old, New)) %>%
select(Old, New)
Result:
# A tibble: 10 x 2
Old New
<int> <int>
1 1 1
2 NA 1
3 NA 1
4 NA 1
5 -1 -1
6 NA NA
7 NA NA
8 1 1
9 NA 1
10 NA 1
I think this SO question could also be helpful.