R replace last nth value with NA by group

Time:06-10

I want to replace the last value(s) in each group with NA.

have <- data.frame(id = c(1,1,1,1,2,2,2),
                   value = c(1,2,3,4,5,6,7))

want1 <- data.frame(id = c(1,1,1,1,2,2,2),
                    value = c(1,2,3,NA,5,6,NA))

want2 <- data.frame(id = c(1,1,1,1,2,2,2),
                    value = c(1,2,NA,NA,5,NA,NA))

want1 corresponds to replacing the last observation of value in each group with NA, and want2 corresponds to replacing the last two observations of value in each group with NA. I'm currently trying to do this with the dplyr package but can't seem to get any traction. Any help would be much appreciated. Thanks!

CodePudding user response:

We can use row_number() to compare each row's position within its group against n(), the total number of rows in that group.

library(dplyr)

have |>
  group_by(id) |>
  mutate(
    last1 = ifelse(row_number() == n(), NA, value),
    last2 = ifelse(row_number() >= n() - 1, NA, value)
  )
# # A tibble: 7 × 4
# # Groups:   id [2]
#      id value last1 last2
#   <dbl> <dbl> <dbl> <dbl>
# 1     1     1     1     1
# 2     1     2     2     2
# 3     1     3     3    NA
# 4     1     4    NA    NA
# 5     2     5     5     5
# 6     2     6     6    NA
# 7     2     7    NA    NA
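If the goal is to overwrite value itself rather than add helper columns, the same row_number() / n() comparison can be used inside replace(). This is only a minimal sketch under that assumption; the ungroup() and as.data.frame() calls are additions made here so the result is a plain data frame like want1.

```r
library(dplyr)

have <- data.frame(id = c(1, 1, 1, 1, 2, 2, 2),
                   value = c(1, 2, 3, 4, 5, 6, 7))

want1 <- have |>
  group_by(id) |>
  # replace() sets value to NA only where the row is the last one in its group
  mutate(value = replace(value, row_number() == n(), NA)) |>
  ungroup() |>
  as.data.frame()

want1$value
#> [1]  1  2  3 NA  5  6 NA
```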

CodePudding user response:

A more general approach returns each variant as its own data frame, where k is the number of trailing values to replace in each group.

library(dplyr)

lapply(
  1:2,
  function(k) {
    have %>%
      group_by(id) %>%
      # keep value for all but the last k rows of each group
      mutate(value = ifelse(row_number() <= (n() - k), value, NA))
  }
)
[[1]]
# A tibble: 7 × 2
# Groups:   id [2]
     id value
  <dbl> <dbl>
1     1     1
2     1     2
3     1     3
4     1    NA
5     2     5
6     2     6
7     2    NA

[[2]]
# A tibble: 7 × 2
# Groups:   id [2]
     id value
  <dbl> <dbl>
1     1     1
2     1     2
3     1    NA
4     1    NA
5     2     5
6     2    NA
7     2    NA

CodePudding user response:

Here is a base R way.

have <- data.frame(id = c(1,1,1,1,2,2,2),
                   value = c(1,2,3,4,5,6,7))

want1 <- data.frame(id = c(1,1,1,1,2,2,2),
                    value = c(1,2,3,NA,5,6,NA))

want2 <- data.frame(id = c(1,1,1,1,2,2,2),
                    value = c(1,2,NA,NA,5,NA,NA))

with(have, ave(value, id, FUN = \(x){
  x[length(x)] <- NA
  x
}))
#> [1]  1  2  3 NA  5  6 NA
with(have, ave(value, id, FUN = \(x){
  x[length(x)] <- NA
  if(length(x) > 1)
    x[length(x) - 1L] <- NA
  x
}))
#> [1]  1  2 NA NA  5 NA NA

Created on 2022-06-09 by the reprex package (v2.0.1)

Each call returns a plain vector, so assign the result back to the value column to get want1 or want2.
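The two base R calls above can also be folded into one helper that takes the number of trailing values as an argument. The sketch below assumes a hypothetical helper named na_last_k (not part of the original answer) and uses transform() to do the reassignment in one step.

```r
# Hypothetical helper: set the last k elements of x to NA.
# If a group has k or fewer elements, all of them become NA.
na_last_k <- function(x, k = 1L) {
  x[seq_along(x) > length(x) - k] <- NA
  x
}

have <- data.frame(id = c(1, 1, 1, 1, 2, 2, 2),
                   value = c(1, 2, 3, 4, 5, 6, 7))

# ave() applies the helper within each id group and keeps the original row order
want1 <- transform(have, value = ave(value, id, FUN = \(x) na_last_k(x, 1L)))
want2 <- transform(have, value = ave(value, id, FUN = \(x) na_last_k(x, 2L)))

want1$value
#> [1]  1  2  3 NA  5  6 NA
want2$value
#> [1]  1  2 NA NA  5 NA NA
```

Like the original answer, this relies on the \(x) lambda shorthand, which needs R 4.1 or later; function(x) works on older versions.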
