I have a quick recoding question. Here is what my sample dataset looks like:
df <- data.frame(id = c(1,2,3),
i1 = c(1,NA,0),
i2 = c(1,1,1))
> df
id i1 i2
1 1 1 1
2 2 NA 1
3 3 0 1
When i1 is NA, I need to recode i2 to NA as well. I tried the code below, but with no luck.
df %>%
mutate(i2 = case_when(
i1 == NA ~ NA_real_,
TRUE ~ as.character(i2)))
Error in `mutate()`:
! Problem while computing `i2 = case_when(i1 == "NA" ~ NA_real_, TRUE ~ as.character(i2))`.
Caused by error in `` names(message) <- `*vtmp*` ``:
! 'names' attribute [1] must be the same length as the vector [0]
My desired output looks like this:
> df
id i1 i2
1 1 1 1
2 2 NA NA
3 3 0 1
CodePudding user response:
Here is an option:
t(apply(df, 1, \(x) if (any(is.na(x))) cumsum(x) else x))
# id i1 i2
#[1,] 1 1 1
#[2,] 2 NA NA
#[3,] 3 0 1
The idea is to calculate the cumulative sum of every row that contains an NA; if there is an NA in term i, the subsequent terms i + 1, i + 2, ... will also be NA (since e.g. NA + 1 = NA). Since your sample data df is all numeric, I recommend using a matrix (rather than a data.frame). Matrix operations are usually faster than data.frame (i.e. list) operations.
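To make the propagation concrete, here is a minimal sketch of how cumsum() treats NA on two made-up vectors (x and y are illustrative, not part of the data above). It also shows that values before the first NA become running totals, which is why the position of the NA matters.

```r
# cumsum() propagates NA: every term from the first NA onwards is NA
x <- c(2, NA, 1)      # illustrative vector, NA in the middle
res <- cumsum(x)
print(res)

# Terms before the first NA are running totals, not the original values
y <- c(1, 2, NA)      # illustrative vector, NA at the end
res2 <- cumsum(y)
print(res2)
```

In the sample data the NA sits directly after id, so only the unchanged id precedes it in each affected row.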
Key assumptions:
- id cannot be NA.
- This replaces NAs in i2 based on an NA in i1 per row.
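If the cumulative-sum trick feels too indirect, the matrix recommendation can also be combined with plain logical indexing. This is a sketch, assuming the column names i1 and i2 from the question; m is just an illustrative variable name.

```r
df <- data.frame(id = c(1, 2, 3),
                 i1 = c(1, NA, 0),
                 i2 = c(1, 1, 1))

# Convert the all-numeric data.frame to a numeric matrix
m <- as.matrix(df)

# Set i2 to NA only in rows where i1 is NA; all other cells stay untouched
m[is.na(m[, "i1"]), "i2"] <- NA
print(m)
```

Unlike the cumsum() approach, this touches only the i2 column, so it does not depend on where the NA sits within a row.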
A tidyverse solution
I advise against a tidyverse solution here for a couple of reasons:
- Your data is all-numerical, so a matrix is a more suitable data structure than a data.frame/tibble.
- dplyr/tidyr syntax usually operates efficiently on columns; as soon as you want to do things "row-wise", dplyr (and its family packages) might not be the best way (despite dplyr::rowwise(), which just introduces a row number-based grouping).
With that out of the way, you can transpose the problem.
library(tidyverse)
df %>%
transpose() %>%
map(~ { if (is.na(.x$i1)) .x$i2 <- NA_real_; .x }) %>%
transpose() %>%
as_tibble() %>%
unnest(everything())
## A tibble: 3 × 3
# id i1 i2
# <dbl> <dbl> <dbl>
#1 1 1 1
#2 2 NA NA
#3 3 0 1
CodePudding user response:
Would a simple assignment meet your requirements for this?
df$i2[is.na(df$i1)] <- NA
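The same is.na() test also carries over to the dplyr attempt from the question. That attempt most likely failed for two reasons: i1 == NA evaluates to NA for every row, so the condition is never TRUE, and the two branches mix a numeric NA_real_ with a character as.character(i2). A minimal sketch of a corrected version, assuming dplyr is installed (out is an illustrative name):

```r
library(dplyr)

df <- data.frame(id = c(1, 2, 3),
                 i1 = c(1, NA, 0),
                 i2 = c(1, 1, 1))

# Comparing with == NA yields NA for every element, never TRUE
print(df$i1 == NA)

# Test for missingness with is.na() and keep both branches numeric
out <- df %>%
  mutate(i2 = if_else(is.na(i1), NA_real_, i2))
print(out)
```

if_else() is used here because there is only one condition; case_when() with is.na(i1) ~ NA_real_ and TRUE ~ i2 should behave the same way.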