This is the data:
a = c(1,0,0,NA,0,1)
b = c(0,1,0,NA,NA,0)
c = c(0,1,0,NA,NA,NA)
cbind(a,b,c) -> df
I want to generate a variable named x. It needs to meet the following requirements:
- If at least one of the three columns contains a ‘1’, x is ‘1’; otherwise x is ‘0’.
- Only when a row has missing values and does not contain a ‘1’ should x be returned as a missing value, NA.
df
      a  b  c  x
[1,]  1  0  0  1
[2,]  0  1  1  1
[3,]  0  0  0  0
[4,] NA NA NA NA
[5,]  0 NA NA NA
[6,]  1  0 NA  1
CodePudding user response:
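For a vectorized solution, work with logical vectors: count the 1s in each row, then set to NA the rows that have no 1 but contain missing values.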
a = c(1,0,0,NA,0,1)
b = c(0,1,0,NA,NA,0)
c = c(0,1,0,NA,NA,NA)
cbind(a,b,c) -> df
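# number of 1s in each row, ignoring NAs
ones <- rowSums(df == 1, na.rm = TRUE)
# TRUE if the row has at least one 1
x <- ones > 0
# rows with no 1 but at least one NA become NA
is.na(x) <- rowSums(is.na(df)) > 0 & ones == 0
rm(ones)
# cbind() coerces the logical x to 1/0
cbind(df, x)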
#>       a  b  c  x
#> [1,]  1  0  0  1
#> [2,]  0  1  1  1
#> [3,]  0  0  0  0
#> [4,] NA NA NA NA
#> [5,]  0 NA NA NA
#> [6,]  1  0 NA  1
Created on 2022-08-21 by the reprex package (v2.0.1)
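If the same rule is needed in more than one place, the steps above could be wrapped in a small helper. This is only a sketch: the name flag_any_one is made up for illustration, and it assumes the input is a numeric matrix or data frame containing only 0, 1, and NA.

```r
# Hypothetical helper (name is illustrative, not from any package).
# Returns 1 if a row contains a 1, NA if it has no 1 but has missing
# values, and 0 otherwise. Assumes the input holds only 0, 1, and NA.
flag_any_one <- function(m) {
  m <- as.matrix(m)
  has_one <- rowSums(m == 1, na.rm = TRUE) > 0  # at least one 1 in the row
  has_na  <- rowSums(is.na(m)) > 0              # at least one NA in the row
  x <- as.numeric(has_one)                      # 1 where a 1 was found, else 0
  x[!has_one & has_na] <- NA                    # no 1 and some NA: unknown
  x
}

df <- cbind(a = c(1, 0, 0, NA, 0, 1),
            b = c(0, 1, 0, NA, NA, 0),
            c = c(0, 1, 0, NA, NA, NA))
x <- flag_any_one(df)
x
#> [1]  1  1  0 NA NA  1
```

Returning a plain numeric vector keeps the helper independent of how the result is attached afterwards, for example with cbind(df, x).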
CodePudding user response:
We can write a custom function to check each row of the data, then apply it to each row using apply.
check_row <- function(x) {
  # return 1 if any value is 1
  if (any(x == 1, na.rm = TRUE)) return(1)
  # return 0 if all the values are 0
  if (all(x %in% 0)) return(0)
  # otherwise return NA
  NA
}
df <- cbind(df, x = apply(df, 1, check_row))
df
#      a  b  c  x
#[1,]  1  0  0  1
#[2,]  0  1  1  1
#[3,]  0  0  0  0
#[4,] NA NA NA NA
#[5,]  0 NA NA NA
#[6,]  1  0 NA  1
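The zero check in check_row uses x %in% 0 rather than x == 0, which matters for rows that contain NA. A short illustration using row 5 of the example data (the variable names here are only for demonstration):

```r
row5 <- c(0, NA, NA)              # row 5 of the example data

eq_all <- all(row5 == 0)          # == propagates NA, so all() returns NA
in_all <- all(row5 %in% 0)        # %in% never returns NA, so this is FALSE

eq_all
#> [1] NA
in_all
#> [1] FALSE
```

With ==, the condition inside if () would be NA and R would stop with an error; with %in%, the check is simply FALSE and the function falls through to returning NA.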
CodePudding user response:
Here is a dplyr option: convert the matrix to a data frame, use case_when to apply your requirements, then convert back to a matrix.
library(dplyr)
as.data.frame(df) %>%
  # row sum ignoring NAs (for 0/1 data this counts the 1s)
  mutate(x = rowSums(across(everything()), na.rm = TRUE),
         x = case_when(x >= 1 ~ 1,
                       # no 1 but at least one NA: result is unknown
                       x == 0 & if_any(everything(), is.na) ~ NA_real_,
                       TRUE ~ 0)) %>%
  as.matrix()
Output
      a  b  c  x
[1,]  1  0  0  1
[2,]  0  1  1  1
[3,]  0  0  0  0
[4,] NA NA NA NA
[5,]  0 NA NA NA
[6,]  1  0 NA  1