Creating a dummy, testing for NA values in several columns

Time:11-13

I'm trying to implement the following logic:

If df$greenbond == 0, put NA. Otherwise, test whether there are any missing values in columns 2 to 3 (the original df has many more columns, so I need an efficient way of coding this): if yes, put FALSE; if not, put TRUE. But this code gives me an error saying that argument 2 is missing. Can anyone help me?

greenbond <- c(1,0,1)
A <- c(1,0,NA)
B <- c(1,0,0)

df <- data.frame(greenbond,A,B)


df$test <- ifelse(df$greenbond==0,NA, ifelse(is.na(df[2:3],),FALSE,TRUE))    

CodePudding user response:

Here is an apply solution that needs no ifelse calls.

df$test <- !apply(df[2:3], 1, anyNA)
is.na(df$test) <- df$greenbond == 0
df
#  greenbond  A B  test
#1         1  1 1  TRUE
#2         0  0 0    NA
#3         1 NA 0 FALSE

Another solution, with the more performant rowSums:

df$test <- !is.na(rowSums(df[2:3]))
is.na(df$test) <- df$greenbond == 0

The result is the same as above, but for larger data sets rowSums is much faster than an apply loop. Note that rowSums requires the selected columns to be numeric.
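If some of the columns to check are not numeric, rowSums will not work on them. A minimal sketch of an alternative, swapping in complete.cases from base R (which returns TRUE for rows without missing values), could look like the following. The cols variable is a name introduced here only for illustration; adjust it to the real column range.

```r
# Example data from the question
df <- data.frame(greenbond = c(1, 0, 1),
                 A = c(1, 0, NA),
                 B = c(1, 0, 0))

# Columns to check for missing values (illustrative name; adjust to the real data)
cols <- 2:3

# TRUE where none of the checked columns is NA
df$test <- complete.cases(df[cols])

# Set the result to NA where greenbond is 0
is.na(df$test) <- df$greenbond == 0

df$test
```

On the example data this gives the same test column as the two solutions above.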

CodePudding user response:

There are a few issues with your code, such as the position of the comma, data.frame instead of as.data.frame, and the missing "no" argument in the first ifelse statement. Please see this answer and adapt your code:

greenbond <- c(1,0,0)
A <- c(1,0,NA)
B <- c(1,0,0)

df <- data.frame(greenbond,A,B)


df$test <- ifelse(df$greenbond==0,NA,df$greenbond)

df$test
# [1]  1 NA NA

# Note: df[2:3,] selects rows 2 to 3 (all columns); df[2:3] would select columns 2 to 3
ifelse(is.na(df[2:3,]),FALSE,TRUE)
#   greenbond     A    B  test
# 2      TRUE  TRUE TRUE FALSE
# 3      TRUE FALSE TRUE FALSE
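For reference, here is one way the nested ifelse from the question might be repaired in a single expression. This is only a sketch, assuming the intent is to check columns 2 to 3 row by row. It uses the question's original data (greenbond <- c(1,0,1)), and n_missing is a helper name introduced here for illustration.

```r
# Data as given in the question
df <- data.frame(greenbond = c(1, 0, 1),
                 A = c(1, 0, NA),
                 B = c(1, 0, 0))

# Number of NAs per row in columns 2 to 3 (df[2:3] selects columns, not rows)
n_missing <- rowSums(is.na(df[2:3]))

# NA when greenbond is 0; otherwise TRUE only if no value is missing in the row
df$test <- ifelse(df$greenbond == 0, NA, n_missing == 0)

df$test
```

Because is.na(df[2:3]) is a logical matrix, rowSums counts the missing values per row, so this variant does not require the checked columns themselves to be numeric.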
Tags: r