Calculating how many rows have at least three NA

Time:10-11

I need to find out how many respondents have at least three missing responses.

My data set looks something like this.

library(dplyr)  # provides tibble() and %>%

data = tibble(
  v65 = sample(x = c("1", "2", "3", "4", NA), size = 100, replace = TRUE),
  v66 = sample(x = c("1", "2", "3", "4", NA), size = 100, replace = TRUE),
  v67 = sample(x = c("1", "2", "3", "4", NA), size = 100, replace = TRUE),
  v68 = sample(x = c("1", "2", "3", "4", NA), size = 100, replace = TRUE),
  v69 = sample(x = c("1", "2", "3", "4", NA), size = 100, replace = TRUE),
  v70 = sample(x = c("1", "2", "3", "4", NA), size = 100, replace = TRUE)
)
> data
# A tibble: 100 × 6
   v65   v66   v67   v68   v69   v70  
   <chr> <chr> <chr> <chr> <chr> <chr>
 1 3     3     3     2     NA    2    
 2 NA    1     3     1     3     4    
 3 2     1     2     1     4     4    
 4 1     1     1     4     2     4    
 5 2     1     4     3     3     1    
 6 4     3     4     3     NA    1    
 7 2     NA    NA    NA    2     NA   
 8 NA    2     NA    NA    1     NA   
 9 4     4     4     3     NA    NA   
10 2     3     3     4     2     4    
# … with 90 more rows

In other words, I need to count the rows that have at least three NAs.

I have tried

data %>% 
  rowwise()%>%
  filter(is.na()>=3)

but received this error:

Error: Problem with `filter()` input `..1`.
ℹ Input `..1` is `is.na() >= 3`.
x 0 arguments passed to 'is.na' which requires 1
ℹ The error occurred in row 1.
Run `rlang::last_error()` to see where the error occurred.

I have also tried

EU_value_study %>% 
  filter(is.na(v65:v71)) %>% 
  tally() %>% 
  filter(n > 2)

but also received an error:

Error: Problem with `filter()` input `..1`.
ℹ Input `..1` is `is.na(v65:v71)`.
x Input `..1` must be of size 56368 or 1, not size 2.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning messages:
1: In v65:v71 :
  numerical expression has 56368 elements: only the first used
2: In v65:v71 :
  numerical expression has 56368 elements: only the first used

Could anyone let me know how to get past this? Thank you :)

CodePudding user response:

You could use the rowSums function:

na_3 <- rowSums(is.na(data)) >= 3

and to get the number of rows with at least 3 NAs:

sum(na_3)
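
To make the `rowSums` approach easy to check by eye, here is a minimal sketch on a tiny hand-built data frame. The data frame `df` and its values are made up for illustration and are not from the question. It also shows why the comparison should be `>= 3` rather than `> 3` when "at least three" is wanted.

```r
# Hypothetical 4-row example; values chosen so the NA counts are obvious
df <- data.frame(
  v65 = c("1", NA,  NA,  "2"),
  v66 = c(NA,  NA,  "3", "4"),
  v67 = c(NA,  NA,  NA,  "1"),
  v68 = c(NA,  NA,  "2", "3")
)

# is.na(df) is a logical matrix; rowSums() counts the TRUEs in each row
na_per_row <- rowSums(is.na(df))

# TRUE for rows with at least three NAs
at_least_3 <- na_per_row >= 3

# Number of such rows (rows 1 and 2 here)
n_at_least_3 <- sum(at_least_3)

# A strict comparison would miss the row with exactly three NAs
n_more_than_3 <- sum(na_per_row > 3)
```

If the real data has other columns that should not be counted, the same idea works on a subset, for example `rowSums(is.na(df[, c("v65", "v66")]))`.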

CodePudding user response:

Using tidyverse

library(dplyr)
library(purrr)
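data %>% 
   filter(across(everything(), is.na) %>%
        reduce(`+`) %>%                            # add up the per-column NA indicators for each row
        magrittr::is_weakly_greater_than(3)) %>%   # keep rows with at least three NAs
   nrow
# Returns the number of matching rows; the exact count depends on the random sample.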
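
The first attempt in the question used `rowwise()` but called `is.na()` with no argument. A possible way to make that idea work, sketched here under the assumption that dplyr 1.0.0 or later is installed, is to pass the row's values through `c_across()`. The tibble `df` below is a made-up example, not the asker's data.

```r
library(dplyr)

# Hypothetical 4-row example; rows 1 and 2 have three and four NAs respectively
df <- tibble(
  v65 = c("1", NA,  NA,  "2"),
  v66 = c(NA,  NA,  "3", "4"),
  v67 = c(NA,  NA,  NA,  "1"),
  v68 = c(NA,  NA,  "2", "3")
)

n_rows <- df %>%
  rowwise() %>%                                    # evaluate the filter one row at a time
  filter(sum(is.na(c_across(v65:v68))) >= 3) %>%   # count NAs across the selected columns
  ungroup() %>%
  nrow()

n_rows
```

Row-wise evaluation is usually slower than `rowSums()` on large data such as the 56368-row data set mentioned in the question, so this version is mainly useful when the per-row logic is more complex than a simple count.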