R: count the number of occurrences of a specific value within each column of a dataframe

Time:11-17

This seems like a really basic question but I can't find a solution that will do what I want for all columns of a dataframe.

I have a dataframe:

df = data.frame(cats = c("A", "B", "C", NA, NA), dogs = c(-99, "F", NA, -99, "H"))

I want to count the number of times NA occurs within each column, and also the number of times -99 occurs within each column. I am able to use summarise_all to count the number of NAs per column:

df %>% summarise_all(~ sum(is.na(.)))

Which produces the desired result:

  cats dogs
  2    1

But I can't figure out how to adapt this to count the number of times -99 appears per column. I've tried the following:

df %>% summarise_all(~ sum(-99))

Which produces this result:

  cats dogs
  -99  -99

This result shows -99 for each column, even though it never occurs within cats, and it doesn't produce the number of times -99 occurs. There must be an easy way to do this? Thanks for any help!

CodePudding user response:

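You're almost there. sum(-99) only sums the constant -99 and never looks at the column. Compare each column to -99 instead, and use na.rm = TRUE inside sum so that the NA comparisons are dropped:

> df %>% summarise_all(~ sum(. == -99, na.rm = TRUE))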
  cats dogs
1    0    2
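
To see why the comparison and na.rm = TRUE are both needed, here is a minimal base R sketch on the dogs column alone. The variable names matches and n_neg99 are only for illustration and are not part of the original question.

```r
# The dogs column from the question; mixing numbers and strings in c()
# stores everything as character: "-99", "F", NA, "-99", "H"
dogs <- c(-99, "F", NA, -99, "H")

sum(-99)                               # -99: sums the constant, ignores the column

# Elementwise comparison; the NA entry stays NA
matches <- dogs == -99                 # TRUE FALSE NA TRUE FALSE

sum(matches)                           # NA, because one comparison is NA
n_neg99 <- sum(matches, na.rm = TRUE)  # 2: counts the TRUE values only
```

The same logic runs once per column inside summarise_all, which is why cats gives 0 and dogs gives 2.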

CodePudding user response:

Using base R, df == -99 gives a logical matrix and colSums counts the TRUE values in each column:

colSums(df == -99, na.rm = TRUE)
cats dogs 
   0    2 
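
Since the question asks for both the NA count and the -99 count, one possible way to get them together in base R is sketched below. The helper count_flags and its labels n_na and n_neg99 are hypothetical names chosen for this example.

```r
df <- data.frame(cats = c("A", "B", "C", NA, NA),
                 dogs = c(-99, "F", NA, -99, "H"))

# Hypothetical helper: for one column, count the NAs and the -99 values
count_flags <- function(x) {
  c(n_na = sum(is.na(x)),
    n_neg99 = sum(x == -99, na.rm = TRUE))
}

# sapply applies the helper to every column and binds the results
# into a matrix with one column per dataframe column
result <- sapply(df, count_flags)
result
#         cats dogs
# n_na       2    1
# n_neg99    0    2
```

Any other sentinel value can be counted the same way by adding another entry to the vector returned by the helper.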