I have a data frame like the one below, where ID is the primary key and Apples is the number of apples each person has.
ID | Apples |
---|---|
E1 | 10 |
E2 | 5 |
E3 | NA |
E4 | 5 |
E5 | 8 |
E6 | 12 |
E7 | NA |
E8 | 4 |
E9 | NA |
E10 | 8 |
I want to put the NA and non-NA values into just 2 groups and get the count of each. I tried a plain group_by(), but it does not give me the desired output, because it creates one group per distinct value of Apples:
Fruits %>% group_by(Apples) %>% summarize(n())
Apples n()
<dbl> <int>
4 1
5 2
8 2
10 1
12 1
NA 3
My desired output:
Apples n()
<dbl> <int>
non-NA 7
NA 3
CodePudding user response:
We can create one group for NA and one for non-NA values by grouping on is.na(Apples) inside group_by, and wrap it in factor so that the labels can be changed in the same step. Then get the number of observations in each group with n().
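library(dplyr)
df %>%
  # is.na() returns FALSE/TRUE; the factor levels sort as FALSE, TRUE,
  # so the labels map to "non-NA" and "NA" in that order
  group_by(grp = factor(is.na(Apples), labels = c("non-NA", "NA"))) %>%
  summarise(`n()` = n())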
# grp `n()`
# <fct> <int>
#1 non-NA 7
#2 NA 3
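The same grouping can also be written with explicit string labels, which avoids relying on the order of the factor levels. This is only a minimal sketch, assuming the sample data from the question; the names `fruits` and `res` are made up for illustration, and `count()` and `if_else()` are standard dplyr functions.

```r
library(dplyr)

# Sample data from the question (the name `fruits` is illustrative)
fruits <- data.frame(
  ID = paste0("E", 1:10),
  Apples = c(10L, 5L, NA, 5L, 8L, 12L, NA, 4L, NA, 8L)
)

# Label each row as "NA" or "non-NA", then count the rows per label;
# count() stores the result in a column called n
res <- fruits %>%
  count(Apples = if_else(is.na(Apples), "NA", "non-NA"))

print(res)
```

Because each label is assigned directly from the is.na() test, the result stays correct even if the column happens to contain no missing values.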
Or in base R, we could use colSums:
data.frame(Apples = c("non-NA", "NA"), n = c(colSums(!is.na(df))[2], colSums(is.na(df))[2]), row.names = NULL)
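If only the Apples column matters, a possibly simpler variant is to sum the logical vector from is.na() directly instead of calling colSums on the whole data frame. A minimal sketch, assuming the same sample data; `n_missing` and `result` are illustrative names.

```r
# Sample data, same values as in the question
df <- data.frame(
  ID = paste0("E", 1:10),
  Apples = c(10L, 5L, NA, 5L, 8L, 12L, NA, 4L, NA, 8L)
)

# TRUE counts as 1 when summed, so this is the number of missing values
n_missing <- sum(is.na(df$Apples))

# Every remaining row is non-missing
result <- data.frame(
  Apples = c("non-NA", "NA"),
  n = c(nrow(df) - n_missing, n_missing)
)

print(result)
```

This avoids the positional index [2], so it keeps working if the column order of the data frame changes.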
Data
df <- structure(list(ID = c("E1", "E2", "E3", "E4", "E5", "E6", "E7",
"E8", "E9", "E10"), Apples = c(10L, 5L, NA, 5L, 8L, 12L, NA,
4L, NA, 8L)), class = "data.frame", row.names = c(NA, -10L))
CodePudding user response:
In base R, this can be done with table on a logical vector:
# FALSE counts the NA rows, TRUE counts the non-NA rows
table(!is.na(df$Apples))
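The table output is labelled FALSE and TRUE. To get the labels from the desired output, one option is to convert the logical vector to a factor with explicit levels and labels before tabulating. This is a sketch under the assumption that the data matches the question; `present` and `counts` are illustrative names.

```r
# Sample data, same values as in the question
df <- data.frame(
  ID = paste0("E", 1:10),
  Apples = c(10L, 5L, NA, 5L, 8L, 12L, NA, 4L, NA, 8L)
)

# TRUE where a value is present, FALSE where it is missing
present <- !is.na(df$Apples)

# Fixing the levels keeps both groups in the table even if one is empty;
# the labels replace FALSE/TRUE with readable names
counts <- table(factor(present,
                       levels = c(FALSE, TRUE),
                       labels = c("NA", "non-NA")))

print(counts)
```

Setting the levels explicitly means a column without any missing values still reports an "NA" group with a count of 0, rather than dropping it.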