How to count the number of times a specified variable appears in a dataframe column using dplyr?

Time:07-29

Suppose we start with this very simple dataframe called myData:

> myData
  Element Class
1       A     0
2       A     0
3       C     0
4       A     0
5       B     1
6       B     1
7       A     2

Generated by:

myData = data.frame(Element = c("A","A","C","A","B","B","A"),Class = c(0,0,0,0,1,1,2))

How would I use dplyr to extract the number of times "A" appears in the Element column of the myData dataframe? I would simply like the number 4 returned, for further processing in dplyr. All I have so far is the dplyr code shown below, which seems clumsy because, among other things, it yields another dataframe with more information than the single number 4 that is needed:

# A tibble: 1 x 2
  Element counted
  <chr>     <int>
1 A             4

The dplyr code that produces the above tibble:

library(dplyr)
myData %>% group_by(Element) %>% filter(Element == "A") %>% summarise(counted = n())

CodePudding user response:

We can use count(), which replaces the separate group_by() and summarise() steps:

library(dplyr)
myData %>% 
  filter(Element == 'A') %>%
  count(Element, name = 'counted')

Or with just summarise() and sum():

myData %>%
  summarise(counted = sum(Element == 'A'), Element = 'A') %>%
  relocate(Element, .before = 1)
#>   Element counted
#> 1       A       4
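
Both snippets above still return a one-row data frame. If only the bare number is wanted, as the question asks, one possible sketch is to finish the pipeline with pull(), which extracts a single column as a vector. The name n_A is only an illustrative choice and does not come from the original post.

```r
library(dplyr)

# Same data as in the question
myData <- data.frame(Element = c("A", "A", "C", "A", "B", "B", "A"),
                     Class = c(0, 0, 0, 0, 1, 1, 2))

# sum() over the logical vector counts the TRUE values;
# pull() then drops the data frame wrapper and returns a plain vector
n_A <- myData %>%
  summarise(counted = sum(Element == "A")) %>%
  pull(counted)

n_A
#> [1] 4
```

The result is an ordinary length-one integer vector, so it can be reused directly in later steps, for example inside a filter() or mutate() call.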

CodePudding user response:

Another option is to use tally():

myData = data.frame(Element = c("A","A","C","A","B","B","A"),Class = c(0,0,0,0,1,1,2))

library(dplyr)

myData %>%
  filter(Element == "A") %>%
  group_by(Element) %>%
  tally()
#> # A tibble: 1 × 2
#>   Element     n
#>   <chr>   <int>
#> 1 A           4

Created on 2022-07-28 by the reprex package (v2.0.1)
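
As a hedged variation on the tally() approach: after the filter only one Element value remains, so the group_by() step can be dropped when the Element column is not needed in the result. tally() on ungrouped data returns a one-row data frame with a column n, and pull(n) extracts it as a number. The name n_A_tally is hypothetical and used only for this sketch.

```r
library(dplyr)

# Same data as in the question
myData <- data.frame(Element = c("A", "A", "C", "A", "B", "B", "A"),
                     Class = c(0, 0, 0, 0, 1, 1, 2))

# Keep only the "A" rows, count them, and extract the count column
n_A_tally <- myData %>%
  filter(Element == "A") %>%
  tally() %>%
  pull(n)

n_A_tally
#> [1] 4
```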
