How to output a default dataframe when a condition is not met in dplyr?

Time:07-30

Let's start with this myData dataframe generated by the code immediately beneath:

  Element Group
1       C     4
2       C     1
3       D     3
4       C     8

myData <-
  data.frame(
    Element = c("C","C","D","C"),
    Group = c(4,1,3,8)
  )

The dplyr code below counts how many times a selected element appears in the Element column of myData. Note that here it searches for an element "R" that doesn't appear in myData:

library(dplyr)
rCount <- myData %>% filter(Element == 'R') %>% count(Element, name = 'counted')

Here is what is returned when rCount is run against myData:

> rCount
[1] Element counted
<0 rows> (or 0-length row.names)

How do I modify the rCount code so that the following default data frame is returned when element R (or any other unlisted element) doesn't appear in the myData dataframe?

> rCount
  Element counted
1       R       0

CodePudding user response:

Write a wrapper function to do the logic for you:

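myCountFunc <- function(df, filterValue) {
  # Count the rows whose Element matches filterValue
  x <- df %>% 
    filter(Element == filterValue) %>% 
    count(Element, name="counted")
  # No match: fall back to a one-row default with a count of 0
  if (nrow(x) == 0) {
    x <- tibble(Element=filterValue, counted=0)
  } 
  x
}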

So that

myData %>% myCountFunc("R")
# A tibble: 1 × 2
  Element counted
  <chr>     <dbl>
1 R             0

and

myData %>% myCountFunc("C")
  Element counted
1       C       3
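
If several values need to be checked in one call, joining the counts onto the list of wanted values is one possible variation. This is only a sketch layered on the answer above, not part of it: countElements is a hypothetical helper name, and it assumes the column is called Element, as in myData.

```r
library(dplyr)

myData <- data.frame(
  Element = c("C", "C", "D", "C"),
  Group = c(4, 1, 3, 8)
)

# Hypothetical helper (not from the original answer):
# returns one row per requested value, with 0 for values that never occur.
countElements <- function(df, values) {
  wanted <- data.frame(Element = values)
  counts <- df %>% count(Element, name = "counted")
  wanted %>%
    left_join(counts, by = "Element") %>%     # unmatched values get NA
    mutate(counted = coalesce(counted, 0L))   # replace NA with an integer 0
}

result <- countElements(myData, c("C", "R"))
result
#>   Element counted
#> 1       C       3
#> 2       R       0
```

Because the default comes from the join rather than an if branch, the result has the same class and column types whether or not a value is found.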

CodePudding user response:

We could use summarise with conditional sum():

library(dplyr)

rCount <- myData %>% 
  summarise(counted = sum(Element == 'R'), Element = 'R') %>%
  select(Element, counted)

> rCount
  Element counted
1       R       0
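
The reason this never returns an empty result is that the comparison yields a logical vector and sum() adds up its TRUE values, so no match simply sums to 0. A minimal base R illustration (the variable names here are made up for the example):

```r
# Same values as the Element column of myData
elements <- c("C", "C", "D", "C")

matchesR <- elements == "R"   # FALSE FALSE FALSE FALSE
nR <- sum(matchesR)           # TRUE counts as 1, FALSE as 0
nC <- sum(elements == "C")

nR
#> [1] 0
nC
#> [1] 3
```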

CodePudding user response:

In R we have factor variables where we can set levels that are not present in the actual data.

In your case, we could convert Element into a factor with R as an additional level, and then run count(Element, .drop = FALSE) so that factor levels not present in the data are still included in the count:


library(dplyr)

myData %>% 
  mutate(Element = factor(Element, levels = c(unique(Element), "R"))) %>% 
  count(Element, .drop = FALSE)

#>   Element n
#> 1       C 3
#> 2       D 1
#> 3       R 0

Created on 2022-07-30 by the reprex package (v0.3.0)
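
To get only the single default row asked for in the question, the factor approach can be followed by a filter. The version below is a sketch under a few assumptions: target is a hypothetical variable name, and unique(c(Element, target)) is used in place of c(unique(Element), "R") so that the levels stay unique even when the target already occurs in the data (factor() rejects duplicated levels).

```r
library(dplyr)

myData <- data.frame(
  Element = c("C", "C", "D", "C"),
  Group = c(4, 1, 3, 8)
)

target <- "R"  # hypothetical name for the value being looked up

rCount <- myData %>%
  # Add the target as a level; unique() avoids a duplicated level
  mutate(Element = factor(Element, levels = unique(c(Element, target)))) %>%
  # .drop = FALSE keeps levels with zero rows
  count(Element, name = "counted", .drop = FALSE) %>%
  filter(Element == target) %>%
  mutate(Element = as.character(Element))  # back to a plain character column

rCount
#>   Element counted
#> 1       R       0
```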
