Home > Software engineering >  Why does group_by() in dplyr not work when switching from sum to counting unique occurrences?
Why does group_by() in dplyr not work when switching from sum to counting unique occurrences?

Time:12-26

The below MWE code works fine for summing data frame values, whereby the user selects which type of period to group by in the radio buttons in the "Sum the data table columns:" section rendered at the bottom. This grouping is performed in the summed_data() object below in the server section.

However I'm also trying to count the number of occurrences where Period_2 == 1. When I comment out the currently uncommented summed_data() section below, and uncomment the currently commented-out summed_data() for performing unique row counts, and try running the code, it fails. But if I run this unique row count function in the R console, as shown immediately below, it works fine and gives the desired results (manually changing the "Period..." in the group_by(...) section)!

   data <- data.frame(
      ID = c(115,115,111,88,120,16),
      Period_1 = c("2020-01", "2020-02", "2020-03", "2020-01", "2020-02", "2020-03"),
      Period_2 = c(1, 2, 3, 1, 1, 4),
      ColA = c(1000.01, 20, 30, 40, 50, 60),
      ColB = c(15.06, 25, 35, 45, 55, 65)
    )
    
    filter(data, Period_2 == "1") %>%
    group_by(Period_1) %>%
    summarise(count = length(unique(ID)))

 Period_1 count
  <chr>    <int>
1 2020-01      2
2 2020-02      1

So, to me the problems appears to lie in the dplyr code below group_by(!!sym(input$grouping)). Does anyone have a suggestion for solving this?

MWE code:

library(dplyr)
library(DT)
library(shiny)
library(shinyWidgets)
library(tidyverse)

ui <-
  fluidPage(
    fluidRow(
      column(width = 8,
          h3("Data table:"),
          tableOutput("data"),
          h3("Sum the data table columns:"),
          radioButtons(
            inputId = "grouping",
            label = NULL,
            choiceNames = c("By period 1", "By period 2"),
            choiceValues = c("Period_1", "Period_2"),
            selected = "Period_1",
            inline = TRUE
          ),
          DT::dataTableOutput("sums")
      )
    )
  )

server <- function(input, output, session) {
  data <- reactive({
    data.frame(
      ID = c(115,115,111,88,120,16),
      Period_1 = c("2020-01", "2020-02", "2020-03", "2020-01", "2020-02", "2020-03"),
      Period_2 = c(1, 2, 3, 1, 1, 4),
      ColA = c(1000.01, 20, 30, 40, 50, 60),
      ColB = c(15.06, 25, 35, 45, 55, 65)
    )
  })
  
  colNames <- reactive({c(input$grouping, "Col A", "Col B") })
  
  # summed_data <- reactive({
  #   filter(data(), Period_2 == "1") %>%
  #     group_by(!!sym(input$grouping)) %>%
  #     summarise(count = length(unique(ID)))
  # })
  
  summed_data <- reactive({
    data() %>%
      group_by(!!sym(input$grouping)) %>%
      select("ColA","ColB") %>%
      summarise(across(everything(), sum))
  })
  
  output$data <- renderTable(data())

  output$sums <- renderDT({
    summed_data() %>% 
      datatable(
        rownames = FALSE,
        colnames=colNames() # < add colNames()
        )
  })
  
}

shinyApp(ui, server)

CodePudding user response:

The problem were the colNames() that you defined and added to your call to datatable. I commented those lines out and it works. The problem didn't arise with your sum data.frame because here the colnames were actually present in the data.frame, which is not the case in the length(unique)) data.frame.

library(dplyr)
library(DT)
library(shiny)
library(shinyWidgets)
library(tidyverse)

ui <-
  fluidPage(
    fluidRow(
      column(width = 8,
             h3("Data table:"),
             tableOutput("data"),
             h3("Sum the data table columns:"),
             
             radioButtons(
               inputId = "grouping",
               label = NULL,
               choiceNames = c("By period 1", "By period 2"),
               choiceValues = c("Period_1", "Period_2"),
               selected = "Period_1",
               inline = TRUE
             ),
             DT::dataTableOutput("sums")
      )
    )
  )

server <- function(input, output, session) {
  mydat <- reactive({
    data.frame(
      ID = c(115,115,111,88,120,16),
      Period_1 = c("2020-01", "2020-02", "2020-03", "2020-01", "2020-02", "2020-03"),
      Period_2 = c(1, 2, 3, 1, 1, 4),
      ColA = c(1000.01, 20, 30, 40, 50, 60),
      ColB = c(15.06, 25, 35, 45, 55, 65)
    )
  })
  
  # colNames <- reactive({c(input$grouping, "Col A", "Col B") })
  
  summed_data <- reactive({
    print(input$grouping)
    mydat() %>%
      dplyr::filter(Period_2 == 1) %>%
      dplyr::group_by(!!sym(input$grouping)) %>% 
      dplyr::summarise(count = length(unique(ID)))
  })
  
  # summed_data <- reactive({
  #   print(input$grouping)
  #   data() %>%
  #     group_by(across(all_of(input$grouping))) %>%
  #     select("ColA","ColB") %>%
  #     summarise(across(everything(), sum))
  # })
  
  output$data <- renderTable(mydat())
  
  output$sums <- renderDT({
    summed_data() %>% 
      datatable(
        rownames = FALSE,
        # colnames=colNames() # < add colNames()
      )
  })
  
}

shinyApp(ui, server)
  • Related