functions and error issues using tidyverse and dplyr

I am trying to write a function, but for some reason it is not working properly. I want to pass in my parameters and have the function return my values. This should not be a problem, but in its current form I am getting the following error:

Error in UseMethod("summarise") : 
  no applicable method for 'summarise' applied to an object of class "c('double', 'numeric')"

As you can see, I have tried passing the data frame into my function and referring to it where its columns are used. As a newbie to R, what am I missing?

Here is my minimal reproducible example: code that works outside a function (ab_b2), followed by my attempt at the function.

library(tidyverse)
library(dplyr)

set.seed(457)
mydata <- data.frame(id=1:6, 
                    y=sample(6), estimate=rnorm(6), 
                    n=c(200,500,200,500,500, 200),
                    pop = c(.30, .30, .30, .30, .30, .30), 
                    col = c(.00, .00, .30, .30, .50, .50), 
                    term = factor(c('x1', 'x2', 'x1', 'x2', 'x1', 'x2')), 
                    source = c(1,2,3,4,1,2)
                    )

#non-function and works 
ab_b2 = mydata |> 
  filter(term == "x1") |> 
  group_by(n, col, source) |> 
  summarize(ab2 = round(mean(abs(estimate - pop)),3))

#function attempt 
bias = function(df, term, estimate, popval){
  dplyr::filter(df, term == term) |> 
    group_by(n, col, source)  
  summarize(ab = round(mean(abs(df$estimate - df$popval)),3))
}

try = bias(mydata, "x1", estimate, popval = df$b1)

I appreciate any assistance, or pointing me in the correct direction.

CodePudding user response:

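The first issue with your code is that you are missing a |> after the group_by() call. Without it, the grouped data frame is never passed on, so summarize() is called with only the numeric expression and dispatches on that instead of on a data frame, which is exactly what the error message reports. However, even after fixing that, your code will not give the desired result. First, because your dataset has a column named term, a bare term inside filter() refers to that column on both sides of ==, so you have to use .env$term to refer to the function argument term. Second, df$estimate takes the estimate column from the unfiltered and ungrouped df, and df$popval does not work at all because it looks for a column literally named popval. The call also needs popval = pop rather than popval = df$b1, since mydata has no column b1 and the data frame df is only defined inside the function. Instead, use e.g. the curly-curly operator {{ }}: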

library(dplyr)

bias = function(df, term, estimate, popval){
  dplyr::filter(df, term == .env$term) |> 
    group_by(n, col, source) |> 
    summarize(ab = round(mean(abs({{estimate}} - {{popval}})),3))
}

bias(mydata, "x1", estimate, popval = pop)
#> `summarise()` has grouped output by 'n', 'col'. You can override using the
#> `.groups` argument.
#> # A tibble: 3 × 4
#> # Groups:   n, col [3]
#>       n   col source    ab
#>   <dbl> <dbl>  <dbl> <dbl>
#> 1   200   0        1  1.34
#> 2   200   0.3      3  1.18
#> 3   500   0.5      1  1.17

# CHECK 

mydata |> 
  filter(term == "x1") |> 
  group_by(n, col, source) |> 
  summarize(ab2 = round(mean(abs(estimate - pop)),3))
#> `summarise()` has grouped output by 'n', 'col'. You can override using the
#> `.groups` argument.
#> # A tibble: 3 × 4
#> # Groups:   n, col [3]
#>       n   col source   ab2
#>   <dbl> <dbl>  <dbl> <dbl>
#> 1   200   0        1  1.34
#> 2   200   0.3      3  1.18
#> 3   500   0.5      1  1.17
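
To make the two pitfalls described above easier to see in isolation, here is a minimal sketch on a made-up data frame. The names toy, keep_all, keep_match, and bias_ungrouped are hypothetical and only used for illustration; they are not part of the question or of dplyr. The last function is a variant of the function above that assumes you also want to drop the grouping via the .groups argument mentioned in the message.

```r
library(dplyr)

# Small made-up data frame (hypothetical, not the asker's data)
toy <- data.frame(
  term     = c("x1", "x2", "x1", "x2"),
  estimate = c(0.5, 1.0, -0.5, 2.0),
  pop      = c(0.3, 0.3, 0.3, 0.3)
)

# Inside filter(), a bare `term` resolves to the column first, so
# `term == term` compares the column with itself and keeps every row.
keep_all <- function(df, term) {
  filter(df, term == term)
}

# `.env$term` forces the right-hand side to be the function argument.
keep_match <- function(df, term) {
  filter(df, term == .env$term)
}

n_all   <- nrow(keep_all(toy, "x1"))    # every row survives
n_match <- nrow(keep_match(toy, "x1"))  # only the "x1" rows survive

# `toy$popval` looks for a column literally named "popval"; there is none,
# so it is NULL and the subtraction yields a zero-length vector.
missing_col <- is.null(toy$popval)
diff_len    <- length(toy$estimate - toy$popval)

# Variant of the answer's function (assumption: an ungrouped result is wanted).
# `.groups = "drop"` returns a plain tibble and avoids the grouping message.
bias_ungrouped <- function(df, term, estimate, popval) {
  df |>
    filter(term == .env$term) |>
    group_by(term) |>
    summarize(
      ab = round(mean(abs({{ estimate }} - {{ popval }})), 3),
      .groups = "drop"
    )
}

res <- bias_ungrouped(toy, "x1", estimate, pop)
```

Here n_all equals the full row count while n_match counts only the matching rows, which is the difference between term == term and term == .env$term. Whether to drop the grouping or keep it (for example with .groups = "keep") depends on what you do with the result afterwards.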