Conducting a one-way ANOVA

Time:10-31

I have a dataset of mesh opening measurements and the tools used to take them. I want to run a one-way ANOVA on the data. Here's my code:

df<-structure(list(MeasurementTool = c("Wedge", "Wedge", "Wedge", 
                                   "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", 
                                   "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", "Wedge", 
                                   "Wedge", "Wedge", "Wedge", "Weighted Wedge", "Weighted Wedge", 
                                   "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", 
                                   "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", 
                                   "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", 
                                   "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", "Weighted Wedge", 
                                   "Weighted Wedge", "Weighted Wedge", "ICES Gauge", "ICES Gauge", 
                                   "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", 
                                   "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", 
                                   "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", "ICES Gauge", 
                                   "ICES Gauge", "ICES Gauge", "ICES Gauge"), 
               MeshOpening = c(157L, 155L, 160L, 160L, 161L, 160L, 158L, 161L, 162L, 162L, 160L, 163L, 
                                158L, 160L, 161L, 165L, 164L, 158L, 164L, 163L, 159L, 158L, 165L, 
                                164L, 159L, 160L, 158L, 159L, 160L, 163L, 159L, 160L, 158L, 158L, 
                                158L, 162L, 160L, 159L, 159L, 159L, 159L, 159L, 159L, 155L, 156L, 
                                156L, 158L, 160L, 156L, 155L, 160L, 160L, 157L, 159L, 158L, 155L, 
                                158L, 157L, 156L, 158L)), row.names = c(NA, -60L), class = "data.frame") 

library(dplyr)  # provides group_by(), summarise() and %>%

df$MeasurementTool <- as.factor(df$MeasurementTool)

group_by(df, 'MeasurementTool') %>% summarise(count = n(), mean = mean('MeshOpening', na.rm = TRUE), sd = sd('MeshOpening', na.rm = TRUE))

It is giving me these warning messages:

Warning messages:

1: In mean.default("MeshOpening", na.rm = TRUE) : argument is not numeric or logical: returning NA

2: In var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm = na.rm) : NAs introduced by coercion

CodePudding user response:

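You are getting tripped up by the way dplyr::summarise works. It expects a bare column name (an R name, a.k.a. symbol), not a quoted string. mean('MeshOpening') tries to average the literal string "MeshOpening", which is why you get the warnings. Removing the quotes inside summarise makes the warnings go away: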

group_by(df, 'MeasurementTool') %>% summarise(count = n(), mean = mean(MeshOpening, na.rm = TRUE), sd = sd(MeshOpening, na.rm = TRUE))
# A tibble: 1 × 4
  `"MeasurementTool"` count  mean    sd
  <chr>               <int> <dbl> <dbl>
1 MeasurementTool        60  159.  2.48

In the pre-tidyverse days we would often refer to columns by their character-valued names, as you did, but the norm in the tidyverse is to treat column names as first-class objects and write them unquoted.
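If your column names really do arrive as strings (say, from a config file), both base R and dplyr have documented ways to use them. Here is a minimal sketch, assuming dplyr 1.0 or later. The data frame `toy` and the variables `group_col` and `value_col` are made up for illustration and are not part of the question:

```r
library(dplyr)

# Small made-up data frame, for illustration only
toy <- data.frame(
  tool  = c("A", "A", "B", "B"),
  value = c(1, 3, 10, 20)
)

# Column names held as character strings
group_col <- "tool"
value_col <- "value"

# Base R: [[ ]] accepts a string, so character names work directly
base_means <- tapply(toy[[value_col]], toy[[group_col]], mean)

# dplyr: all_of() selects a column by string for grouping, and the
# .data pronoun looks a column up by string inside summarise()
dplyr_means <- toy %>%
  group_by(across(all_of(group_col))) %>%
  summarise(mean = mean(.data[[value_col]]), .groups = "drop")
```

Both routes give one mean per tool. The point is that a string has to be marked as a column reference explicitly (with `[[ ]]`, `all_of()` or `.data`); a plain quoted string passed to group_by or mean is just a string.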

That removes the warnings, but the result is still a single group of 60 rows, because group_by was also given a quoted string. Drop those quotes as well to get what you really wanted:

group_by(df, MeasurementTool) %>% summarise(count = n(), 
                                          mean = mean(MeshOpening, na.rm = TRUE), 
                                          sd = sd(MeshOpening, na.rm = TRUE))
# A tibble: 3 × 4
  MeasurementTool count  mean    sd
  <fct>           <int> <dbl> <dbl>
1 ICES Gauge         20  158.  1.73
2 Wedge              20  161.  2.56
3 Weighted Wedge     20  160.  2.06
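With the grouping sorted out, the one-way ANOVA the question asks about can be run with base R's aov(). This is a minimal sketch, assuming the same data as in the question (rebuilt here with rep() so the block runs on its own) and that the usual ANOVA assumptions are acceptable for these measurements:

```r
# Rebuild the question's data frame compactly: 20 measurements per tool,
# in the same order as the original dput() output
df <- data.frame(
  MeasurementTool = factor(rep(c("Wedge", "Weighted Wedge", "ICES Gauge"), each = 20)),
  MeshOpening = c(
    157, 155, 160, 160, 161, 160, 158, 161, 162, 162,
    160, 163, 158, 160, 161, 165, 164, 158, 164, 163,
    159, 158, 165, 164, 159, 160, 158, 159, 160, 163,
    159, 160, 158, 158, 158, 162, 160, 159, 159, 159,
    159, 159, 159, 155, 156, 156, 158, 160, 156, 155,
    160, 160, 157, 159, 158, 155, 158, 157, 156, 158
  )
)

# One-way ANOVA: does mean mesh opening differ between tools?
fit <- aov(MeshOpening ~ MeasurementTool, data = df)
summary(fit)

# If the F-test suggests a difference, Tukey's HSD compares the tools pairwise
TukeyHSD(fit)
```

Note that aov() takes a formula, so the column names are again written without quotes.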

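Arguably, group_by ought to throw an error, or at least a warning, when an argument is a string constant rather than an expression that refers to an existing column. As it stands, it silently treats the string as a new constant-valued grouping column, which is why all 60 rows ended up in one group.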
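In the meantime, a guard is easy to add by hand. The sketch below first shows the silent behaviour, then wraps group_by in a hypothetical helper, `group_by_checked()` (my own name, not a dplyr function), that stops when a name is not a column. The data frame `toy` is made up for illustration:

```r
library(dplyr)

# Small made-up data frame, for illustration only
toy <- data.frame(
  tool  = c("A", "A", "B", "B"),
  value = c(1, 3, 10, 20)
)

# A quoted string is treated as a new constant column, so every row
# falls into one group and nothing is reported
n_quoted <- n_groups(group_by(toy, "tool"))  # 1 group

# A bare name refers to the existing column
n_bare <- n_groups(group_by(toy, tool))      # 2 groups

# Hypothetical helper: takes column names as strings and fails loudly
# if any of them is missing from the data
group_by_checked <- function(data, cols) {
  missing_cols <- setdiff(cols, names(data))
  if (length(missing_cols) > 0) {
    stop("Not a column: ", paste(missing_cols, collapse = ", "))
  }
  group_by(data, across(all_of(cols)))
}

grouped <- group_by_checked(toy, "tool")
# group_by_checked(toy, "Tool") stops with an error, because "Tool" is not a column
```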
