My data is formatted in R as follows:
Group Light Dark
1 Dermaptera 29 29
2 Oniscidea 72 54
3 Diptera 54 39
4 Lepidoptera 17 7
5 Formicidae 14 6
6 Hemiptera 3 9
7 Diplopoda 8 17
I am certain that this data is not normally distributed, as it is count data, and histograms show that it is clearly non-normal, for example using hist(dataframe$Light). When I try shapiro.test(dataframe), I get the error "is.numeric(x) is not TRUE", and when I instead run shapiro.test(dataframe$Light), or a call that passes both dataframe$Light and dataframe$Dark, the p-value suggests that the data are normally distributed.
How should I instead format this data when putting it into R so that I can test for normality and subsequently test for statistically significant relationships?
CodePudding user response:
library(dplyr)
library(stats)
Suppose your data is in a data frame like this (random example values, not the counts from the question):
dd <- data.frame(Group = LETTERS[1:10], Light = sample(1:10), Dark = sample(1:10))
Shapiro test p-value for each specified column:
# funs() and summarise_at() are deprecated/superseded in current dplyr; across() replaces them
pvalues <- dd %>%
  summarise(across(c(Light, Dark), ~ shapiro.test(.x)$p.value))
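You can also compute the Shapiro test p-value per group, but only if every group has at least 3 observations, because shapiro.test() requires between 3 and 5000 values. With one row per group, as in dd above, this call stops with a sample-size error:
# requires >= 3 rows in each Group
pvalues_by_group <- dd %>%
  group_by(Group) %>%
  summarise(across(c(Light, Dark), ~ shapiro.test(.x)$p.value))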
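Applied to the counts from the question, a minimal sketch might look like the following. shapiro.test() expects a single numeric vector, which is why passing the whole data frame fails the is.numeric(x) check. With only 7 values per column, Shapiro-Wilk also has little power to detect non-normality, so a large p-value is weak evidence of normality. The chi-squared test of independence at the end is an assumption about what "relationship" means here (whether the Light/Dark split differs between groups), not something the question states.

```r
# Counts copied from the question
dataframe <- data.frame(
  Group = c("Dermaptera", "Oniscidea", "Diptera", "Lepidoptera",
            "Formicidae", "Hemiptera", "Diplopoda"),
  Light = c(29, 72, 54, 17, 14, 3, 8),
  Dark  = c(29, 54, 39, 7, 6, 9, 17)
)

# shapiro.test() takes one numeric vector, so test each column separately
shapiro_p <- sapply(dataframe[, c("Light", "Dark")],
                    function(x) shapiro.test(x)$p.value)
print(shapiro_p)

# Assumption: the question of interest is whether the Light/Dark split
# depends on Group. Build a Group x condition matrix of counts for that.
counts <- as.matrix(dataframe[, c("Light", "Dark")])
rownames(counts) <- dataframe$Group

# Chi-squared test of independence on the contingency table
chi <- chisq.test(counts)
print(chi)
```

For count data arranged as a contingency table like this, normality of the raw counts is not a prerequisite for the chi-squared test; what matters is that the expected cell counts (available as chi$expected) are not too small.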