Here is my reproducible sample:
library(dplyr)
library(ggplot2)

set.seed(42)
n <- 1000
dat <- data.frame(Participant = 1:20,
                  Environment = rep(LETTERS[1:2], n/2),
                  Condition = rep(LETTERS[25:26], n/2),
                  Gate = sample(1:5, n, replace = TRUE),
                  Block = sample(1:2, n, replace = TRUE),
                  Sound = rep(LETTERS[3:4], n/2),
                  Correct = sample(0:1, n, replace = TRUE)
)
From this dataset, I am trying to run the analysis at the participant level rather than the item level. I am trying to achieve this by transforming the dataset like this:
Participant_Data <- dat %>%
  group_by(Condition, Gate, Sound, Participant) %>%
  summarize(Accuracy = mean(Correct),
            se = sd(Correct)/sqrt(length(Correct)))
Then I am making a graph with this new dataset:
Participant_Data %>%
  group_by(Condition, Gate, Sound) %>%
  summarize(Proportion_Correct = mean(Accuracy),
            standarderror = sd(Proportion_Correct)/sqrt(length(Proportion_Correct))) %>%
  ggplot(aes(x = Gate, y = Proportion_Correct, color = Sound, group = Sound)) +
  geom_line() +
  geom_errorbar(aes(ymin = Proportion_Correct - standarderror, ymax = Proportion_Correct + standarderror), color = "Black", size = .15, width = .3) +
  geom_point(size = 2) +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(~Condition) +
  theme_minimal() +
  scale_color_brewer(palette = "Set1")
But as you will see, my error values are coming up as NA, and therefore are not showing up on my graph. Let me know if you can see what I am not seeing, and thanks in advance!
Answer:
As pointed out by @MrFlick in the comments, the issue is sd(Proportion_Correct). Inside summarize(), Proportion_Correct has already been reduced to a single mean per group, so you are computing the standard deviation of a vector of length 1, which returns NA.
Instead, I would suggest computing the standard error as sd(Accuracy, na.rm = TRUE)/sqrt(n()), which is the natural way to compute it given that Proportion_Correct is computed as mean(Accuracy).
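To see the NA behaviour in isolation, here is a minimal sketch in base R. The accuracy values are made up for illustration and are not taken from the question's data.

```r
# Hypothetical per-participant accuracies for one Condition/Gate/Sound group
accuracy <- c(0.4, 0.6, 0.8)

# What the original code did: take the mean first, then sd() of that single number.
# sd() divides by n - 1, which is 0 for a length-1 vector, so R returns NA.
single_mean <- mean(accuracy)
sd_single <- sd(single_mean)

# What the fix does: sd() over the per-participant values themselves
se <- sd(accuracy) / sqrt(length(accuracy))

print(sd_single)
print(se)
```

The first value printed is NA, while the second is an ordinary positive number.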
library(dplyr)
library(ggplot2)
Participant_Data1 <- Participant_Data %>%
  group_by(Condition, Gate, Sound) %>%
  summarize(Proportion_Correct = mean(Accuracy),
            standarderror = sd(Accuracy, na.rm = TRUE)/sqrt(n()))
#> `summarise()` has grouped output by 'Condition', 'Gate'. You can override using
#> the `.groups` argument.
ggplot(Participant_Data1, aes(x = Gate, y = Proportion_Correct, color = Sound, group = Sound)) +
  geom_line() +
  geom_errorbar(aes(ymin = Proportion_Correct - standarderror, ymax = Proportion_Correct + standarderror), color = "Black", size = .15, width = .3) +
  geom_point(size = 2) +
  scale_y_continuous(labels = scales::percent) +
  facet_wrap(~Condition) +
  theme_minimal() +
  scale_color_brewer(palette = "Set1")
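If you want to confirm the fix before plotting, one possible check is sketched below. It rebuilds the sample data from the question and verifies that no standard error is NA. The object name se_check is my own, and adding .groups = "drop" is an optional choice that also silences the summarise() grouping message.

```r
library(dplyr)

# Rebuild the sample data from the question
set.seed(42)
n <- 1000
dat <- data.frame(Participant = 1:20,
                  Environment = rep(LETTERS[1:2], n/2),
                  Condition = rep(LETTERS[25:26], n/2),
                  Gate = sample(1:5, n, replace = TRUE),
                  Block = sample(1:2, n, replace = TRUE),
                  Sound = rep(LETTERS[3:4], n/2),
                  Correct = sample(0:1, n, replace = TRUE))

# Step 1: one accuracy value per participant within each cell
# Step 2: mean and standard error across participants
se_check <- dat %>%
  group_by(Condition, Gate, Sound, Participant) %>%
  summarize(Accuracy = mean(Correct), .groups = "drop") %>%
  group_by(Condition, Gate, Sound) %>%
  summarize(Proportion_Correct = mean(Accuracy),
            standarderror = sd(Accuracy, na.rm = TRUE)/sqrt(n()),
            .groups = "drop")

# TRUE would mean some error bars are still missing
anyNA(se_check$standarderror)
```

Dropping the groups at the end also leaves se_check ungrouped, which avoids surprises if you pipe it into further dplyr steps.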