This is a question about ggplot. The context is data from bootstrapped resamples to be compared with a hypothetical distribution. After box-plotting the bootstrapped data, I would like to overlay a line of expected proportions. The ggplot code below produces:
Error: Aesthetics must be either length 1 or the same as the data (20): y
library(data.table)
library(ggplot2)
boot1 <- data.table(digit = 1, prop = runif(10, 0.25, 0.35))
boot2 <- data.table(digit = 2, prop = runif(10, 0.12, 0.25))
boots <- rbindlist(list(boot1, boot2))
ggplot(boots, aes(x = as.factor(digit), y = prop)) +
  geom_boxplot() +
  # this layer triggers the error: 2 y values against 20 rows of data
  geom_line(aes(x = as.factor(digit), y = c(0.3, 0.17)))
In a realistic example, the y values of the line plot would use the values produced by a non-linear function.
Thank you for your attention.
CodePudding user response:
For your example you can try geom_segment(), because you don't have a continuous line but rather separate segments. Each factor level is encoded as 1, 2, 3, ... on the x-axis, so with 3 categories you would create a data frame with digit = 1:3. For your two categories:
mean_data <- data.frame(digit = 1:2, prop = c(0.3, 0.17))
ggplot(boots, aes(x = factor(digit), y = prop)) +
  geom_boxplot() +
  # short horizontal segment centred on each category's x position
  geom_segment(data = mean_data,
               aes(x = digit - 0.3, xend = digit + 0.3, y = prop, yend = prop),
               col = "blue")
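For the realistic case mentioned in the question, where the expected values come from a non-linear function, the same idea extends to any number of categories. The following is only a sketch: `expected_prop()` is a hypothetical stand-in (a Benford-style curve chosen purely for illustration), and the third category and the seed are assumptions added to make the example self-contained.

```r
library(data.table)
library(ggplot2)

set.seed(1)  # assumed seed, only for reproducible toy data
boots <- rbindlist(list(
  data.table(digit = 1, prop = runif(10, 0.25, 0.35)),
  data.table(digit = 2, prop = runif(10, 0.12, 0.25)),
  data.table(digit = 3, prop = runif(10, 0.08, 0.17))  # extra category (assumed)
))

# Hypothetical non-linear expectation; replace with your own function.
expected_prop <- function(d) log10(1 + 1 / d)

# One row per category, so this layer's aesthetics match its own data.
mean_data <- data.frame(digit = 1:3)
mean_data$prop <- expected_prop(mean_data$digit)

p <- ggplot(boots, aes(x = factor(digit), y = prop)) +
  geom_boxplot() +
  geom_segment(data = mean_data,
               aes(x = digit - 0.3, xend = digit + 0.3, y = prop, yend = prop),
               colour = "blue")
print(p)
```

Because the expected values live in their own data frame, adding categories only means extending `digit`; the plotting code stays the same.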
CodePudding user response:
As another spin on the segment approach, I tried geom_curve, with its endpoints placed at my x-axis category positions:
geom_curve(x = 1, y = 0.3, xend = 2, yend = 0.17, curvature = 0.1, color = 2)
The result is a single, gently curved line running from the first category's expected value to the second's.
It's not elegant, particularly with multiple categories. Thank you @StupidWolf for the assistance.
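If a connected line across many categories is what you are after, one possible alternative is to give geom_line its own data frame and set group = 1. The original error occurs because aes(y = c(0.3, 0.17)) supplies 2 values to a layer that inherits the 20-row boots data; a separate data frame with one row per category avoids that mismatch. This is a minimal sketch: the `expected` data frame, its third value, and the seed are assumptions made for illustration.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only for reproducible toy data
boots <- data.frame(
  digit = rep(1:3, each = 10),
  prop  = c(runif(10, 0.25, 0.35), runif(10, 0.12, 0.25), runif(10, 0.08, 0.17))
)

# Illustrative expected proportions, one row per category (values assumed).
expected <- data.frame(digit = 1:3, prop = c(0.30, 0.17, 0.12))

p <- ggplot(boots, aes(x = factor(digit), y = prop)) +
  geom_boxplot() +
  # group = 1 puts all rows in one group so the line joins the discrete categories
  geom_line(data = expected, aes(group = 1), colour = "red") +
  geom_point(data = expected, colour = "red")
print(p)
```

Without group = 1, ggplot2 treats each factor level as its own group and has nothing to connect, so no line is drawn.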