I am trying to create a boxplot using ggplot(data, aes(x, y)) + geom_boxplot(), but I have problems right at the beginning. My data looks similar to this:
S1 S2 S3 S4
1 4.28 4.34 4.39 4.29
2 4.13 4.11 4.62 4.91
3 4.39 4.47 4.45 4.98
4 4.09 4.98 4.18 4.01
5 4.93 4.11 4.21 4.31
6 4.26 4.15 4.62 4.04
7 4.20 4.89 4.99 4.32
8 4.19 4.16 4.76 4.89
9 4.13 4.64 4.27 4.94
10 4.20 4.87 4.47 4.05
11 4.22 4.18 4.57 4.75
12 4.23 4.08 4.27 4.41
13 4.26 4.25 4.23 4.39
14 4.23 4.01 4.26 4.17
15 4.46 4.19 4.92 4.16
16 4.27 4.15 4.50 4.85
17 4.06 4.42 4.57 4.37
18 4.14 4.36 4.47 4.47
19 4.43 4.21 4.11 4.67
20 4.29 4.15 4.56 4.26
The boxplot should have the five different species on the x-axis and the values on the y-axis.
Unfortunately, I get an error every time I define aes(x, y) with species (species <- colnames(data)) as x: ! Aesthetics must be either length 1 or the same as the data (20): x.
Furthermore, I don't know what exactly to use as y and how to code it, since I am new to R.
I also reshaped the data into a long format using pivot_longer from the tidyr package, as suggested in another question. However, this did not result in a boxplot either.
If I use the basic boxplot(), the outcome is fine, but I need to enhance the boxplot visually with ggplot.
I am very grateful to everyone who can help!
CodePudding user response:
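You are on the right track. If your data frame is called data, you can reshape it to long format and pipe the result straight into ggplot. Note that the ggplot layers are joined with +, not with the pipe: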
library(tidyverse)

data %>%
  # one row per measurement: column names go to Species, numbers go to value
  pivot_longer(everything(), names_to = 'Species') %>%
  ggplot(aes(Species, value)) +
  geom_boxplot()
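To see why the reshaping step matters, it can help to look at the long data before plotting. The snippet below is only a small sketch: it rebuilds the first three rows of the example data so that it runs on its own, and the name long is just an illustrative choice. It shows that colnames(data) has one entry per column rather than one per row, which is what the "Aesthetics must be either length 1 or the same as the data" error complains about, and that after pivot_longer() every measurement has its own row with a matching Species label.

```r
library(tidyr)

# First three rows of the example data, repeated here so the snippet is self-contained
data <- data.frame(
  S1 = c(4.28, 4.13, 4.39),
  S2 = c(4.34, 4.11, 4.47),
  S3 = c(4.39, 4.62, 4.45),
  S4 = c(4.29, 4.91, 4.98)
)

# The column names form a vector of length 4, but ggplot needs one x value per row
length(colnames(data))  # 4
nrow(data)              # 3

# Long format: one row per measurement, column name stored in Species,
# number stored in value (the default name used by pivot_longer)
long <- pivot_longer(data, everything(), names_to = "Species")

dim(long)    # 12 rows, 2 columns
head(long, 4)
```

With the data in this shape, Species and value are ordinary columns of equal length, so aes(Species, value) works without any extra vector such as colnames(data).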
Or, if you want to get fancy,
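data %>%
  pivot_longer(everything(), names_to = 'Species') %>%
  ggplot(aes(Species, value, color = Species)) +
  # raw points, jittered horizontally so they do not overlap
  geom_point(position = position_jitter(width = 0.1), alpha = 0.5) +
  # transparent boxes on top; outliers hidden because the points are already drawn
  geom_boxplot(fill = NA, outlier.color = NA) +
  theme_minimal(base_size = 20) +
  scale_color_brewer(palette = 'Set1')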
Created on 2022-12-15 with reprex v2.0.2
Data in reproducible format
data <- structure(list(S1 = c(4.28, 4.13, 4.39, 4.09, 4.93, 4.26, 4.2,
4.19, 4.13, 4.2, 4.22, 4.23, 4.26, 4.23, 4.46, 4.27, 4.06, 4.14,
4.43, 4.29), S2 = c(4.34, 4.11, 4.47, 4.98, 4.11, 4.15, 4.89,
4.16, 4.64, 4.87, 4.18, 4.08, 4.25, 4.01, 4.19, 4.15, 4.42, 4.36,
4.21, 4.15), S3 = c(4.39, 4.62, 4.45, 4.18, 4.21, 4.62, 4.99,
4.76, 4.27, 4.47, 4.57, 4.27, 4.23, 4.26, 4.92, 4.5, 4.57, 4.47,
4.11, 4.56), S4 = c(4.29, 4.91, 4.98, 4.01, 4.31, 4.04, 4.32,
4.89, 4.94, 4.05, 4.75, 4.41, 4.39, 4.17, 4.16, 4.85, 4.37, 4.47,
4.67, 4.26)), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16", "17", "18", "19", "20"))