I have a data frame in which five probes are the variables: cg02823866, cg13474877, cg14305799, cg15837913 and cg19724470. I want to create a boxplot that puts cg02823866 and cg14305799 into a group called 'GeneBody' and cg13474877, cg14305799 and cg19724470 into a group called 'Promoter'. I then want to colour-code the boxplots by probe name. I can't figure out how to assign those variables to groups for the plot.
I created an ungrouped boxplot of the five probes and it looked like this.
I want the labels 'Promoter' and 'GeneBody' on the x axis. Above the 'GeneBody' label there should be the two boxplots for the cg02823866 and cg14305799 probes, and above the 'Promoter' label the boxplots for cg13474877, cg14305799 and cg19724470. I then want each boxplot colour-coded by probe. My data frame that I imported into RStudio looks like this: https://i.stack.imgur.com/r4gEC.png
CodePudding user response:
Assuming you have some data with variable names Beta (your y axis), Probe (your current x axis), and group (either "GeneBody" or "Promoter"), you can do something like the following:
library(ggplot2)
# x = group puts the boxes under "GeneBody"/"Promoter"; fill = Probe colours each probe
ggplot(data, aes(x = group, y = Beta, fill = Probe)) +
  geom_boxplot()
If you provide a reproducible set of data, I can probably do better.
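In the meantime, here is a minimal sketch of how the group column could be built from the probe names, using made-up Beta values. The probe-to-region lookup is an assumption: the question lists cg14305799 in both groups and never assigns cg15837913, so edit probe_group to match your own annotation.

```r
library(ggplot2)

# ASSUMPTION: probe-to-region lookup. The question is ambiguous about
# cg14305799 and cg15837913, so adjust this to your real annotation.
probe_group <- c(
  cg02823866 = "GeneBody",
  cg14305799 = "GeneBody",
  cg13474877 = "Promoter",
  cg15837913 = "Promoter",
  cg19724470 = "Promoter"
)

# Simulated long-format data: one row per sample per probe (values are made up).
set.seed(1)
data <- data.frame(
  Probe = rep(names(probe_group), each = 20),
  Beta  = runif(100),
  stringsAsFactors = FALSE
)

# Look up each row's region by probe name (as.character guards against factors).
data$group <- unname(probe_group[as.character(data$Probe)])

p <- ggplot(data, aes(x = group, y = Beta, fill = Probe)) +
  geom_boxplot()
print(p)
```

A named vector is used as the lookup so the mapping lives in one place and is easy to change.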
CodePudding user response:
Adding to Ben's answer, here is the traditional iris data frame example, which you can easily load with data(iris):
library(ggplot2)
# group = Species draws one box per species at a single x position
ggplot(iris) +
  aes(x = "", y = Sepal.Length, group = Species) +
  geom_boxplot(shape = "circle", fill = "#112446") +
  theme_minimal()
So you just need a column that indicates group membership. It of course gets more difficult with uncleaned data, where you might need to reshape the data first, etc. But those are follow-up questions, I guess.
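As a rough sketch of that reshaping step, assuming the data frame is wide (one column per probe, one row per sample), base R's stack() can turn it into long format. The data frame wide and its values are made up for illustration, and only three of the probes are shown.

```r
# Hypothetical wide data: one column per probe, one row per sample (values made up).
set.seed(1)
wide <- data.frame(
  cg02823866 = runif(10),
  cg13474877 = runif(10),
  cg14305799 = runif(10)
)

# stack() returns two columns: `values` (the numbers) and `ind` (the source column name).
long <- stack(wide)
names(long) <- c("Beta", "Probe")
head(long)
```

After this, long has one row per measurement with Beta and Probe columns, which is the shape ggplot expects for a grouped boxplot.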
Also, if you want to make your life easier, use the esquisse RStudio add-in.