Conflict between labelling groups with their sample size and ordering the groups in ggplot2
Dears,
Here is a sample of my dataset, produced with dput(IP[1:10, ]):
structure(list(accession = c("AT5G23310", "ATCG00740", "AT4G20130",
"AT5G51100", "AT3G06730", "AT2G28000", "AT2G24020", "AT1G73990",
"AT5G20720", "AT5G45390"), name = c("FSD3 / PAP4", "RPOA", "PTAC14 / PAP7",
"FSD2 / PAP9", "CITRX / PAP10", "CPN60A1", "STIC2", "SPPA", "CPN20",
"CLPP4"), description = c("Fe superoxide dismutase 3", "RNA polymerase subunit alpha",
"plastid transcriptionally active 14", "Fe superoxide dismutase 2",
"Thioredoxin z", "chaperonin-60alpha", "Uncharacterised BCR, YbaB family COG0718",
"signal peptide peptidase", "chaperonin 20", "CLP protease P4"
), class = c("int_D", "int_D", "int_D", "int_D", "int_D", "int_D",
"int_D", "int_D", "int_D", "int_D"), FC = c(10.8808319521963,
10.8048308965242, 10.4457101811235, 10.399581594615, 9.76710767914034,
8.40981567320428, 8.09336699899536, 7.39700419044091, 7.36589576056924,
7.24457380682909), iBAQ = c(0.12855586361859, 0.595067840872386,
0.403067430310179, 0.371518817592689, 0.584834508323074, 0.0271550563144128,
0.0088451761756162, 0.00151518236884624, 0.0104882385666527,
0.00327673100220722), thylakoid = c("thylakoid", "thylakoid",
"thylakoid", "thylakoid", "thylakoid", "thylakoid", "thylakoid",
"thylakoid", "thylakoid", "thylakoid")), row.names = c(NA, -10L
), class = c("tbl_df", "tbl", "data.frame"))
I am trying to generate a violin plot with an overlaid boxplot for grouped values. I can label each group with its number of values (Script 1), but then the order of the groups is not respected. The call
mutate(class = fct_relevel(class,"int_D", "prox_D","int_L","prox_L")) %>%
has no effect in that script:
Script 1: the order of the groups is not respected, but each class is labelled with its number of values
# packages used below (theme_ipsum() comes from hrbrthemes)
library(tidyverse)
library(hrbrthemes)
# sample size per class
sample_size = IP %>% group_by(class) %>% summarize(num=n())
IP %>%
  left_join(sample_size, by = "class") %>%
  mutate(class = fct_relevel(class,"int_D", "prox_D","int_L","prox_L")) %>%
  # overwrite class with a label such as "int_D\nn=10"
  mutate(class = paste0(class, "\n", "n=", num)) %>%
  ggplot( aes(x=class, y=FC, fill = class)) +
  geom_violin(trim = FALSE, width=0.5, color="grey", size=0.1) +
  geom_boxplot(width=0.1, fill="white", alpha=1) +
  scale_fill_manual(values=c("gold3","gold3","green4","green4")) +
  ylim(0,15) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("thylakoid") +
  xlab("")
If I remove the second mutate call (the one with paste0), the order of the groups is respected, but I lose the sample sizes in the labels.
Script 2: the order of the groups is respected, but the sample sizes are no longer shown
# sample size per class
sample_size = IP %>% group_by(class) %>% summarize(num=n())
IP %>%
  left_join(sample_size, by = "class") %>%
  mutate(class = fct_relevel(class,"int_D", "prox_D","int_L","prox_L")) %>%
  ggplot( aes(x=class, y=FC, fill = class)) +
  geom_violin(trim = FALSE, width=0.5, color="grey", size=0.1) +
  geom_boxplot(width=0.1, fill="white", alpha=1) +
  scale_fill_manual(values=c("gold3","gold3","green4","green4")) +
  ylim(0,15) +
  theme_ipsum() +
  theme(legend.position="none", plot.title = element_text(size=11)) +
  ggtitle("thylakoid") +
  xlab("")
Is there a way to get both the sample sizes in the labels and the right group order?
All the best!
CodePudding user response:
The problem is that after the paste0 call, class is no longer a factor: it becomes a character vector, so ggplot falls back to alphabetical order. You could mutate class to append the number, store the result, get the new levels with unique(IP$class), sort them however you want, and convert class back to a factor using these new levels.
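The approach described above could be sketched roughly as follows. This is only an illustration under assumptions: toy is a hypothetical stand-in for IP, the column name label and the vector wanted are invented for the example, and the labels are stored in a separate column rather than overwriting class.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data standing in for IP: four classes of unequal size
toy <- tibble(
  class = c("prox_L", "int_D", "int_D", "int_L", "prox_D", "prox_D", "prox_D"),
  FC    = c(1.2, 10.9, 9.8, 3.4, 7.4, 7.2, 8.1)
)

# Desired order of the groups on the x axis
wanted <- c("int_D", "prox_D", "int_L", "prox_L")

labelled <- toy %>%
  add_count(class, name = "num") %>%               # per-class sample size
  mutate(class = factor(class, levels = wanted)) %>%
  arrange(class) %>%                               # rows now follow the factor order
  # build the labels, then freeze their order of appearance as factor levels
  mutate(label = fct_inorder(paste0(class, "\nn=", num)))

levels(labelled$label)
```

Mapping x = label instead of x = class in the ggplot call should then keep both the requested order and the sample sizes on the axis.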
CodePudding user response:
In addition to @Josep Puyo's answer, note that you can supply calculated axis labels directly. For example, add this axis declaration to Script 2:
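## some ggplot instructions +
scale_x_discrete(
  # labels are matched to the axis breaks by position, so IP$class must
  # already be a factor with the same level order as the plotted one
  labels = paste(
    names(table(IP$class)),  # same order as the counts below
    table(IP$class),
    sep = '\n'
  )
) + ## some more ggplot instructions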
(In this context, 'class' is a somewhat unfortunate variable name, as it collides with the base R function class().)
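A variant of the scale_x_discrete idea that does not rely on positional matching is to pass a function as labels, since ggplot2 then calls it with the actual axis breaks. The sketch below assumes a hypothetical toy data frame in place of IP; count_label and counts are names invented for the example.

```r
library(ggplot2)

# Hypothetical toy data standing in for IP, with class already in the wanted order
toy <- data.frame(
  class = factor(
    c("prox_L", "int_D", "int_D", "int_L", "prox_D", "prox_D", "prox_D"),
    levels = c("int_D", "prox_D", "int_L", "prox_L")
  ),
  FC = c(1.2, 10.9, 9.8, 3.4, 7.4, 7.2, 8.1)
)

# Number of values per class, named by class
counts <- table(toy$class)

# Label function: receives the axis breaks (level names) and looks up
# each count by name, so the order of the breaks does not matter
count_label <- function(breaks) {
  paste0(breaks, "\nn=", as.integer(counts[breaks]))
}

p <- ggplot(toy, aes(x = class, y = FC)) +
  geom_boxplot() +
  scale_x_discrete(labels = count_label)
```

Because the factor itself is never overwritten, the level order set with fct_relevel stays intact and only the displayed text changes.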