I have something weird happening with a heatmap in ggplot2. I melted a confusion matrix so that I could plot it with geom_tile. The values drawn on the tiles with geom_text are correct. However, with or without the default color scale, the same value gets different colors. For instance, see the value 0.04 in the image below (row Urban). I don't understand what's happening here.
What am I doing wrong?
Thank you in advance!
The first 8 rows of the dataframe:
> head(zoo_cm.df, 8)
           Prediction      Reference Freq
1      Castle/Castrum Castle/Castrum 0.75
2           Religious Castle/Castrum 0.00
3               Rural Castle/Castrum 0.05
4   Rural site, villa Castle/Castrum 0.00
5               Urban Castle/Castrum 0.04
6 Urban, amphitheater Castle/Castrum 0.00
7      Castle/Castrum      Religious 0.00
8           Religious      Religious 1.00
dput of my dataframe:
> dput(zoo_cm.df)
structure(list(Prediction = structure(c(1L, 2L, 3L, 4L, 5L, 6L,
1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L,
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L), levels = c("Castle/Castrum",
"Religious", "Rural", "Rural site, villa", "Urban", "Urban, amphitheater"
), class = "factor"), Reference = structure(c(1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L,
4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L
), levels = c("Castle/Castrum", "Religious", "Rural", "Rural site, villa",
"Urban", "Urban, amphitheater"), class = "factor"), Freq = c(0.75,
0, 0.05, 0, 0.04, 0, 0, 1, 0.05, 0, 0.03, 0, 0.17, 0, 0.74, 0,
0.15, 0, 0, 0, 0.05, 0.33, 0.04, 0, 0.08, 0, 0.11, 0.67, 0.72,
0, 0, 0, 0, 0, 0.01, 1)), row.names = c(NA, -36L), class = "data.frame")
Plot code:
zoo_test_cm.plot <- ggplot(zoo_cm.df, aes(x=Prediction, y=Reference)) +
  geom_tile(aes(fill=Freq)) +
  geom_text(aes(Reference, Prediction, label = Freq), color = "black", size = 4) +
  scale_fill_gradient(low = "#FFFFFF", high = "#A6D1E6") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1,
                                   size = 10, hjust = 1), legend.position = "right") +
  coord_fixed()
Answer:
Your aesthetics are swapped. geom_tile is based on aes(x=Prediction, y=Reference) (from the initial ggplot expression), but geom_text is based on aes(x=Reference, y=Prediction) from your explicit mapping.
Either swap them in the geom_text call or, better (imo), just remove them and let geom_text inherit x and y from the global mapping:
ggplot(zoo_cm.df, aes(x=Prediction, y=Reference)) +
  geom_tile(aes(fill=Freq)) +
  geom_text(aes(label = Freq), color = "black", size = 4) +
  scale_fill_gradient(low = "#FFFFFF", high = "#A6D1E6") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1,
                                   size = 10, hjust = 1), legend.position = "right") +
  coord_fixed()
In my opinion, this is one strength of defining a "global" mapping in ggplot() and reusing it in any/all subsequent geoms. Explicitly overriding aesthetics in a follow-on geom is best justified when you are (1) intentionally using a different column, or (2) using a different dataset where same-intentioned columns have different names. Even in those cases, it's often too easy to get it wrong, as in your example.
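If it helps to see why the swap produces "same label, different color", here is a minimal base-R sketch that mimics what the two layers do with their positions. The 2x2 data frame cm and its values are made up for illustration (they are not your data), and the tiles/labels frames only imitate the x/y placement; they are not ggplot2 internals.

# Toy confusion matrix in long form (hypothetical values)
cm <- data.frame(
  Prediction = c("A", "B", "A", "B"),
  Reference  = c("A", "A", "B", "B"),
  Freq       = c(0.9, 0.1, 0.3, 0.7)
)

# geom_tile inherits the global mapping: x = Prediction, y = Reference
tiles <- data.frame(x = cm$Prediction, y = cm$Reference, fill_value = cm$Freq)

# The original geom_text call used x = Reference, y = Prediction
labels <- data.frame(x = cm$Reference, y = cm$Prediction, label = cm$Freq)

# Join on position to see which fill value sits underneath each label
overlay <- merge(tiles, labels, by = c("x", "y"))

# Cells where the printed label differs from the value that colored the tile
mismatched <- overlay[overlay$fill_value != overlay$label, ]
print(mismatched)

Only off-diagonal positions can end up mismatched, because swapping x and y leaves the diagonal where it is. That is why the diagonal of the heatmap looks fine while a label such as 0.04 can sit on a tile colored by the transposed cell's value.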