Getting different colors for the same value in ggplot2 heatmap

Time:12-13

I have something weird happening with a heatmap in ggplot2. I melted a confusion matrix in order to plot it with ggplot2 using geom_tile. The values drawn on the tiles with geom_text are correct. However, with or without the default color scale, I get different colors for the same value. For instance, see the value 0.04 in the image below (row Urban). I don't understand what's happening here.

What am I doing wrong?

Thank you in advance!

The first 8 rows of the dataframe:

> head(zoo_cm.df, 8)
           Prediction      Reference Freq
1      Castle/Castrum Castle/Castrum 0.75
2           Religious Castle/Castrum 0.00
3               Rural Castle/Castrum 0.05
4   Rural site, villa Castle/Castrum 0.00
5               Urban Castle/Castrum 0.04
6 Urban, amphitheater Castle/Castrum 0.00
7      Castle/Castrum      Religious 0.00
8           Religious      Religious 1.00

dput of my dataframe:

> dput(zoo_cm.df)
structure(list(Prediction = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 
1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L), levels = c("Castle/Castrum", 
"Religious", "Rural", "Rural site, villa", "Urban", "Urban, amphitheater"
), class = "factor"), Reference = structure(c(1L, 1L, 1L, 1L, 
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 
4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L
), levels = c("Castle/Castrum", "Religious", "Rural", "Rural site, villa", 
"Urban", "Urban, amphitheater"), class = "factor"), Freq = c(0.75, 
0, 0.05, 0, 0.04, 0, 0, 1, 0.05, 0, 0.03, 0, 0.17, 0, 0.74, 0, 
0.15, 0, 0, 0, 0.05, 0.33, 0.04, 0, 0.08, 0, 0.11, 0.67, 0.72, 
0, 0, 0, 0, 0, 0.01, 1)), row.names = c(NA, -36L), class = "data.frame")

Plot code:

library(ggplot2)

zoo_test_cm.plot <- ggplot(zoo_cm.df, aes(x = Prediction, y = Reference)) +
  geom_tile(aes(fill = Freq)) +
  geom_text(aes(Reference, Prediction, label = Freq), color = "black", size = 4) +
  scale_fill_gradient(low = "#FFFFFF", high = "#A6D1E6") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1,
    size = 10, hjust = 1), legend.position = "right") +
  coord_fixed()

Plot output: [image]

Answer:

Your aesthetics are swapped.

  • geom_tile is based on aes(x=Prediction, y=Reference) (from the initial ggplot expression); but
  • geom_text is based on aes(x=Reference, y=Prediction) from your explicit mapping.
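To see why the same label can land on differently colored tiles, here is a quick base-R check (a sketch, not part of the original plot code). It rebuilds the data frame from the dput() output in the question, then for the plot row y = "Urban" compares the label that the swapped geom_text draws at each x position with the value geom_tile uses for the fill at that same position. The names lv, m and cmp are only for illustration.

```r
# Rebuild the data frame from the dput() output in the question.
lv <- c("Castle/Castrum", "Religious", "Rural", "Rural site, villa",
        "Urban", "Urban, amphitheater")
zoo_cm.df <- data.frame(
  Prediction = factor(rep(lv, times = 6), levels = lv),
  Reference  = factor(rep(lv, each = 6), levels = lv),
  Freq = c(0.75, 0, 0.05, 0, 0.04, 0, 0, 1, 0.05, 0, 0.03, 0, 0.17, 0, 0.74, 0,
           0.15, 0, 0, 0, 0.05, 0.33, 0.04, 0, 0.08, 0, 0.11, 0.67, 0.72,
           0, 0, 0, 0, 0, 0.01, 1)
)

# Lookup table indexed as m[Prediction, Reference]; each cell holds one Freq.
m <- xtabs(Freq ~ Prediction + Reference, data = zoo_cm.df)

# For the plot row y = "Urban":
#   geom_text (x = Reference, y = Prediction) draws m["Urban", x] as the label
#   geom_tile (x = Prediction, y = Reference) fills with m[x, "Urban"]
cmp <- data.frame(
  x     = lv,
  label = as.numeric(m["Urban", ]),
  fill  = as.numeric(m[, "Urban"])
)

# Positions in that row where the label reads 0.04
print(cmp[cmp$label == 0.04, ])
```

Both 0.04 labels in that row sit on tiles whose fill comes from a different cell (0.08 and 0.67), which is why the colors differ.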

Either swap them in geom_text or, better (imo), just remove them and let geom_text inherit x and y from the global mapping:

ggplot(zoo_cm.df, aes(x = Prediction, y = Reference)) +
  geom_tile(aes(fill = Freq)) +
  # x and y are inherited from ggplot(); only the label is mapped here
  geom_text(aes(label = Freq), color = "black", size = 4) +
  scale_fill_gradient(low = "#FFFFFF", high = "#A6D1E6") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1,
    size = 10, hjust = 1), legend.position = "right") +
  coord_fixed()

[image: fixed heatmap]

In my opinion, this is one strength of defining a "global" mapping in ggplot() and reusing it in any/all subsequent geoms. Explicitly overriding aesthetics in a follow-on geom is best justified when you are (1) intentionally using a different column, or (2) using a different dataset where same-intentioned columns have different names. Even in those cases, it's easy to get it wrong, as in your example.
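As a rough illustration of case (2), here is a minimal sketch with made-up toy data (tiles and notes are hypothetical data frames, not from the question). The first geom_text inherits x and y from ggplot(), while the second one pulls from a different data frame whose columns are named differently, so it has to map x and y explicitly.

```r
library(ggplot2)

# Toy data, invented for this example only.
tiles <- data.frame(
  Prediction = c("A", "B", "A", "B"),
  Reference  = c("A", "A", "B", "B"),
  Freq       = c(0.9, 0.1, 0.2, 0.8)
)

# A second, hypothetical data frame with different column names.
notes <- data.frame(pred = "B", ref = "A", note = "check")

p <- ggplot(tiles, aes(x = Prediction, y = Reference)) +
  geom_tile(aes(fill = Freq)) +
  # Inherits x and y from the global mapping; only the label is added.
  geom_text(aes(label = Freq)) +
  # Different dataset, different column names: x and y must be overridden.
  # Note the order: pred goes to x (like Prediction), ref goes to y.
  geom_text(data = notes, aes(x = pred, y = ref, label = note), vjust = 2)

print(p)
```

The override is the one place where the x/y order has to be checked by hand, which is exactly where the swap in the question crept in.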
