Why won't scale_fill_stepsn work with my data?

Time:04-15

I can't, for the life of me, figure out how to get scale_fill_stepsn to work with my data. I want 7 binned colors, which I am providing. I keep getting either some weird version of my colors (which is bad), fewer than 7 bins or something just doesn't work. It does work with the default breaks, but unfortunately those breaks don't result in a lot of variation (I'd like to have roughly the same number of points in each bin).

If I plot using this code, it gets 7 different colors, but they're not the colors I specified:

  library(ggplot2)

  pal <- c("#EDF8FB", "#BFD3E6", "#9EBCDA", "#8C96C6", "#8C6BB1", "#88419D", "#6E016B")
  data <- data.frame(x = rnorm(161, .04, .02), y = rnorm(161, .04, .02),
            col = c(0.164, 0.163, 0.128, 0.124, 0.123, 0.121, 0.12, 0.111, 0.105, 
                    0.104, 0.104, 0.103, 0.1, 0.1, 0.093, 0.089, 0.089, 0.088, 0.086, 
                    0.084, 0.084, 0.083, 0.083, 0.082, 0.081, 0.077, 0.075, 0.075, 
                    0.074, 0.074, 0.073, 0.072, 0.072, 0.072, 0.071, 0.07, 0.069, 
                    0.069, 0.068, 0.067, 0.067, 0.066, 0.066, 0.066, 0.064, 0.063, 
                    0.063, 0.063, 0.062, 0.061, 0.06, 0.06, 0.06, 0.059, 0.059, 0.058, 
                    0.057, 0.057, 0.057, 0.057, 0.056, 0.056, 0.055, 0.054, 0.054, 
                    0.052, 0.051, 0.051, 0.051, 0.051, 0.051, 0.05, 0.049, 0.049, 
                    0.049, 0.049, 0.048, 0.048, 0.047, 0.047, 0.047, 0.047, 0.047, 
                    0.046, 0.045, 0.045, 0.044, 0.044, 0.044, 0.044, 0.043, 0.043, 
                    0.043, 0.043, 0.042, 0.042, 0.042, 0.042, 0.042, 0.042, 0.041, 
                    0.041, 0.041, 0.04, 0.04, 0.04, 0.04, 0.04, 0.04, 0.04, 0.039, 
                    0.039, 0.039, 0.038, 0.038, 0.037, 0.037, 0.036, 0.036, 0.036, 
                    0.035, 0.035, 0.035, 0.035, 0.034, 0.034, 0.033, 0.033, 0.033, 
                    0.032, 0.032, 0.032, 0.032, 0.031, 0.031, 0.031, 0.031, 0.03, 
                    0.03, 0.029, 0.029, 0.028, 0.028, 0.027, 0.026, 0.026, 0.025, 
                    0.023, 0.023, 0.022, 0.022, 0.021, 0.021, 0.02, 0.016, 0.016, 
                    0.015, 0.015, 0.01, 0.009, 0.004))
  
  ggplot(data) +
    geom_point(aes(x = x, y = y, fill = col), size = 10, pch = 21) +
    scale_fill_stepsn(colors = pal, n.breaks = 7, breaks = 1:6 * .01)

Plot 1

It may be easier to see what's happening if you change to a different palette:

pal <- c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00", "#FFFF33", "#A65628")

Plot 2

So it's blending the first 4 colors or something like that and then using the maximum color. It doesn't happen on fake data, so it's obviously related to this particular dataset. Any ideas of how I can fix it?

Answer:

You need the breaks to be set at your maximum and minimum values, and you need 8 of them (the length of your palette + 1). This will divide your colours into 7 groups with equal ranges.

To specify that you don't want the colors interpolated, you need to map them to the centre of each bin. You can use the values argument for this, which maps values between 0 and 1 onto the scale. The first point should be halfway between 0 and 1/7, the second halfway between 1/7 and 2/7, etc.
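
As a quick check of that arithmetic, here is a minimal base-R sketch. It assumes only the minimum and maximum of `col` from the question (0.004 and 0.164); the variable names are illustrative, not part of ggplot2.

```r
# Range of data$col in the question (read off the posted values)
rng <- c(0.004, 0.164)

# 8 equally spaced breaks give 7 equal-width bins
breaks <- seq(rng[1], rng[2], length.out = 8)

# Midpoint of each bin, rescaled to the 0-1 range that `values` expects
mids <- head(breaks, -1) + diff(breaks) / 2
values <- (mids - rng[1]) / diff(rng)

# Compare with the closed form 1/14, 3/14, ..., 13/14
all.equal(values, seq(1/14, 13/14, length.out = 7))
```

Because the bins are equal width, the midpoints do not depend on the actual data range, which is why the fixed sequence from 1/14 to 13/14 works here.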

Personally, I would probably use cut on the col values and use scale_fill_manual, but here's how to do it with scale_fill_stepsn:

ggplot(data) +
    geom_point(aes(x = x, y = y, fill = col), size = 10, pch = 21) +
    scale_fill_stepsn(colors = pal, 
                      breaks = seq(min(data$col), max(data$col), length = 8),
                      values = seq(1/14, 13/14, length = 7))
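
For comparison, the cut plus scale_fill_manual alternative mentioned above might look roughly like the sketch below. It is not taken from the original answer: `df` is hypothetical stand-in data drawn over the same range as the question's `col`, and the plotting step is skipped if ggplot2 is not installed.

```r
set.seed(1)

# Hypothetical stand-in data spanning the question's range of col (0.004 to 0.164)
df <- data.frame(x = rnorm(50), y = rnorm(50), col = runif(50, 0.004, 0.164))

pal <- c("#EDF8FB", "#BFD3E6", "#9EBCDA", "#8C96C6", "#8C6BB1", "#88419D", "#6E016B")

# Seven equal-width bins; include.lowest keeps the minimum value in the first bin
edges <- seq(min(df$col), max(df$col), length.out = 8)
df$bin <- cut(df$col, breaks = edges, include.lowest = TRUE)

# Name the palette by bin level so each bin gets exactly one colour, no blending
names(pal) <- levels(df$bin)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df) +
    geom_point(aes(x = x, y = y, fill = bin), size = 10, pch = 21) +
    scale_fill_manual(values = pal, drop = FALSE)  # drop = FALSE keeps empty bins in the legend
  print(p)
}
```

To get roughly the same number of points per bin, as the question asks, `edges` could instead be computed with `quantile(df$col, probs = seq(0, 1, length.out = 8))`, provided the resulting edges are all distinct.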



EDIT

To calculate the values for any set of breaks correctly, you can use these two helper functions:

get_breaks <- function(vec, breaks) {
  c(min(vec), breaks, max(vec))
}

get_values <- function(vec, breaks) {
  breaks <- get_breaks(vec, breaks)
  vals <- (diff(breaks)/2 + head(breaks, -1) - min(breaks))/diff(range(vec))
  vals[length(vals)] <- vals[length(vals)] + .Machine$double.eps
  vals
}

Which allow:

ggplot(data) +
  geom_point(aes(x = x, y = y, fill = col), size = 10, pch = 21) +
  scale_fill_stepsn(colors = pal, 
                    breaks = get_breaks(data$col, 1:6 * 0.01),
                    values = get_values(data$col, 1:6 * 0.01))
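
To see what those helpers compute for the breaks in the question, the midpoint arithmetic can be reproduced by hand. This is only an illustrative base-R sketch; it assumes the minimum and maximum of `col` are 0.004 and 0.164, as in the posted data, and it leaves out the small `.Machine$double.eps` adjustment to the last value.

```r
# Bin edges: data minimum, the six interior breaks, data maximum
edges <- c(0.004, 1:6 * 0.01, 0.164)

# Midpoint of each of the 7 bins, rescaled to 0-1 over the data range
mids <- head(edges, -1) + diff(edges) / 2
values <- (mids - min(edges)) / (max(edges) - min(edges))

round(values, 5)
```

With these unequal bins the colour positions are no longer evenly spaced: the first colour sits at about 0.019 and the last at 0.675, because the top bin (0.06 to 0.164) covers most of the range.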

