I can't, for the life of me, figure out how to get scale_fill_stepsn to work with my data. I want 7 binned colors, which I am providing. I keep getting either some distorted version of my colors, fewer than 7 bins, or something that just doesn't work. It does work with the default breaks, but unfortunately those breaks don't result in a lot of variation (I'd like to have roughly the same number of points in each bin).
If I plot using this code, I get 7 different colors, but they're not the colors I specified:
library(ggplot2)

pal <- c("#EDF8FB", "#BFD3E6", "#9EBCDA", "#8C96C6", "#8C6BB1", "#88419D", "#6E016B")
data <- data.frame(x = rnorm(161, .04, .02), y = rnorm(161, .04, .02),
col = c(0.164, 0.163, 0.128, 0.124, 0.123, 0.121, 0.12, 0.111, 0.105,
0.104, 0.104, 0.103, 0.1, 0.1, 0.093, 0.089, 0.089, 0.088, 0.086,
0.084, 0.084, 0.083, 0.083, 0.082, 0.081, 0.077, 0.075, 0.075,
0.074, 0.074, 0.073, 0.072, 0.072, 0.072, 0.071, 0.07, 0.069,
0.069, 0.068, 0.067, 0.067, 0.066, 0.066, 0.066, 0.064, 0.063,
0.063, 0.063, 0.062, 0.061, 0.06, 0.06, 0.06, 0.059, 0.059, 0.058,
0.057, 0.057, 0.057, 0.057, 0.056, 0.056, 0.055, 0.054, 0.054,
0.052, 0.051, 0.051, 0.051, 0.051, 0.051, 0.05, 0.049, 0.049,
0.049, 0.049, 0.048, 0.048, 0.047, 0.047, 0.047, 0.047, 0.047,
0.046, 0.045, 0.045, 0.044, 0.044, 0.044, 0.044, 0.043, 0.043,
0.043, 0.043, 0.042, 0.042, 0.042, 0.042, 0.042, 0.042, 0.041,
0.041, 0.041, 0.04, 0.04, 0.04, 0.04, 0.04, 0.04, 0.04, 0.039,
0.039, 0.039, 0.038, 0.038, 0.037, 0.037, 0.036, 0.036, 0.036,
0.035, 0.035, 0.035, 0.035, 0.034, 0.034, 0.033, 0.033, 0.033,
0.032, 0.032, 0.032, 0.032, 0.031, 0.031, 0.031, 0.031, 0.03,
0.03, 0.029, 0.029, 0.028, 0.028, 0.027, 0.026, 0.026, 0.025,
0.023, 0.023, 0.022, 0.022, 0.021, 0.021, 0.02, 0.016, 0.016,
0.015, 0.015, 0.01, 0.009, 0.004))
ggplot(data) +
  geom_point(aes(x = x, y = y, fill = col), size = 10, pch = 21) +
  scale_fill_stepsn(colors = pal, n.breaks = 7, breaks = 1:6 * .01)
It may be easier to see what's happening if you change to a different palette:
pal <- c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00", "#FFFF33", "#A65628")
So it's blending the first 4 colors or something like that and then using the maximum color. It doesn't happen on fake data, so it's obviously related to this particular dataset. Any ideas of how I can fix it?
Answer:
You need the breaks to span your minimum and maximum values, and you need 8 of them (the length of your palette + 1). This will divide your colours into 7 groups with equal ranges.
To specify that you don't want the colors interpolated, you need to map them to the centre of each bin. You can use the values argument for this, which maps positions between 0 and 1 onto the scale. The first point should be halfway between 0 and 1/7 (that is, 1/14), the second halfway between 1/7 and 2/7 (3/14), etc.
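As a quick check of that arithmetic, the bin centres for equal-width bins on a 0 to 1 scale can be computed directly. This is only a base R sketch; n_bins, edges and centres are illustrative names, not ggplot2 arguments.

```r
# Number of colour bins (one per palette entry)
n_bins <- 7

# Bin edges on the rescaled 0-1 axis: 0, 1/7, 2/7, ..., 1
edges <- seq(0, 1, length.out = n_bins + 1)

# Centre of each bin is the mean of its two edges: 1/14, 3/14, ..., 13/14
centres <- (head(edges, -1) + tail(edges, -1)) / 2

print(centres)
```

These should match seq(1/14, 13/14, length.out = 7), which is what gets passed as values when the breaks are equally spaced.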
Personally, I would probably use cut on the col values and use scale_fill_manual, but here's how to do it with scale_fill_stepsn:
ggplot(data) +
  geom_point(aes(x = x, y = y, fill = col), size = 10, pch = 21) +
  scale_fill_stepsn(colors = pal,
                    breaks = seq(min(data$col), max(data$col), length.out = 8),
                    values = seq(1/14, 13/14, length.out = 7))
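For comparison, the cut plus scale_fill_manual route mentioned above might look roughly like the following. It is a sketch on a small made-up data frame (toy is a hypothetical stand-in for the question's data), and it lets cut pick 7 equal-width intervals itself, which extends the outermost bins very slightly beyond the data range.

```r
library(ggplot2)

# The 7 colours from the question
pal <- c("#EDF8FB", "#BFD3E6", "#9EBCDA", "#8C96C6", "#8C6BB1", "#88419D", "#6E016B")

# Hypothetical stand-in for the question's data (not the real values)
toy <- data.frame(
  x   = seq(0.01, 0.14, by = 0.01),
  y   = rev(seq(0.01, 0.14, by = 0.01)),
  col = c(0.004, 0.010, 0.016, 0.022, 0.030, 0.035, 0.040,
          0.044, 0.050, 0.057, 0.066, 0.075, 0.089, 0.164)
)

# Bin the continuous variable into 7 equal-width intervals (a factor)
toy$bin <- cut(toy$col, breaks = 7)

# Discrete fill: one palette colour per factor level, no interpolation.
# drop = FALSE keeps empty bins in the legend so the colours stay aligned.
p <- ggplot(toy) +
  geom_point(aes(x = x, y = y, fill = bin), size = 10, pch = 21) +
  scale_fill_manual(values = pal, drop = FALSE)
```

Because the fill is a factor here, each level gets exactly one palette entry, at the cost of a discrete legend rather than a stepped colour bar.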
EDIT
To calculate the values for any set of breaks correctly, you can use these two helper functions:
get_breaks <- function(vec, breaks) {
c(min(vec), breaks, max(vec))
}
get_values <- function(vec, breaks) {
breaks <- get_breaks(vec, breaks)
  # Midpoint of each bin, rescaled to the 0-1 range of the data
  vals <- (diff(breaks)/2 + head(breaks, -1) - min(breaks))/diff(range(vec))
  vals[length(vals)] <- vals[length(vals)] + .Machine$double.eps
vals
}
These allow:
ggplot(data) +
  geom_point(aes(x = x, y = y, fill = col), size = 10, pch = 21) +
  scale_fill_stepsn(colors = pal,
                    breaks = get_breaks(data$col, 1:6 * 0.01),
                    values = get_values(data$col, 1:6 * 0.01))
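Since the question asks for roughly the same number of points in each bin, one option is to take the bin edges from quantiles instead of fixed cut points. The following is a base R sketch on a made-up vector (col here is a hypothetical stand-in for data$col); it applies the same midpoint formula as get_values, and I have not checked how the legend looks with such uneven bins.

```r
# Hypothetical stand-in for data$col
col <- c(0.004, 0.010, 0.016, 0.022, 0.030, 0.035, 0.040,
         0.044, 0.050, 0.057, 0.066, 0.075, 0.089, 0.164)

n_bins <- 7

# 8 edges at the 0/7, 1/7, ..., 7/7 quantiles (first = min, last = max)
edges <- quantile(col, probs = (0:n_bins) / n_bins, names = FALSE)

# Midpoint of each bin, rescaled to 0-1 over the data range
mids   <- (head(edges, -1) + tail(edges, -1)) / 2
values <- (mids - min(col)) / diff(range(col))

# How many points land in each quantile bin
counts <- table(cut(col, breaks = edges, include.lowest = TRUE))
print(counts)
```

The interior edges (edges[2:7]) would then take the place of 1:6 * 0.01 in the get_breaks and get_values calls.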