Specify scale_x_discrete for cut of continuous variable

Time:10-12

I have a continuous time variable and I want to plot a heatmap of values. To do so using ggplot and geom_tile(), I bin the time into time_bin and parse the bounds of the resulting intervals. My question is somewhat related to an earlier question [link and image not preserved].
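
The data behind the question is not shown here, so the following is only a minimal sketch of what such a setup might look like. The data frame `raw`, the number of bins and the values are all made up for illustration; only the column names `time_bin`, `low`, `y` and `value` follow the names used in the answers below.

```r
# Hypothetical toy data: a continuous time variable, three rows (y) and a value.
set.seed(42)
raw <- data.frame(
  time  = seq(0, 10, length.out = 500),
  y     = factor(rep(c("a", "b", "c"), length.out = 500)),
  value = rnorm(500)
)

# Bin the continuous variable; cut() labels the bins like "(0.333,0.667]".
raw$time_bin <- cut(raw$time, breaks = 30)

# One value per (bin, y) cell, as needed for a heatmap.
toy <- aggregate(value ~ time_bin + y, data = raw, FUN = mean)

# Parse the lower and upper bound out of each interval label.
lab <- as.character(toy$time_bin)
toy$low  <- as.numeric(sub("^.([-.0-9]+),.*$", "\\1", lab))
toy$high <- as.numeric(sub("^.*,([-.0-9]+).$", "\\1", lab))

head(toy)
```

The simple `[-.0-9]+` pattern assumes the labels contain no scientific notation (such as `1e-05`), which holds for this toy range but may not for every data set.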

I would like the x axis to be labeled using only the first portion of the breaks (-0.03, 0.3, ...). I have been trying to use scale_x_discrete(breaks, labels) with different approaches but haven't made much progress. The x scale is long in the real data, so ideally I would be able to do something like scale_x_discrete(breaks=scales::pretty_breaks(5)), but that's also not working.

I also tried using the low and high values that I parse from the cut, but that creates tile plots that contain vertical white lines everywhere.

Update

Using insight from one of the answers, I used factor(low) because my parsing works well, but the proposed readr::parse_number() does not. This gets the labels into the proper format.

The remaining portion of the question is: how do I show fewer factor levels using scale_x_discrete(breaks = ...)?

For example, this seems to decimate the axis; it is a bit cumbersome but kind of works (see the first answer below).

CodePudding user response:

A possible way to solve this, using insight from other answers. Not the prettiest but somewhat gets the job done. It might be a good idea to do a labeller function, like label_decimate() to produce the subset.

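library(dplyr)
library(ggplot2)

toy %>% 
  mutate(x = factor(low)) %>%
  ggplot() +
  geom_tile(aes(x=x, y=y, fill=value), color=NA) +
  # vertical line at the middle observation
  geom_vline(aes(xintercept = unique(x[length(x)/2]))) +
  # keep only every 10th factor level as an axis break
  scale_x_discrete(breaks = function(x){x[c(rep(FALSE, 9), TRUE)]})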
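
The anonymous function above could be wrapped in a small reusable helper, in the spirit of the label_decimate() idea mentioned earlier. This is only a sketch: `breaks_decimate()` is a hypothetical name, not a function from ggplot2 or scales.

```r
# Hypothetical helper (not part of ggplot2 or scales): returns a function that
# keeps every n-th element of the scale limits it is given.
breaks_decimate <- function(n = 10) {
  function(x) x[seq_along(x) %% n == 0]
}

# Quick check on a plain character vector of 25 "levels":
lv <- as.character(1:25)
breaks_decimate(10)(lv)
# "10" "20"
```

It would then be used as scale_x_discrete(breaks = breaks_decimate(10)), which should pick the same levels as the logical-recycling version above.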

CodePudding user response:

A simple base-R solution is to gsub the left edge of the interval out and cbind this to the data argument of ggplot. A slightly different regex (^.*,([-.0-9]+).$) would allow you to extract the right edge.

ggplot(cbind(toy, "time" = gsub("^.([-.0-9]+),.*$", "\\1", toy$time_bin))) +
    geom_tile(aes(x=time, y=y, fill=value), color=NA) +
    geom_vline(aes(xintercept = unique(time[length(time)/2])))

However, this is misleading, as the x axis "looks" continuous, but is in fact a series of stacked factors which just happen to be in order and roughly contiguous. A better approach is to turn the x aes into a genuine numeric, which will also let you specify your scale breaks more easily.

ggplot(cbind(toy, "time" = as.numeric(gsub("^.([-.0-9]+),.*$", "\\1", toy$time_bin)))) +
 geom_tile(aes(x=time, y=y, fill=value), color=NA) +
 geom_vline(aes(xintercept = unique(time[length(time)/2])))

Note however that this will leave gaps where the tile sizes don't quite match (the distance between -0.030 and 0.300 is bigger than that between 0.300 and 0.601, and this is correctly shown here). If this is true in your real data, you can either fudge the width of the geom_tile, or use geom_rect and specify xmax and xmin from the left and right edges of the cut interval.

ggplot(cbind(toy, 
             # left and right edges of each cut interval
             "time_l" = as.numeric(gsub("^.([-.0-9]+),.*$", "\\1", toy$time_bin)),
             "time_r" = as.numeric(gsub("^.*,([-.0-9]+).$", "\\1", toy$time_bin)),
             # each row of the heatmap is one unit tall, centred on the level of y
             "y_l" = as.numeric(toy$y) - 0.5, 
             "y_r" = as.numeric(toy$y) + 0.5
             )) +
  geom_rect(aes(xmin=time_l, xmax=time_r, ymin=y_l, ymax=y_r, fill=value), color=NA) +
  geom_vline(aes(xintercept = unique(time_l[length(time_l)/2])))
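
Once the x aesthetic is a genuine numeric, the break specification the question was originally after becomes available through scale_x_continuous(). The sketch below uses a made-up, evenly spaced data frame called `demo` as a stand-in for the parsed data, so the tile positions and values are assumptions rather than the asker's data.

```r
library(ggplot2)

# Hypothetical stand-in for the parsed data: numeric left bin edges,
# three rows and random values.
set.seed(1)
demo <- expand.grid(time = seq(0, 9.9, by = 0.1), y = factor(1:3))
demo$value <- rnorm(nrow(demo))

p <- ggplot(demo) +
  geom_tile(aes(x = time, y = y, fill = value), color = NA) +
  # about 5 "pretty" breaks, chosen from the numeric range of the axis
  scale_x_continuous(breaks = scales::pretty_breaks(n = 5))
p
```

The same scale_x_continuous() call can be added to either of the numeric plots above, since it only depends on the x aesthetic being numeric.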
