R: ggplot: how to assign a color to one variable that consistently appears across plots, and the res

Time:11-10

If I have this data frame and want to make a bar chart,

df <-
    setNames(data.frame(
        as.POSIXct(
            c(
            "2022-07-29 00:00:00",
            "2022-07-29 00:00:05",
            "2022-07-29 00:05:00",
            "2022-07-29 00:05:05",
            "2022-07-29 00:10:00",
            "2022-07-29 00:15:00",
            "2022-07-29 00:20:00",
            "2022-07-29 00:20:05"
            )),
        c(1, 2, 3, 4, 5, 6, 7, 8),
        c("a", "a", "b847", "b317", "b317", "bob680", "bf456", "c3400")
    ),
    c("timeStamp", "value1", "text"))

df %>% ggplot(aes(x=fct_infreq(text), fill=text)) + geom_bar(stat='count',aes(y=(..count..)/sum(..count..)))

How can I make "a" in "text" ALWAYS get a certain color? The catch is that I cannot just create a palette and use

scale_fill_manual(values=palette). 

The reason is that "a" is the only value in "text" that consistently shows up each time new data comes in and I need to plot it. I tried to show this with the random character/number strings that make up the rest of "text". I want "a" to always show up as a specific color, e.g. red, with every other incoming UNKNOWN value (hence why I can't use a manual palette) assigned any arbitrary color.

Also, bonus for making the plot always show the top 5 most frequent values in "text" besides "a". I know this is possible using something like

'%!in%' <- Negate('%in%')


df$text[df$text %!in% c("b847", "b317", "b317", "bob680", "bf456", "c3400")] <- "a"

As you can see, that is only possible AFTER I have run the plot, identified the top 5 manually, and then changed the base data frame to assign all others simply as "a". I figure there must be a better, more automated way to do this, since this is surely a common need in geom_bar charts, but I can't seem to figure out a solution.

Thanks

CodePudding user response:

Use relevel() or forcats::fct_relevel() so that "a" is always the first level; then provide an unnamed manual palette with red as the first value.
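
To see why this works, here is a minimal check of the level order. It assumes a made-up batch of data in which "a" is not the most frequent value; the vector and the three-color palette are illustrative only.

```r
library(forcats)

# Hypothetical batch in which "a" is NOT the most frequent value
text <- c("zz9", "zz9", "zz9", "q17", "q17", "a")

text_x    <- fct_infreq(text)            # x-axis order: most frequent first
text_fill <- fct_relevel(text_x, "a")    # fill order: "a" forced to the front

levels(text_x)     # "zz9" "q17" "a"
levels(text_fill)  # "a"   "zz9" "q17"

# An unnamed `values` vector is matched to the fill levels in order,
# so the first color always lands on "a".
palette  <- c("red", "grey60", "grey30")
fill_map <- setNames(palette[seq_along(levels(text_fill))], levels(text_fill))
fill_map[["a"]]    # "red"
```

Whatever the other values turn out to be, "a" stays at position one of the fill levels and therefore picks up the first palette entry.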

library(dplyr)
library(forcats)
library(ggplot2)
library(scales)

df %>% 
  mutate(
    text = fct_infreq(text),
    text_fill = fct_relevel(text, "a")
  ) %>% 
  ggplot(aes(x=text, fill=text_fill)) +
  geom_bar(aes(y=(..count..)/sum(..count..))) +
  scale_fill_manual(values = c("red", brewer_pal(palette = "Set2")(8)))

Note, I created a separate text_fill so that the "a" column will still appear in the right place on the x axis if it has fewer values. If you don’t care about this and want it to always appear first, then just relevel text.
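
For the bonus (top 5 values besides "a"), one possible approach is sketched below. It is not part of the answer above: keep_top_n() is a hypothetical helper name and the input vector is made up. The sketch counts only the values other than "a", keeps the n most frequent of them, and pools the rest into "Other" with forcats::fct_other(). Ties at the cutoff are resolved arbitrarily.

```r
library(forcats)

# Hypothetical helper: keep `always` plus the n most frequent other values,
# and pool everything else into a single "Other" level.
keep_top_n <- function(x, always = "a", n = 5) {
  others <- x[x != always]                               # ignore "a" when counting
  top <- head(names(sort(table(others), decreasing = TRUE)), n)
  fct_other(x, keep = c(always, top), other_level = "Other")
}

# Made-up data: "a" is rare, u1..u5 are the five most frequent other values
x <- c("a",
       rep("u1", 6), rep("u2", 5), rep("u3", 4),
       rep("u4", 3), rep("u5", 2), "u6", "u7")

text_top <- keep_top_n(x)
table(text_top)
```

The result could be used in place of text before the fct_infreq() step, so the same relevel-plus-unnamed-palette approach still applies; the palette then needs at most seven colors ("a", five others, and "Other").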
