ggplot fill in colors for barchart

Time:10-27

For my project, we wrote a for loop with if/else branches to assign a color to each of the five NYC boroughs using RColorBrewer. Here is the loop for reference; school.safety is my dataset.

library(RColorBrewer)

## one (initially empty) color string per row
color_vec <- vector(mode = "character", length = nrow(school.safety))

table(school.safety$Borough)

borough <- unique(school.safety$Borough)
k <- length(borough)
bor_colors <- brewer.pal(k, "Set1")

## full Borough column, extracted once rather than on every iteration
borough <- school.safety[, "Borough"]

for (i in seq_len(nrow(school.safety))) {
  if (borough[i] == "K") {
    color_vec[i] <- bor_colors[1]
  } else if (borough[i] == "M") {
    color_vec[i] <- bor_colors[2]
  } else if (borough[i] == "Q") {
    color_vec[i] <- bor_colors[3]
  } else if (borough[i] == "R") {
    color_vec[i] <- bor_colors[4]
  } else if (borough[i] == "X") {
    color_vec[i] <- bor_colors[5]
  } else {
    ## NA unless the palette has at least six colors (k >= 6)
    color_vec[i] <- bor_colors[6]
  }
}

We are now using ggplot to create a barchart for the frequency of a particular incident by borough using the colors we assigned. Here is my code for the ggplot:

ggplot(school.safety, aes(school.safety$`Scanning Type`, fill=school.safety$Borough)) +
  geom_bar(mapping = aes(color=color_vec, position="dodge", stat="identity")) +
  scale_fill_manual(values=c("Brooklyn"="#377EB8", "Manhattan"="#4DAF4A", "Queens"="#984EA3", "Staten Island"="#E41A1C", "Bronx"="#FF7F00")) +
  xlab("Scanning Type") +
  ylab("Count")

Here is what our barchart looks like now: (image not included)

How can we fill in the bars with the borough colors assigned in the for loop and create a single legend for colors/boroughs? Additionally, does anyone know how to unstack the barchart so that each scanning type gets five separate bars, one per borough?

Thanks so much

CodePudding user response:

The color_vec is not needed; the mapping is done with a named vector passed to scale_fill_manual.

boroughs = unique(school.safety$Borough)
bor_colors = brewer.pal(length(boroughs), "Set1")
names(bor_colors) = boroughs
## now bor_colors is a named vector where the names are boroughs
## and the values are the colors


ggplot(school.safety, aes(x = `Scanning Type`, fill = Borough)) +
    ## all the aesthetics at the top is usually nice
  geom_bar(position = "dodge") +
  scale_fill_manual(values = bor_colors) +
    ## give our named vector to the values
  labs(x = "Scanning Type", y = "Count", fill = "Borough")
    ## labels all together is nice

You should use stat = "identity" in geom_bar when you already have a computed y value and are mapping a y aesthetic. You don't have y = in your aesthetic, so I'm pretty sure you don't want stat = "identity" (though that's just a guess since you haven't shared any sample data).
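To make the distinction concrete, here is a small sketch with made-up data (the toy data frame and its values are invented for illustration and are not the asker's dataset). With raw rows, geom_bar() does the counting itself; with counts that are already computed, you map them to y and use geom_col(), which is the same as geom_bar(stat = "identity"). The plotting part is wrapped in a requireNamespace() check so the rest still runs if ggplot2 is not installed.

```r
# Toy data, invented for illustration only
toy <- data.frame(
  Borough = c("K", "K", "M", "Q", "Q", "Q"),
  `Scanning Type` = c("A", "B", "A", "A", "A", "B"),
  check.names = FALSE
)

# Precomputed counts: one row per (Scanning Type, Borough) combination
counts <- as.data.frame(table(toy$`Scanning Type`, toy$Borough),
                        responseName = "n")
names(counts)[1:2] <- c("Scanning Type", "Borough")

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Raw rows: the default stat = "count" computes the bar heights
  p_raw <- ggplot(toy, aes(x = `Scanning Type`, fill = Borough)) +
    geom_bar(position = "dodge")

  # Precomputed heights: map y explicitly and use geom_col()
  p_pre <- ggplot(counts, aes(x = `Scanning Type`, y = n, fill = Borough)) +
    geom_col(position = "dodge")
}
```

Printing p_raw or p_pre draws the chart. The first form is the one that matches the question, since school.safety appears to have one row per incident rather than a count column.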

If the Borough column of your data frame has the values K, M, Q, R, X instead of the real borough names, I would create a new borough_name column with the names you want before running the above code. One way to do that is to make a lookup table and join:

borough_lookup = data.frame(
  Borough = c("K", "M", "Q", "R", "X"),
  borough_name = c("Brooklyn", "Manhattan", "Queens", "Staten Island", "Bronx")
)

## the key column must have the same name (Borough) in both data frames
school.safety = merge(school.safety, borough_lookup, by = "Borough")

If needed, run this code to create the borough_name column, then use borough_name instead of Borough in all of the preceding code (both the creation of bor_colors and the plotting code).
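As an alternative to the merge, a named character vector can serve as the lookup, in the same spirit as the named color vector earlier. This is only a sketch under assumptions: toy stands in for school.safety, borough_names is a made-up helper name, and the five hex codes are the first five "Set1" colors hard-coded so that the example does not need RColorBrewer.

```r
# Toy stand-in for school.safety, invented for illustration
toy <- data.frame(Borough = c("K", "X", "M", "R", "Q", "K"))

# Lookup: names are the one-letter codes, values are the full borough names
borough_names <- c(K = "Brooklyn", M = "Manhattan", Q = "Queens",
                   R = "Staten Island", X = "Bronx")

# Index the lookup by code; any code missing from the lookup becomes NA
toy$borough_name <- unname(borough_names[as.character(toy$Borough)])

# Colors keyed by full borough name, ready for scale_fill_manual(values = ...)
bor_colors <- c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00")
names(bor_colors) <- unname(borough_names)
```

Unlike merge(), this keeps the rows in their original order and does not drop rows whose code is missing from the lookup; those rows get NA in borough_name instead.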
