ggplot fill in colors for barchart

For my project, we created a for loop/if else to assign a color for each of the five NYC boroughs using RColorBrewer. Here was my code for the for loop for reference. school.safety is my dataset.

color_vec<-  vector(mode="character",nrow(school.safety))

table(school.safety$Borough)

borough <- unique(school.safety$Borough)
k <- length(borough)
bor_colors <- brewer.pal(k, "Set1")

for (i in seq_len(nrow(school.safety))) {
  borough <- school.safety[, "Borough"]
  if (borough[i] == "K") {
    color_vec[i] <- bor_colors[1]
  } else if (borough[i] == "M") {
    color_vec[i] <- bor_colors[2]
  } else if (borough[i] == "Q") {
    color_vec[i] <- bor_colors[3]
  } else if (borough[i] == "R") {
    color_vec[i] <- bor_colors[4]
  } else if (borough[i] == "X") {
    color_vec[i] <- bor_colors[5]
  } else {
    color_vec[i] <- bor_colors[6]
  }
}

We are now using ggplot to create a barchart for the frequency of a particular incident by borough using the colors we assigned. Here is my code for the ggplot:

ggplot(school.safety, aes(school.safety$`Scanning Type`, fill=school.safety$Borough)) +
geom_bar(mapping = aes( color=color_vec, position="dodge", stat="identity")) +
scale_fill_manual(values=c("Brooklyn"="#377EB8" ,"Manhattan"="#4DAF4A","Queens"="#984EA3","Staten Island"="#E41A1C", "Bronx"="#FF7F00")) +
xlab("Scanning Type") +
ylab("Count")

Here is what our barchart looks like now:

How can we fill the bars with the borough colors assigned in the for loop and create a single legend for colors/boroughs? Additionally, does anyone know how to avoid stacking the bars and instead show five separate bars, one per borough, for each scanning type?

Thanks so much

CodePudding user response：

The color vector is not needed; we can do the mapping with a named vector in scale_fill_manual.

library(ggplot2)
library(RColorBrewer)

boroughs = unique(school.safety$Borough)
bor_colors = brewer.pal(length(boroughs), "Set1")
names(bor_colors) = boroughs
## now bor_colors is a named vector where the names are boroughs
## and the values are the colors


ggplot(school.safety, aes(x = `Scanning Type`, fill = Borough)) +
    ## all the aesthetics at the top is usually nice
  geom_bar(position = "dodge") +
  scale_fill_manual(values = bor_colors) +
    ## give our named vector to the values
  labs(x = "Scanning Type", y = "Count", fill = "Borough")
    ## labels all together is nice
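
To see why the loop becomes unnecessary, here is a minimal sketch of how indexing a named vector by borough code does the same lookup in one step. The toy borough vector is a hypothetical stand-in for school.safety$Borough, and the hex values (the first five colors of the Set1 palette) are hard-coded only so the sketch runs without RColorBrewer.

```r
# Hypothetical stand-in for school.safety$Borough
borough <- c("K", "M", "X", "K", "Q", "R")

# Named vector: names are borough codes, values are colors
# (hex values hard-coded here so the sketch runs without RColorBrewer)
bor_colors <- c(K = "#E41A1C", M = "#377EB8", Q = "#4DAF4A",
                R = "#984EA3", X = "#FF7F00")

# Indexing by name looks up every row at once; no loop or if/else chain
color_vec <- unname(bor_colors[borough])
print(color_vec)
```

scale_fill_manual does an equivalent name-based match when it is given a named vector, which is why a per-row color vector is not needed for the plot.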

You should use stat = "identity" in geom_bar when you already have a computed y value and are mapping a y aesthetic. You don't have y = in your aesthetic, so I'm pretty sure you don't want stat = "identity" (though that's just a guess since you haven't shared any sample data).
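
As a rough illustration of that distinction, the sketch below draws the same dodged bars two ways: from raw rows (letting geom_bar count them) and from precomputed counts (mapping y and using geom_col, which uses stat = "identity"). The toy data frame and its column names scan_type and borough are made up for the example and are not the real school.safety columns.

```r
library(ggplot2)

# Hypothetical toy data; the real school.safety columns may differ
toy <- data.frame(
  scan_type = c("Full", "Full", "Part", "Part", "Part", "Full"),
  borough   = c("K", "M", "K", "K", "M", "K")
)

# Raw rows: geom_bar() counts the rows in each group itself (stat = "count")
p_raw <- ggplot(toy, aes(x = scan_type, fill = borough)) +
  geom_bar(position = "dodge")

# Precomputed counts: a y aesthetic is mapped, so the heights are used as-is
counts <- as.data.frame(
  table(scan_type = toy$scan_type, borough = toy$borough),
  responseName = "n"
)
p_pre <- ggplot(counts, aes(x = scan_type, y = n, fill = borough)) +
  geom_col(position = "dodge")
```

Both plots should show the same bar heights; which form to use depends only on whether the data is already summarized.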

If the Borough column in your data frame has the values K, M, Q, R, X instead of the real borough names, I would create a new borough_name column with the names you want before running the above code. One way to do that is to make a lookup table and join on it. Note that the lookup's key column must have the same name as the column in your data (Borough), otherwise merge has no common column to join on and returns every combination of rows:

borough_lookup = data.frame(
  Borough = c("K", "M", "Q", "R", "X"),
  borough_name = c("Brooklyn", "Manhattan", "Queens", "Staten Island", "Bronx")
)

school.safety = merge(school.safety, borough_lookup, by = "Borough")

If needed, run this code to create the borough_name column and then use borough_name instead of Borough in all of the preceding code (both the creation of bor_colors and the plotting code).
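
Here is a small sketch of the lookup join on toy data, assuming the lookup's key column is named Borough so that it matches the data. The toy data frame and its id column are hypothetical. It also shows match() as an alternative, since merge() sorts the result by the key and so changes the row order.

```r
# Hypothetical toy data; only the Borough column matters here
toy <- data.frame(
  id = 1:4,
  Borough = c("X", "K", "Q", "K")
)

borough_lookup <- data.frame(
  Borough = c("K", "M", "Q", "R", "X"),
  borough_name = c("Brooklyn", "Manhattan", "Queens", "Staten Island", "Bronx")
)

# merge() joins on the shared Borough column; note it sorts rows by the key
merged <- merge(toy, borough_lookup, by = "Borough")

# match() gives the same names while keeping the original row order
toy$borough_name <- borough_lookup$borough_name[
  match(toy$Borough, borough_lookup$Borough)
]
print(toy)
```

Either approach works for the plot; the match() version is mainly useful if other code relies on the original row order.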