For my project, we wrote a for loop with if/else branches to assign a color to each of the five NYC boroughs using RColorBrewer. Here is my code for the loop, for reference; school.safety is my dataset.
color_vec <- vector(mode = "character", nrow(school.safety))
table(school.safety$Borough)
borough <- unique(school.safety$Borough)
k <- length(borough)
bor_colors <- brewer.pal(k, "Set1")
for (i in seq_len(nrow(school.safety))) {
  borough <- school.safety[, "Borough"]
  if (borough[i] == "K") {
    color_vec[i] <- bor_colors[1]
  } else if (borough[i] == "M") {
    color_vec[i] <- bor_colors[2]
  } else if (borough[i] == "Q") {
    color_vec[i] <- bor_colors[3]
  } else if (borough[i] == "R") {
    color_vec[i] <- bor_colors[4]
  } else if (borough[i] == "X") {
    color_vec[i] <- bor_colors[5]
  } else {
    color_vec[i] <- bor_colors[6]
  }
}
We are now using ggplot to create a bar chart of the frequency of a particular incident by borough, using the colors we assigned. Here is my code for the ggplot:
ggplot(school.safety, aes(school.safety$`Scanning Type`, fill=school.safety$Borough)) +
  geom_bar(mapping = aes( color=color_vec, position="dodge", stat="identity")) +
  scale_fill_manual(values=c("Brooklyn"="#377EB8" ,"Manhattan"="#4DAF4A","Queens"="#984EA3","Staten Island"="#E41A1C", "Bronx"="#FF7F00")) +
  xlab("Scanning Type") +
  ylab("Count")
Here is what our barchart looks like now:
How can we fill the bars with the assigned borough colors from the for loop and get a single legend for colors/boroughs? Additionally, does anyone know how to unstack the bar chart so that each scanning type gets five separate bars, one per borough?
Thanks so much
Answer:
The color_vec is not needed; we do the mapping with a named vector in scale_fill_manual.
library(ggplot2)
library(RColorBrewer)

boroughs = unique(school.safety$Borough)
bor_colors = brewer.pal(length(boroughs), "Set1")
names(bor_colors) = boroughs
## now bor_colors is a named vector where the names are boroughs
## and the values are the colors

ggplot(school.safety, aes(x = `Scanning Type`, fill = Borough)) +
  ## all the aesthetics at the top is usually nice
  geom_bar(position = "dodge") +
  ## give our named vector to the values
  scale_fill_manual(values = bor_colors) +
  ## labels all together is nice
  labs(x = "Scanning Type", y = "Count", fill = "Borough")
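Since there is no sample data, here is a minimal sketch of the same idea on a made-up data frame. The toy data frame, its scan_type column, and the toy_colors vector are all hypothetical stand-ins for school.safety; the hex codes are the ones from the question.

```r
library(ggplot2)

# Hypothetical toy data standing in for school.safety
toy <- data.frame(
  Borough   = c("K", "K", "M", "Q", "Q", "Q", "R", "X", "X"),
  scan_type = c("Full", "Partial", "Full", "Full", "Full",
                "Partial", "Partial", "Full", "Partial")
)

# Named vector: names are the values found in the data, values are the colors
toy_colors <- c(K = "#377EB8", M = "#4DAF4A", Q = "#984EA3",
                R = "#E41A1C", X = "#FF7F00")

p <- ggplot(toy, aes(x = scan_type, fill = Borough)) +
  geom_bar(position = "dodge") +            # side-by-side bars instead of stacked
  scale_fill_manual(values = toy_colors) +  # colors are matched by name, not by order
  labs(x = "Scanning Type", y = "Count", fill = "Borough")

print(p)
```

Because fill is the only aesthetic mapped to Borough, this should produce a single legend.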
You should use stat = "identity" in geom_bar when you already have a computed y value and are mapping a y aesthetic. You don't have y = in your aesthetic, so I'm pretty sure you don't want stat = "identity" (though that's just a guess since you haven't shared any sample data).
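To illustrate the difference, here is a small sketch with made-up data (the raw and counts data frames are hypothetical). The first plot lets geom_bar count rows; the second starts from counts that are already computed and maps them to y, using geom_col, which is the shorthand for geom_bar(stat = "identity").

```r
library(ggplot2)

# One row per incident: geom_bar() counts the rows for us (default stat = "count")
raw <- data.frame(Borough = c("K", "K", "K", "M", "X", "X"))
p_count <- ggplot(raw, aes(x = Borough)) +
  geom_bar()

# Counts already computed: map them to y and skip the counting step
counts <- data.frame(Borough = c("K", "M", "X"), n = c(3, 1, 2))
p_identity <- ggplot(counts, aes(x = Borough, y = n)) +
  geom_col()

# Both plots end up with the same bar heights
layer_data(p_count)$count
layer_data(p_identity)$y
```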
If the Borough column in your data frame has values K, M, Q, R, X instead of the real borough names, I would create a new borough_name column with the names you want before running the above code. One way to do that is to make a lookup table and join:
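borough_lookup = data.frame(
  Borough = c("K", "M", "Q", "R", "X"),
  borough_name = c("Brooklyn", "Manhattan", "Queens", "Staten Island", "Bronx")
)
## the key column must have the same name as in school.safety,
## otherwise merge() finds no shared column to join on
school.safety = merge(school.safety, borough_lookup, by = "Borough")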
If needed, run this code to create the borough_name column, then use borough_name instead of Borough in all of the preceding code (both creating bor_colors and the plotting code).
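If you would rather not join, another option is to index a named vector by the borough codes. This is only a sketch on a hypothetical toy data frame, and it assumes the codes are exactly K, M, Q, R, X. Unlike merge, which sorts the result by the key by default, this keeps the rows in their original order.

```r
# Names are the codes in the data, values are the display names
borough_names <- c(K = "Brooklyn", M = "Manhattan", Q = "Queens",
                   R = "Staten Island", X = "Bronx")

# Hypothetical toy data standing in for school.safety
toy <- data.frame(Borough = c("Q", "K", "X", "M", "K", "R"))

# as.character() guards against Borough being a factor, which would
# otherwise be used as integer positions instead of names
toy$borough_name <- unname(borough_names[as.character(toy$Borough)])
toy
```

Any code that is not in borough_names comes back as NA, which makes unexpected values easy to spot.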