I have a dataset with three columns: Player, Run_Value, and Num_years. I want Player on the y-axis, Run_Value as a bar along the x-axis, and Num_years either as a smaller bar within the Run_Value bar or next to it. Below is how my bar chart currently appears. The Num_years data is not displayed, and that is what I need.
In addition, if at the end of each Run_Value bar its Run_Value could appear, that would be great.
I have searched Stack Overflow for a comparable question but could not find one. If one exists, please share the link.
--I cannot see the two uploaded images on my screen and am not sure why.
Dplyr Code:
sp2358test <- spAll |>
filter(RunValue < 0) |>
select(Player, RunValue) |>
add_count(Player, name="Num_years") |>
filter(Num_years %in% c(2:8)) |>
arrange(Num_years, Player) |>
group_by(Player, Num_years) |>
summarise(Run_Value = sum(RunValue))
My ggplot2 code
sp2358test |>
  ggplot(aes(fill = Run_Value, y = Player, x = Run_Value)) +
  geom_bar(position = 'dodge', stat = 'identity') +
  lims(x = c(-250, 0)) +
  guides(fill = guide_legend(title = "Legend")) +
  ggtitle("Mets Starters with Top Run Value Totals",
          subtitle = "Data from Statcast for 2008-2021") +
  theme(plot.title = element_text(size = 14, color = "black"),
        plot.subtitle = element_text(size = 11, color = "purple"),
        text = element_text(color = "blue"),
        axis.text = element_text(color = "black"),
        legend.title = element_text(color = "black", size = 11)
  )
Results of dput for spAll dataframe:
structure(list(Player = c("deGrom, Jacob", "Dickey, R.A.", "Gee, Dillon",
"Harvey, Matt", "Lugo, Seth", "Maine, John", "Matz, Steven",
"Niese, Jonathon", "Pelfrey, Mike", "Santana, Johan", "Syndergaard, Noah",
"Wheeler, Zack"), Num_years = c(8L, 3L, 3L, 3L, 3L, 2L, 2L, 2L,
3L, 3L, 5L, 3L), Run_Value = c(-240.4, -56.3, -11.2, -70.3, -8.5,
-8.1, -16, -11.8, -20.7, -87.8, -77.5, -43.1)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -12L), groups = structure(list(
Player = c("deGrom, Jacob", "Dickey, R.A.", "Gee, Dillon",
"Harvey, Matt", "Lugo, Seth", "Maine, John", "Matz, Steven",
"Niese, Jonathon", "Pelfrey, Mike", "Santana, Johan", "Syndergaard, Noah",
"Wheeler, Zack"), .rows = structure(list(1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -12L), .drop = TRUE))
CodePudding user response:
You may want to avoid having two x-axes. Consider using facet_wrap instead. Note that you first need to reshape your data into long format. I've used df as the name of your dataset.
library(ggplot2)
library(dplyr)
library(tidyr)

# Long format: one row per Player/metric pair
df2 <- df %>%
  pivot_longer(cols = c(Num_years, Run_Value), names_to = "metric")

df2 %>%
  ggplot() +
  geom_col(aes(y = Player, x = value, fill = value)) +
  # Set the panel order with factor levels; rev(metric) would reverse the
  # vector itself and attach each value to the wrong panel label
  facet_wrap(~factor(metric, levels = c("Run_Value", "Num_years")),
             scales = "free_x") +
  ggtitle("Mets Starters with Top Run Value Totals",
          subtitle = "Data from Statcast for 2008-2021") +
  theme(plot.title = element_text(size = 14, color = "black"),
        plot.subtitle = element_text(size = 11, color = "purple"),
        text = element_text(color = "blue"),
        axis.text = element_text(color = "black"),
        legend.title = element_text(color = "black", size = 11)
  )
This gets you a chart with one panel per variable. The problem is that both variables share the same fill scale even though their values are on very different ranges. What is your intention?
CodePudding user response:
Here's another approach that is closer to your original and to what you describe wanting.
library(ggplot2)
library(dplyr)
library(tidyr)

# Long format: one row per Player/metric pair
df2 <- df %>%
  pivot_longer(cols = c(Num_years, Run_Value), names_to = "metric")

df2 %>%
  ggplot() +
  # One bar per metric for each player, drawn side by side
  geom_bar(aes(y = Player, x = value, group = metric, fill = metric),
           stat = 'identity', position = "dodge") +
  ggtitle("Mets Starters with Top Run Value Totals",
          subtitle = "Data from Statcast for 2008-2021") +
  theme(plot.title = element_text(size = 14, color = "black"),
        plot.subtitle = element_text(size = 11, color = "purple"),
        text = element_text(color = "blue"),
        axis.text = element_text(color = "black"),
        legend.title = element_text(color = "black", size = 11)
  )
One variable holds positive numbers and the other negative ones. That is why I omitted the x-axis limits from your original code (lims(x = c(-250, 0))): they would remove all the positive values of the Num_years variable.
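The question also asks for each bar's value to be printed at its end, which the code so far does not do. Here is a minimal sketch of one way to add that with geom_text, assuming a reasonably recent ggplot2 (3.3.0 or later, for horizontal dodging and expansion()). The three-row df is only a stand-in copied from the dput output, and the dodge width, hjust offsets, and text size are illustrative choices rather than required values.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Stand-in data: three rows copied from the dput output in the question
df <- data.frame(
  Player = c("deGrom, Jacob", "Santana, Johan", "Syndergaard, Noah"),
  Num_years = c(8L, 3L, 5L),
  Run_Value = c(-240.4, -87.8, -77.5)
)

# Long format: one row per Player/metric pair
df2 <- df %>%
  pivot_longer(cols = c(Num_years, Run_Value), names_to = "metric")

# The same dodge width is used for bars and labels so that they line up
dodge <- position_dodge(width = 0.9)

p <- ggplot(df2, aes(y = Player, x = value, group = metric, fill = metric)) +
  geom_col(position = dodge) +
  # Put each label just beyond the end of its bar:
  # to the left of negative bars, to the right of positive ones
  geom_text(aes(label = value, hjust = ifelse(value < 0, 1.1, -0.1)),
            position = dodge, size = 3) +
  # Pad the x-axis so the labels are not clipped at the panel edges
  scale_x_continuous(expand = expansion(mult = 0.1))

p
```

If you only want labels on the Run_Value bars, you could pass a filtered copy of the data to the text layer, for example geom_text(data = filter(df2, metric == "Run_Value"), ...).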