I have data showing how the absence rate affects students' grades in three different years. I'm trying to draw a one-dimensional scatter plot with the three years (G1, G2 and G3) on the y-axis and absences on the x-axis. The plot should mark occurrences of grade = 0 at each absences value, something like the picture provided below.
My data:
structure(list(absences = c("6", "4", "10", "2", "4", "10", "0",
"6", "o", "0", "0", "4", "2", "2", "0", "4", "6", "4", "16",
"4"), G1 = c(5, 5, 7, 15, 6, 15, 12, 6, 16, 14, 10, 10, 14, 10,
14, 14, 13, 8, 6, 8), G2 = c(6, 5, 8, 14, 10, 15, 12, 5, 18,
15, 8, 12, 14, 10, 16, 14, 14, 10, 5, 10), G3 = c(6, 6, 10, 15,
10, 15, 11, 6, 19, 15, 9, 12, 14, 11, 16, 14, 14, 10, 5, 10)), row.names = c(NA,
-20L), spec = structure(list(cols = list(absences = structure(list(), class = c("collector_character",
"collector")), G1 = structure(list(), class = c("collector_double",
"collector")), G2 = structure(list(), class = c("collector_double",
"collector")), G3 = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), delim = "\t"), class = "col_spec"), problems = <pointer: 0x55e465b58110>, class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"))
CodePudding user response:
Something like this?
library(dplyr)
library(tidyr)
library(ggplot2)
set.seed(2022)
tibble(
absences = sample(c(0:16), 20, replace = TRUE),
G1 = sample(c(0:16), 20, replace = TRUE),
G2 = sample(c(0:16), 20, replace = TRUE),
G3 = sample(c(0:16), 20, replace = TRUE)
) %>%
pivot_longer(
cols = -absences,
names_to = "key",
values_to = "value"
) %>%
# keep only the zero grades; `key` holds the year names, `value` holds the grades
filter(value == 0) %>%
ggplot(aes(absences, key)) +
# geom_jitter(color = "red", height = 0.1)
geom_point(color = "red")
I wasn't sure whether your grade and absence combinations are distinct. If they are not, points will overlap, and you can use the geom_jitter() call that I commented out instead of geom_point().
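To check whether jitter is needed at all, you can count how many zero grades fall on the same (absences, year) position. This is only a sketch on made-up toy data (the `toy` tibble and its values are hypothetical, not the asker's data); any count above 1 means points would be drawn on top of each other.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical toy data: two students share absences = 4 and both have G1 = 0
toy <- tibble(
  absences = c(4, 4, 10, 0),
  G1 = c(0, 0, 7, 0),
  G2 = c(0, 5, 0, 12),
  G3 = c(6, 0, 0, 0)
)

# Long format, keeping only the zero grades
zeros <- toy %>%
  pivot_longer(cols = -absences, names_to = "key", values_to = "value") %>%
  filter(value == 0)

# Number of zero grades at each (absences, year) position
overlap <- count(zeros, absences, key, name = "n")

# Jitter only vertically so the absences value stays readable on the x-axis
p <- ggplot(zeros, aes(absences, key)) +
  geom_jitter(color = "red", height = 0.1, width = 0)
```

Printing `overlap` shows which positions hold more than one point, and printing `p` draws the jittered version. If every count is 1, plain geom_point() is enough.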
Output:
CodePudding user response:
Here is one way you could start:
library(tidyverse)
df %>%
pivot_longer(
-absences
) %>%
mutate(absences = as.numeric(replace(absences, absences == "o", "0"))) %>%
# keep one row per absences/year/grade combination
# (summarise() cannot overwrite the grouping column `absences`)
distinct(absences, name, value) %>%
ggplot(aes(x = name, y = factor(absences))) +
geom_point(aes(size = value), color = "red") +
theme_minimal() +
labs(title = "Your title", y = "Absences", x = "Year") +
theme(legend.position = "bottom",
plot.title = element_text(hjust = 0.5)) +
guides(color = guide_legend(), size = guide_legend()) +
coord_flip()
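The code above expects an object called `df`. The dput() output in the question contains a `<pointer: ...>` entry, so it probably cannot be pasted back into R as is. A minimal sketch that rebuilds the same 20 rows as a plain data frame and applies the cleaning step is below. Treating "o" as a typo for "0" is an assumption taken from the mutate() call above, not something the asker confirmed.

```r
# Values copied from the dput() output in the question
df <- data.frame(
  absences = c("6", "4", "10", "2", "4", "10", "0", "6", "o", "0",
               "0", "4", "2", "2", "0", "4", "6", "4", "16", "4"),
  G1 = c(5, 5, 7, 15, 6, 15, 12, 6, 16, 14, 10, 10, 14, 10, 14, 14, 13, 8, 6, 8),
  G2 = c(6, 5, 8, 14, 10, 15, 12, 5, 18, 15, 8, 12, 14, 10, 16, 14, 14, 10, 5, 10),
  G3 = c(6, 6, 10, 15, 10, 15, 11, 6, 19, 15, 9, 12, 14, 11, 16, 14, 14, 10, 5, 10),
  stringsAsFactors = FALSE
)

# Assumption: "o" is a typo for "0"; replace it, then convert to numeric
df$absences <- as.numeric(replace(df$absences, df$absences == "o", "0"))

# Count the zero grades per year in this sample
zero_counts <- colSums(df[c("G1", "G2", "G3")] == 0)
```

Note that these 20 sample rows contain no grade of 0, so a plot filtered to zero grades will be empty for this sample. You will only see points once you run it on data that actually contains zeros.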