For my thesis I am using RStudio. I want to make a graph in ggplot2 with x = age (H2_lft) and y = IMT value (MeanIMT_alg). I want to plot several variables (cardiovascular risk factors) to see the relationship between each risk factor (e.g. smoking (H2_Roken), gender (H1_geslacht), ethnicity (H1_EtnTotaal)) and the IMT value at a given age. First, I plotted multiple lines in a single graph, with the lines for all variables drawn together. But I think this is a little too messy. What I actually want is multiple panels, each with x = age and y = IMT value, and a different variable in every panel.
I hope my explanation is clear enough and someone can help me :)
My first code (multiple lines in same plot) is:
library(ggplot2)
# One smoothed line per level of each risk factor, all in a single panel
t <- ggplot(data = Dataset, aes(x = H2_lft, y = MeanIMT_alg)) +
  geom_smooth(se = FALSE, aes(group = H1_EtnTotaal, colour = H1_EtnTotaal)) +
  geom_smooth(se = FALSE, aes(group = H2_Roken, colour = H2_Roken)) +
  geom_smooth(se = FALSE, aes(group = H1_geslacht, colour = H1_geslacht)) +
  # Ungrouped linear fit over all observations
  stat_smooth(method = lm, se = FALSE) +
  theme_classic()
t + labs(x = "Age (years)", y = "Mean IMT (mm)", title = "IMT", caption = "Figure 2: mean IMT", colour = "cardiovascular risk factors", fill = "cardiovascular risk factors")
To get multiple panels I used 'facet_wrap'. The problem, however, is that when I pass several variables to facet_wrap, R builds panels from their combinations. I want the groups to be independent of each other. For example: I want one line for Moroccan ethnicity, one line for current smokers and one line for male participants. I do not want a panel for Moroccan women who currently smoke, or one for Dutch men who never smoked. So I want the same lines as before, but split over several panels, one per variable.
The code that I used to accomplish this is:
library(ggplot2)
t <- ggplot(data = Dataset, aes(x = H2_lft, y = MeanIMT_alg)) +
  geom_smooth(se = FALSE, aes(group = H1_EtnTotaal, colour = H1_EtnTotaal)) +
  geom_smooth(se = FALSE, aes(group = H2_Roken, colour = H2_Roken)) +
  geom_smooth(se = FALSE, aes(group = H1_geslacht, colour = H1_geslacht)) +
  stat_smooth(method = lm, se = FALSE) +
  # Faceting on all three variables gives one panel per combination of their levels
  facet_wrap(~H1_EtnTotaal ~H2_Roken ~H1_geslacht, scales = "free_y") +
  theme_classic()
t + labs(x = "Age (years)", y = "Mean IMT (mm)", title = "IMT", caption = "Figure 2: mean IMT", colour = "cardiovascular risk factors", fill = "cardiovascular risk factors")
CodePudding user response:
I think it is generally easier to reshape the data to a long format for plotting with ggplot2. If you want separate legends for each of the categories, you can use the {ggnewscale} package to do so. Is this (approximately) what you're looking for?
library(ggnewscale)
library(ggplot2)
# Dummy data
Dataset <- data.frame(
H2_lft = runif(100, 18, 90),
MeanIMT_alg = rnorm(100),
H1_EtnTotaal = sample(LETTERS[1:5], 100, replace = TRUE),
H2_Roken = sample(LETTERS[6:8], 100, replace = TRUE),
H1_geslacht = sample(c("M", "F"), 100, replace = TRUE)
)
# Reshape data to long format
new <- tidyr::pivot_longer(Dataset, c(H1_EtnTotaal, H2_Roken, H1_geslacht))
ggplot(new, aes(H2_lft, MeanIMT_alg, group = value)) +
  # Ethnicity lines, with their own colour scale and legend
  geom_smooth(
    data = ~ subset(.x, name == "H1_EtnTotaal"),
    aes(colour = value),
    se = FALSE
  ) +
  scale_colour_discrete(name = "EtnTotaal") +
  new_scale_colour() +
  # Sex lines
  geom_smooth(
    data = ~ subset(.x, name == "H1_geslacht"),
    aes(colour = value),
    se = FALSE
  ) +
  scale_colour_discrete(name = "geslacht") +
  new_scale_colour() +
  # Smoking lines
  geom_smooth(
    data = ~ subset(.x, name == "H2_Roken"),
    aes(colour = value),
    se = FALSE
  ) +
  scale_colour_discrete(name = "Roken") +
  # Ungrouped linear fit, drawn once in every panel
  geom_smooth(
    method = lm, se = FALSE,
    aes(group = NULL)
  ) +
  # One panel per original variable
  facet_wrap(~ name)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
#> `geom_smooth()` using formula 'y ~ x'
Created on 2022-11-04 by the reprex package (v2.0.0)
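To make the reshaping step a bit more concrete, here is a minimal sketch of what pivot_longer does to two rows of data. The data frame `wide` and its values are made up purely for illustration; only the column names come from the question.

```r
library(tidyr)

# Hypothetical two-row example; the values are invented for illustration
wide <- data.frame(
  H2_lft = c(40, 55),
  MeanIMT_alg = c(0.6, 0.7),
  H1_EtnTotaal = c("A", "B"),
  H2_Roken = c("F", "G"),
  H1_geslacht = c("M", "F")
)

# Each of the three risk-factor columns becomes its own row per participant:
# `name` holds the original column name, `value` holds the category
long <- pivot_longer(wide, c(H1_EtnTotaal, H2_Roken, H1_geslacht))
long
```

Because every participant now appears once per risk factor, facet_wrap(~ name) gives one panel per variable rather than one panel per combination of levels.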
CodePudding user response:
Thank you so much for your response. This is exactly what I wanted. You helped me a great deal :)
However, I still have one small question. Maybe you can answer it as well... In your graphs I see a blue horizontal line along the x-axis. I wonder what this line represents. When I plot the graphs I am interested in, every panel also gets a blue line (not horizontal, but roughly linear) that does not represent the variable that I am interested in. For example: when plotting H1_geslacht (sex --> male/female) I see three lines instead of two. Can the blue line be the line for age (H2_lft)?
If so, is there a way to remove this line or to suppress the line from being visible? I hope you can help me with this question as well!
Kind regards, Marleen (and thanks again!)
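On the blue line: as far as I can tell from the code in the answer, it comes from the last geom_smooth layer (method = lm, se = FALSE, aes(group = NULL)), which mirrors the stat_smooth(method = lm, se = FALSE) call in the original question. That layer fits one linear regression of IMT on age over all observations in a panel, ignoring the groups, so it is an overall trend line rather than a line "for age". Leaving that layer out should remove it. A minimal sketch, using dummy data and only the H1_geslacht variable (the object names `p_groups` and `p_with_overall` are just for illustration):

```r
library(ggplot2)

set.seed(1)
# Dummy data, same idea as in the answer above
Dataset <- data.frame(
  H2_lft = runif(100, 18, 90),
  MeanIMT_alg = rnorm(100),
  H1_geslacht = sample(c("M", "F"), 100, replace = TRUE)
)

# Grouped smooths only: one line per level of H1_geslacht, no extra line
p_groups <- ggplot(Dataset, aes(H2_lft, MeanIMT_alg)) +
  geom_smooth(aes(colour = H1_geslacht), se = FALSE)

# Adding an ungrouped linear fit draws the additional (default blue) line
p_with_overall <- p_groups +
  geom_smooth(method = "lm", se = FALSE)

p_groups
```

In the faceted plot from the answer, the equivalent change would be to delete the final geom_smooth(method = lm, ...) block and keep everything else as it is.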