Creating a graph in ggplot2 with facet_wrap using multiple separate variables

Time:11-07

For my thesis I am using RStudio. I want to make a graph in ggplot2 with x = age (H2_lft) and y = IMT value (MeanIMT_alg). I want to plot several variables (cardiovascular risk factors) to see the relationship between a given risk factor (e.g. smoking (H2_Roken), gender (H1_geslacht) or ethnicity (H1_EtnTotaal)) and the IMT value at a certain age. First, I plotted multiple lines (each line represented a variable) in one graph, but I think this is a little too messy. What I actually want is multiple panels, each with x = age and y = IMT value, and a different variable in every panel.

I hope my explanation is clear enough and someone can help me :)

My first code (multiple lines in same plot) is:

library(ggplot2)

# All risk factors drawn as coloured lines in a single panel
t <- ggplot(data = Dataset, aes(x = H2_lft, y = MeanIMT_alg)) +
  geom_smooth(se = FALSE, aes(group = H1_EtnTotaal, colour = H1_EtnTotaal)) +
  geom_smooth(se = FALSE, aes(group = H2_Roken, colour = H2_Roken)) +
  geom_smooth(se = FALSE, aes(group = H1_geslacht, colour = H1_geslacht)) +
  stat_smooth(method = lm, se = FALSE) +
  theme_classic()
t + labs(x = "Age (years)", y = "Mean IMT (mm)", title = "IMT", caption = "Figure 2: mean IMT", color = "cardiovascular risk factors", fill = "cardiovascular risk factors")

To get multiple panels I used facet_wrap. The problem, however, is that when I pass several variables to facet_wrap, R builds the panels from combinations of those variables, whereas I want the groups to be independent of each other. For example: I want one graph with a line for Moroccan ethnicity, one line for current smoking and one line for male participants. I do not want a graph for Moroccan women who currently smoke, or for Dutch men who never smoked. So, I want the graph with all the lines, but split into several graphs.

The code that I used to accomplish this is:

# Same plot, now faceted on all three variables at once
t <- ggplot(data = Dataset, aes(x = H2_lft, y = MeanIMT_alg)) +
  geom_smooth(se = FALSE, aes(group = H1_EtnTotaal, colour = H1_EtnTotaal)) +
  geom_smooth(se = FALSE, aes(group = H2_Roken, colour = H2_Roken)) +
  geom_smooth(se = FALSE, aes(group = H1_geslacht, colour = H1_geslacht)) +
  stat_smooth(method = lm, se = FALSE) +
  facet_wrap(~ H1_EtnTotaal + H2_Roken + H1_geslacht, scales = "free_y") +
  theme_classic()
t + labs(x = "Age (years)", y = "Mean IMT (mm)", title = "IMT", caption = "Figure 2: mean IMT", color = "cardiovascular risk factors", fill = "cardiovascular risk factors")
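
The behaviour described above can be illustrated with a small sketch: when several variables are combined in one facet_wrap formula, ggplot2 draws one panel for every combination of levels that occurs in the data. The data frame `dummy` and its factor levels below are made up for illustration and are not the real study data.

```r
# Hypothetical stand-in for the real data set (levels are invented)
set.seed(1)
dummy <- data.frame(
  H1_EtnTotaal = sample(LETTERS[1:5], 100, replace = TRUE),
  H2_Roken     = sample(LETTERS[6:8], 100, replace = TRUE),
  H1_geslacht  = sample(c("M", "F"), 100, replace = TRUE)
)

# Faceting on three variables at once gives one panel per observed
# combination of their levels
crossed_panels <- nrow(unique(dummy[c("H1_EtnTotaal", "H2_Roken", "H1_geslacht")]))

# The layout asked for here would need only one panel per variable
per_variable_panels <- 3

crossed_panels  # at most 5 * 3 * 2 = 30 combinations
```

This is why the faceted plot shows panels such as "ethnicity A, smoking level F, male" instead of one panel per risk factor.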

CodePudding user response:

I think it might be generally easier to reshape the data to a long format for plotting with ggplot2. If you want separate legends for each of the categories, you can use the {ggnewscale} package to do so. Is this (approximately) what you're looking for?

library(ggnewscale)
library(ggplot2)

# Dummy data
Dataset <- data.frame(
  H2_lft = runif(100, 18, 90),
  MeanIMT_alg = rnorm(100),
  H1_EtnTotaal = sample(LETTERS[1:5], 100, replace = TRUE),
  H2_Roken = sample(LETTERS[6:8], 100, replace = TRUE),
  H1_geslacht = sample(c("M", "F"), 100, replace = TRUE)
)

# Reshape data to long format
new <- tidyr::pivot_longer(Dataset, c(H1_EtnTotaal, H2_Roken, H1_geslacht))

ggplot(new, aes(H2_lft, MeanIMT_alg, group = value)) +
  geom_smooth(
    data = ~ subset(.x, name == "H1_EtnTotaal"),
    aes(colour = value),
    se = FALSE
  ) +
  scale_colour_discrete(name = "EtnTotaal") +
  new_scale_colour() +
  geom_smooth(
    data = ~ subset(.x, name == "H1_geslacht"),
    aes(colour = value),
    se = FALSE
  ) +
  scale_colour_discrete(name = "geslacht") +
  new_scale_colour() +
  geom_smooth(
    data = ~ subset(.x, name == "H2_Roken"),
    aes(colour = value),
    se = FALSE
  ) +
  scale_colour_discrete(name = "Roken") +
  geom_smooth(
    method = lm, se = FALSE,
    aes(group = NULL)
  ) +
  facet_wrap(~ name)
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
#> `geom_smooth()` using formula 'y ~ x'

Created on 2022-11-04 by the reprex package (v2.0.0)
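
To make the reshaping step more concrete, here is a minimal sketch of what tidyr::pivot_longer() does to the three risk-factor columns. The three-row data frame `small` and its values are invented purely for illustration.

```r
library(tidyr)

# Tiny hypothetical data set (values are made up)
small <- data.frame(
  H2_lft       = c(30, 45, 60),
  MeanIMT_alg  = c(0.55, 0.62, 0.71),
  H1_EtnTotaal = c("A", "B", "A"),
  H2_Roken     = c("F", "G", "H"),
  H1_geslacht  = c("M", "F", "M")
)

# Stack the three risk-factor columns on top of each other
long <- pivot_longer(small, c(H1_EtnTotaal, H2_Roken, H1_geslacht))

# Each original row now appears once per risk factor:
# `name` holds the variable name, `value` holds its level
nrow(long)         # 3 rows x 3 risk factors = 9
unique(long$name)
```

Because `name` now identifies the risk factor, facet_wrap(~ name) gives one panel per variable rather than one per combination of levels. In this sketch all three columns are character vectors; if the real columns have different types, they may need to be converted to a common type before reshaping.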

CodePudding user response:

Thank you so much for your response. This is exactly what I wanted. You helped me a great deal :)

However, I still have one small question. Maybe you can answer it as well. In your graphs I see a blue horizontal line along the x-axis, and I wonder what this line represents. When I plot the graphs I am interested in, I also get a blue line in every graph (not horizontal, but roughly linear) that does not represent the variable I am interested in. For example: when plotting H1_geslacht (sex: male/female) I see three lines instead of two. Can the blue line be the line for age (H2_lft)?

If so, is there a way to remove this line or to suppress the line from being visible? I hope you can help me with this question as well!

Kind regards, Marleen (and thanks again!)
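
Going by the code in the answer above, the extra blue line most likely comes from its last layer, geom_smooth(method = lm, se = FALSE, aes(group = NULL)). That layer fits one linear regression of MeanIMT_alg on H2_lft over all observations in a panel, and since no colour is mapped it is drawn in geom_smooth's default blue. It is the overall trend rather than a line for one category. A minimal sketch of the plot without that layer follows; it is simplified (no {ggnewscale}, so one shared legend) and uses made-up dummy data.

```r
library(ggplot2)
library(tidyr)

# Dummy data, same shape as in the answer (values are random)
set.seed(42)
Dataset <- data.frame(
  H2_lft = runif(100, 18, 90),
  MeanIMT_alg = rnorm(100),
  H1_EtnTotaal = sample(LETTERS[1:5], 100, replace = TRUE),
  H2_Roken = sample(LETTERS[6:8], 100, replace = TRUE),
  H1_geslacht = sample(c("M", "F"), 100, replace = TRUE)
)
new <- pivot_longer(Dataset, c(H1_EtnTotaal, H2_Roken, H1_geslacht))

# One smoothed line per level, one panel per risk factor.
# The geom_smooth(method = lm, aes(group = NULL)) layer is left out,
# so no overall regression line is drawn.
p <- ggplot(new, aes(H2_lft, MeanIMT_alg, group = value, colour = value)) +
  geom_smooth(se = FALSE) +
  facet_wrap(~ name)
p
```

In the original answer the same effect should follow from deleting only that final geom_smooth() call and keeping everything else as it is.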
