In R, how do I label the fill (or group) of a geom_line graph with multiple lines?

Time:08-12

I have the following code, based on a dataframe that has rows for each year and one of six newspapers, and then two columns with numbers.
For example:
year|medium_name|n_transsexu|n_transgende
2012|New York Times|15|22
2012|Daily Mail|10|20
2013|New York Times|10|12
etc.

With the following code, I want to make a geom_line graph showing how the two number columns change over time, separately for each medium:

yeardf_medium %>%
  ggplot(aes(x = year, group = medium_name)) +
    geom_line(aes(y = n_transsexu, col = "transsexuell etc.")) +
    geom_line(aes(y = n_transgende, col = "transgender etc.")) +
    labs(title = "Nennung spezifischer Schlüsselwörter über Zeit (nach Medium)",
         subtitle = "transsexuell/Transsexualismus etc. vs. transgender/Transgenderismus etc",
         x = "Jahr",
         y = "Anzahl Nennungen")

It results in the following graph: (image: graph with no labels for the group lines)

As you can see, the graph shows different lines for the mediums (as I intended with group = medium_name; it would also work with fill = medium_name). But I can't differentiate between the different mediums, as the colors are just changed based on the columns with the numbers.
How can I add a legend for the group (or fill)?

CodePudding user response:

The following solutions are based on this generated data:

library(dplyr)    # provides the %>% pipe
library(tidyr)    # pivot_longer()
library(ggplot2)

# Three media, one row per medium and year (84 rows); counts are random
year <- rep(1995:2022, each = 3)
medium_name <- rep(c("New York Times", "Daily Mail", "News Paper"), times = 28)
n_transsexu <- abs(year - 1995 + sample(-20:-1, 84, replace = TRUE))
n_transgende <- abs(year - 1995 + sample(-20:-1, 84, replace = TRUE))

yeardf_medium <- data.frame(year, medium_name, n_transsexu, n_transgende)

The simplest way to color the lines by medium_name is to change "group = medium_name" to "col = medium_name" and to remove the two "col = " arguments from your geom_line() calls:

yeardf_medium %>%
  ggplot(aes(x = year, col = medium_name)) +
  geom_line(aes(y = n_transsexu)) +
  geom_line(aes(y = n_transgende)) +
  labs(title = "Nennung spezifischer Schlüsselwörter über Zeit (nach Medium)",
       subtitle = "transsexuell/Transsexualismus etc. vs. transgender/Transgenderismus etc",
       x = "Jahr",
       y = "Anzahl Nennungen")

Graph with colors for every Medium

However, this graph looks messy, and you cannot tell the two keywords (transgender vs. transsexual) apart. You can fix this by giving each keyword its own line type. To do so, I suggest transforming your data frame into long format with pivot_longer(). In general, ggplot2 works best with data in long format.
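As a minimal sketch of what "long format" means here, the snippet below reshapes only the three example rows from the question. The object names wide and long are placeholders chosen for this illustration.

```r
library(tidyr)

# The three example rows from the question, in wide format
# (one column per keyword)
wide <- data.frame(
  year         = c(2012, 2012, 2013),
  medium_name  = c("New York Times", "Daily Mail", "New York Times"),
  n_transsexu  = c(15, 10, 10),
  n_transgende = c(22, 20, 12)
)

# Long format: one row per year, medium and keyword.
# The former column names end up in "keyword", the counts in "n".
long <- pivot_longer(
  wide,
  cols      = c(n_transsexu, n_transgende),
  names_to  = "keyword",
  values_to = "n"
)

print(long)
```

Each original row becomes two rows, one per keyword, so "keyword" can be mapped to an aesthetic such as linetype. The plot code that follows applies the same reshaping to the full data frame.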

yeardf_medium %>%
  pivot_longer(c(n_transsexu, n_transgende), names_to = "keyword", values_to = "n") %>%
  ggplot(aes(x = year, y = n)) +
  # size sets the line width; ggplot2 >= 3.4.0 prefers linewidth = 1.5
  geom_line(aes(col = medium_name, linetype = keyword), size = 1.5) +
  labs(title = "Nennung spezifischer Schlüsselwörter über Zeit (nach Medium)",
       subtitle = "transsexuell/Transsexualismus etc. vs. transgender/Transgenderismus etc",
       x = "Jahr",
       y = "Anzahl Nennungen")

Graph with different line types for every Medium

However, this still looks quite messy. I would therefore recommend using facet_wrap() to split the plot into one panel per keyword or one panel per medium:

yeardf_medium %>%
  pivot_longer(c(n_transsexu, n_transgende), names_to = "keyword", values_to = "n") %>%
  ggplot(aes(x = year, y = n)) +
  geom_line(aes(col = medium_name), size = 1.5) +
  labs(title = "Nennung spezifischer Schlüsselwörter über Zeit (nach Medium)",
       subtitle = "transsexuell/Transsexualismus etc. vs. transgender/Transgenderismus etc",
       x = "Jahr",
       y = "Anzahl Nennungen") +
  facet_wrap(~keyword)

Graph split up for each keyword

yeardf_medium %>%
  pivot_longer(c(n_transsexu, n_transgende), names_to = "keyword", values_to = "n") %>%
  ggplot(aes(x = year, y = n)) +
  geom_line(aes(col = keyword), size = 1.5) +
  labs(title = "Nennung spezifischer Schlüsselwörter über Zeit (nach Medium)",
       subtitle = "transsexuell/Transsexualismus etc. vs. transgender/Transgenderismus etc",
       x = "Jahr",
       y = "Anzahl Nennungen") +
  facet_wrap(~medium_name)

Graph split up for each medium
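In any of the long-format plots, the legends show the raw column names (medium_name, keyword, n_transsexu, n_transgende). The sketch below is one possible way to give them readable titles and entries, using labs() for the legend titles and scale_linetype_discrete() for the entries. The legend titles "Medium" and "Schlüsselwort", the object names long and p, and the fixed seed are assumptions made for this illustration, not part of the original code.

```r
library(ggplot2)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the illustration is reproducible

# Same kind of generated data as above
year <- rep(1995:2022, each = 3)
medium_name <- rep(c("New York Times", "Daily Mail", "News Paper"), times = 28)
yeardf_medium <- data.frame(
  year, medium_name,
  n_transsexu  = abs(year - 1995 + sample(-20:-1, 84, replace = TRUE)),
  n_transgende = abs(year - 1995 + sample(-20:-1, 84, replace = TRUE))
)

long <- pivot_longer(yeardf_medium, c(n_transsexu, n_transgende),
                     names_to = "keyword", values_to = "n")

p <- ggplot(long, aes(x = year, y = n)) +
  geom_line(aes(colour = medium_name, linetype = keyword)) +
  # labs() sets one legend title per mapped aesthetic
  labs(x = "Jahr", y = "Anzahl Nennungen",
       colour = "Medium", linetype = "Schlüsselwort") +
  # a named vector maps the raw column names to readable legend entries
  scale_linetype_discrete(labels = c(
    n_transgende = "transgender etc.",
    n_transsexu  = "transsexuell etc."
  ))

print(p)
```

The same labs() call works with facet_wrap(); only the aesthetics that are actually mapped (here colour and linetype) get a legend.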
