How to create a line graph of 2 variables for 3 entities through the years?

Time:12-24

I have an Excel file with data for 3 countries: their numbers of marriages and divorces from 1960 to 2019. I need to create a graph with y = the value of each variable for each country through the years. I have tried the code below, but I can't make it work and I'm not sure what I'm doing wrong. (I have to use ggplot; it's a class requirement.)

library(ggplot2)
Anos <- factor(CASDIV$Ano)
names(CASDIV)



colors <- c("Casamentos: Croácia" = "Blue", "Casamentos: Irlanda" = "Orange",
            "Casamentos: Malta" = "Yellow", "Divórcios: Croácia" = "red",
            "Divórcios: Irlanda" = "Green", "Divórcios: Malta" = "Brown")

ggplot(CASDIV, aes(x= Ano)) +
  geom_line(data=subset(CASDIV, País== "HR - Croácia"), aes(y=CASDIV$Casamentos, color = "Casamentos : Croácia"), size = 0,01) +
  geom_line(data=subset(CASDIV, País== "IE - Irlanda"), aes(y=CASDIV$Casamentos, color = "Casamentos : Irlanda"), size=0,01) +
  geom_line(data = subset(CASDIV, País=="MT - Malta"), aes(y=CASDIV$Casamentos, color = "casamentos: Malta"), size=0,01) +
  geom_line(data=subset(CASDIV, País== "HR - Croácia"), aes(y=CASDIV$Divórcios, color = "Divórcios : Croácia"), size=0,01) +
  geom_line(data=subset(CASDIV, País == "IE - Irlanda"), aes(y=CASDIV$Divórcios, color = "Divórcios : Irlanda"), size=0,01) +
  geom_line(data=subset(CASDIV, País=="MT - Malta"), aes(y=CASDIV$Divórcios, color = "Divórcios : Malta"), size=0,01) +
  labs(x="Anos", y= "Valor", Colour = "Legenda") +
  scale_color_manual(values= colors)

MRE:

2017,HR - Croácia,20310,6265
2018,HR - Croácia,19921,6125
2019,HR - Croácia,19761,5936
2017,IE - Irlanda,22021,0
2018,IE - Irlanda,21053,0
2019,IE - Irlanda,20313,0
2016,MT - Malta,3034,371
2017,MT - Malta,2934,312
2018,MT - Malta,2831,349
2019,MT - Malta,2674,354

Answer:

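There are several issues with your code. First, as I already mentioned in my comment, you use aes(y = CASDIV$...) in your code, which is not recommended and in your case results in an error, because the full column no longer matches the rows of the subset passed via data. Second, you use a comma as the decimal separator in size=0,01, which is the reason for the mysterious "Error: stat must be either a string or a Stat object, not a numeric vector" message: R reads this as size = 0 followed by a second, unnamed argument 01, which ends up matched to stat. Always use . as the decimal mark. Finally, while you took the right approach for the colors, you have to make sure that the labels you use inside aes() are exactly the same as the names in your color vector. "Casamentos : Croácia" (space before the colon) and "casamentos: Malta" (lowercase c) do not match any of them.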
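To see why the comma produces that particular message, here is a minimal sketch in base R. Note that demo_layer() is a hypothetical stand-in, not a ggplot2 function; it only assumes the same leading arguments as geom_line() (mapping, data, stat, position, then ...).

```r
# Hypothetical stand-in: mimics only the leading arguments of geom_line()
demo_layer <- function(mapping = NULL, data = NULL, stat = "identity",
                       position = "identity", ...) {
  list(mapping = mapping, stat = stat, extra = list(...))
}

# `size = 0,01` is read as two arguments: `size = 0` and an unnamed `01`
res <- demo_layer(data = "subset", "aes", size = 0, 01)

res$stat        # the unnamed 01 was matched to `stat`
res$extra$size  # `size` itself only received 0
```

The unnamed value fills the first free positional slot, which here is stat, so a function that validates stat would complain about a numeric vector.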

Note: A size of 0.01 makes the lines nearly invisible so I switched to 0.1.

Using some fake random data to mimic your real data:

library(ggplot2)

set.seed(123)

CASDIV <- data.frame(
  Ano = seq(1960, 2020, 10),
  País = rep(c("HR - Croácia", "IE - Irlanda", "MT - Malta"), each = 7),
  Casamentos = runif(21),
  Divórcios = runif(21)
)
Anos <- factor(CASDIV$Ano)

colors <- c("Casamentos: Croácia" = "Blue", "Casamentos: Irlanda" = "Orange",
            "Casamentos: Malta" = "Yellow", "Divórcios: Croácia" = "red",
            "Divórcios: Irlanda" = "Green", "Divórcios: Malta" = "Brown")

ggplot(CASDIV, aes(x= Ano)) +
  geom_line(data=subset(CASDIV, País== "HR - Croácia"), aes(y=Casamentos, color = "Casamentos: Croácia"), size = 0.1) +
  geom_line(data=subset(CASDIV, País== "IE - Irlanda"), aes(y=Casamentos, color = "Casamentos: Irlanda"), size=0.1) +
  geom_line(data = subset(CASDIV, País=="MT - Malta"), aes(y=Casamentos, color = "Casamentos: Malta"), size=0.1) +
  geom_line(data=subset(CASDIV, País== "HR - Croácia"), aes(y=Divórcios, color = "Divórcios: Croácia"), size=0.1) +
  geom_line(data=subset(CASDIV, País == "IE - Irlanda"), aes(y=Divórcios, color = "Divórcios: Irlanda"), size=0.1) +
  geom_line(data=subset(CASDIV, País=="MT - Malta"), aes(y=Divórcios, color = "Divórcios: Malta"), size=0.1) +
  labs(x="Anos", y= "Valor", color = "Legenda") +
  scale_color_manual(values= colors)
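Because the names of colors have to match the strings given to color = exactly, a quick sanity check before plotting can save some guessing. The following is only a sketch in base R; labels_used and fixed are illustrative names, and the labels are copied from the code in the question.

```r
colors <- c("Casamentos: Croácia" = "Blue", "Casamentos: Irlanda" = "Orange",
            "Casamentos: Malta" = "Yellow", "Divórcios: Croácia" = "red",
            "Divórcios: Irlanda" = "Green", "Divórcios: Malta" = "Brown")

# Labels as written in the question's aes(color = ...) calls
labels_used <- c("Casamentos : Croácia", "Casamentos : Irlanda", "casamentos: Malta",
                 "Divórcios : Croácia", "Divórcios : Irlanda", "Divórcios : Malta")

# Labels that have no matching name in `colors`
unmatched <- setdiff(labels_used, names(colors))
length(unmatched)

# Removing the space before the colon and capitalising the first letter
# brings every label in line with the names of `colors`
fixed <- sub(" :", ":", labels_used, fixed = TRUE)
fixed <- paste0(toupper(substring(fixed, 1, 1)), substring(fixed, 2))
setdiff(fixed, names(colors))
```

Any label reported by setdiff() would not pick up its intended colour from scale_color_manual().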

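While this code works, it is probably not the most efficient. With some data wrangling you could simplify the plotting code considerably: reshape the data to long format and build the legend label in a single column, so that one geom_line() is enough: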

library(tidyr)
library(dplyr)

CASDIV_long <- CASDIV %>% 
  pivot_longer(-c(Ano, País)) %>% 
  mutate(color = paste(name, substring(País, 5), sep = ":"))

ggplot(CASDIV_long, aes(x= Ano)) +
  geom_line(aes(y = value, color = color), size = .1) +
  labs(x="Anos", y= "Valor", color = "Legenda") +
  scale_color_manual(values= colors)

Created on 2021-12-23 by the reprex package (v2.0.1)
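The mutate() step in the long-format version is the least obvious part, so here is a small base R sketch of what it computes. pivot_longer() puts the former column names into a column called name by default (and the numbers into value); in the sketch the string "Casamentos" stands in for one such name, and pais and rest are illustrative variable names.

```r
pais <- c("HR - Croácia", "IE - Irlanda", "MT - Malta")

# Keep everything from the 5th character on: this drops "HR -"
# but keeps the space that follows it
rest <- substring(pais, 5)

# Joining with ":" therefore yields "<variable>: <country>",
# the same pattern used for the names of the `colors` vector
labels <- paste("Casamentos", rest, sep = ":")
labels
```

This relies on every country code being exactly two letters followed by " - ". With a different prefix length the start position in substring() would need to change.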
