I am working with a dataset that looks similar to the sample_df below:
sample_df <- data.frame(
Date = c("2018-01-01", "2018-01-02", "2018-01-03", "2018-01-04", "2018-01-05"),
A = c(3304, 3223, 3138, 3090, 3088),
B = c(5951, 5972, 5981, 5957, 5973),
C = c(1629, 1592, 1578, 1566, 1577),
D = c(2380, 2401, 2408, 2402, 2399)
)
I want to see if there is a relationship between columns A, B, and D by plotting a line graph with three lines representing the A, B, and D columns (we are gonna ignore column C for now). Note that the Y-axis should represent the "Date" and the X-axis should be the unit of columns A, B, and D. The graph should also have a legend and a way to distinguish the three lines.
I reckon that I am gonna have to use "ggplot", but I can only create three separate line graphs. So it would be great if you could teach me a way to combine all three on the same plot.
Thank you!
CodePudding user response:
The easiest solution is to convert the data from its current "wide" format to a "long" formatted data frame using the pivot_longer() function from the tidyr package.
With the variable names in a single column, ggplot makes quick work of plotting the values, and mapping that column to color gives you one line per variable plus a legend.
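library(dplyr)
library(tidyr)
library(ggplot2)
# Reshape to long format: one row per Date/Category pair
long_df <- sample_df %>% pivot_longer(-Date, names_to = "Category", values_to = "values")
# Layers must be joined with `+`; group = Category draws one line per variable
ggplot(long_df, aes(x = Date, y = values, color = Category, group = Category)) +
  geom_line()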
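Since the question asks to leave out column C and to make the three lines easy to tell apart, here is a minimal sketch of one way to adapt the approach above. It assumes the Date strings should be parsed with as.Date() and that varying the line type in addition to the color is an acceptable way to distinguish the lines; the names abd_long and p, and the axis and legend labels, are my own choices for illustration, not something from the question.

```r
library(tidyr)
library(ggplot2)

sample_df <- data.frame(
  Date = c("2018-01-01", "2018-01-02", "2018-01-03", "2018-01-04", "2018-01-05"),
  A = c(3304, 3223, 3138, 3090, 3088),
  B = c(5951, 5972, 5981, 5957, 5973),
  C = c(1629, 1592, 1578, 1566, 1577),
  D = c(2380, 2401, 2408, 2402, 2399)
)

# Assumption: treat Date as a real date so the axis is continuous
sample_df$Date <- as.Date(sample_df$Date)

# Keep only Date, A, B and D (column C is dropped), then reshape to long format
abd_long <- pivot_longer(
  sample_df[, c("Date", "A", "B", "D")],
  cols = -Date,
  names_to = "Category",
  values_to = "values"
)

# Color and line type both vary by Category; giving both scales the same
# title lets ggplot2 merge them into a single legend
p <- ggplot(abd_long, aes(x = Date, y = values, color = Category, linetype = Category)) +
  geom_line() +
  labs(x = "Date", y = "Value", color = "Column", linetype = "Column")

print(p)
```

If you really do want Date on the vertical axis, as the question describes, adding coord_flip() to the plot should swap the two axes.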