Giving Specific Breaks in ggplot

Time:03-08

I have CSV data with two columns, and I need to plot Time on the x-axis and count on the y-axis. The time values range from September 2008 to December 2021, with one data point per month.

This is my current plot: [image]

I want to show only five specific time points on the x-axis, like this:

[image]

This is what I tried:

library(ggplot2)
theme_set(
  theme_bw() +
    theme(legend.position = "top")
)

result <- read.csv("Downloads/Questions Trend - Questions Trend.csv")
p <- ggplot(result, aes(result$Time_Formatted, result$VCS_Feature_History_Sanitize_Trnd)) +
     geom_point(size = 0.1) + xlab("Month") + ylab("Temporal Trend")
p + geom_smooth(method = "loess", color = "red")

I also tried the code below, which removed some of the axis labels, but I still cannot restrict the axis to specific points.

library(ggplot2)
library(scales)
theme_set(
  theme_bw() +
    theme(legend.position = "top")
)

result <- read.csv("Downloads/Questions Trend - Questions Trend.csv")
result$Time_Formatted <- as.Date(result$Time_Formatted)
p <- ggplot(result, aes(result$Time_Formatted, result$VCS_Feature_History_Sanitize_Trnd)) +
     geom_point(size = 0.1) + xlab("Month") + ylab("Temporal Trend") +
     scale_x_date(date_breaks = "years", date_labels = "%b-%y")
p + geom_smooth(method = "loess", color = "red")

How can I set specific break points on the x-axis?

CodePudding user response:

You should use the scale_x_date() function of the ggplot2 package.

For example, here is code I regularly use at work when I need to plot time data:

ggplot2::scale_x_date(name   = " ",
                      breaks = function(date) seq.Date(from = lubridate::ymd("2020-01-01") + 1,
                                                       to = lubridate::today(),
                                                       by = "1 month"),
                      limits = c(lubridate::ymd("2020-07-13"),
                                 lubridate::today()),
                      expand = c(0, 0),
                      labels = scales::date_format("%b %Y"))

With breaks, you can choose which dates are shown, for example only one date per month.

With labels and the date_format() function from the scales package, you can choose how the dates are displayed. Here, I show the abbreviated month name followed by the four-digit year.
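If the ticks should sit exactly on the first day of each month, one way to sketch the break vector is to start the sequence on a 1st and step by one month. The dates below are made-up examples, not taken from the question's data, and only base R is used:

```r
# One break per month, always on the 1st, between two example dates.
monthly_breaks <- seq(as.Date("2020-01-01"), as.Date("2020-06-01"), by = "1 month")

# Numeric format codes are locale-independent; "%b" would print
# month names in whatever locale the R session is using.
format(monthly_breaks, "%Y-%m")
```

A vector like this can be passed to the breaks argument of scale_x_date() instead of a function.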

CodePudding user response:

Most of your problems here are related to reading in the date data so that the format is correctly recognised - this can be done by specifying it explicitly:

result <- read.csv("Test.csv")
result$Time_Formatted <- as.Date(result$Time_Formatted, "%m/%d/%y")
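It can be worth checking that the parse worked before plotting. The snippet below is a small sketch with hypothetical sample strings in month/day/two-digit-year form (the real CSV values may look different); strings that do not match the format come back as NA, so an NA check catches a wrong format early:

```r
# Hypothetical sample values, assumed to be month/day/two-digit year.
raw <- c("9/1/08", "10/1/08", "12/1/21")

parsed <- as.Date(raw, format = "%m/%d/%y")

# Any string that does not match the format is returned as NA.
stopifnot(!anyNA(parsed))

# Earliest and latest parsed dates, useful for a quick sanity check.
range(parsed)
```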

Then it's simply a case of making a vector of the dates where you want breaks and passing it to scale_x_date():

date_breaks <- as.Date(c("1/7/10", "1/12/12", "1/1/14", "1/2/15", "1/3/16"), "%d/%m/%y")

p <- ggplot(result, aes(Time_Formatted, VCS_Feature_History_Sanitize_Trnd)) +
      geom_point(size = 0.1) + xlab("Month") + ylab("Temporal Trend") +
      scale_x_date(breaks = date_breaks, date_labels = "%b-%y")
p + geom_smooth(method = "loess", color = "red")

Note that I've removed the explicit reference to "result" in the aes() function, as it is unnecessary and deprecated according to the warning it produces. The end result is:

Output plot
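For anyone who wants to try this without the original CSV, here is a self-contained sketch of the same approach. The data frame demo and its count column are synthetic stand-ins (random values, not the question's data), and the break dates are the same five dates written in ISO format so no format string is needed:

```r
library(ggplot2)

# Synthetic monthly data covering Sep 2008 to Dec 2021 (values are made up).
set.seed(1)
demo <- data.frame(
  Time_Formatted = seq(as.Date("2008-09-01"), as.Date("2021-12-01"), by = "1 month")
)
demo$count <- cumsum(rnorm(nrow(demo)))

# The five dates to label on the x-axis, in unambiguous ISO format.
date_breaks <- as.Date(c("2010-07-01", "2012-12-01", "2014-01-01",
                         "2015-02-01", "2016-03-01"))

# Points plus a loess trend line, with ticks only at the chosen dates.
p <- ggplot(demo, aes(Time_Formatted, count)) +
  geom_point(size = 0.1) +
  geom_smooth(method = "loess", formula = y ~ x, color = "red") +
  scale_x_date(breaks = date_breaks, date_labels = "%b-%y") +
  xlab("Month") + ylab("Temporal Trend")
```

Printing p draws the plot. Breaks that fall outside the range of the data (or outside any limits set on the scale) are not drawn, so it helps to confirm that every break date lies between the first and last observation.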
