How to replicate a picture with histogram and CDF function in one picture?

Time:05-31

I'm trying to replicate this figure with my own data using ggplot2.

In my case, I don't care about the colors and legend at the moment. I just need to show the number of new repositories over a one-year period together with the CDF (cumulative distribution function) of the creation date, using a dual y-axis with the 1.0 value positioned in the upper right corner, as in the example.

So far I got this code:

repo.count <- read.csv("http://pastebin.com/raw.php?i=PfUgyrt0",sep=",")
repo.count$createdAt <- as.Date(repo.count$createdAt)

ggplot(data = repo.count, aes(x = createdAt)) +
  geom_histogram(colour = 1, fill = "white", position = "identity") +
  xlab('Date') +
  ylab('# New Repositories') +
  ggtitle('') +
  scale_x_date(labels = scales::date_format("%Y-%b"),
               breaks = by_month(date.groups$createdAt, 2)) +
  scale_y_continuous(sec.axis = sec_axis(~ . / max(repo.count$total),
                                         name = "Cumulative distribution")) +
  theme(axis.text.x = element_text(angle = 90)) +
  stat_ecdf(size = 1)

But it's far from what I'd like to achieve.

Answer:

I wonder if you are perhaps not looking for a histogram as such, but just a bar plot of new repos per day with a line indicating cumulative repos? If so, use geom_col instead of geom_histogram, and map the y aesthetic to the repositories column:

ggplot(data = repo.count, aes(x = createdAt, y = repositories)) +
  geom_col(fill = "green4", color = NA) +
  # Cumulative total, divided by 100 so it fits on the daily-count axis
  geom_step(aes(y = total / 100)) +
  xlab('Date') +
  ylab('# New Repositories') +
  ggtitle('') +
  scale_x_date(labels = scales::date_format("%Y-%b"),
               date_breaks = 'month') +
  # The secondary axis undoes the division by 100 applied in geom_step
  scale_y_continuous(sec.axis = sec_axis(~ . * 100,
                                         name = "Cumulative repos")) +
  theme_classic(base_size = 16) +
  theme(axis.text.x = element_text(angle = 90))
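
If the right-hand axis should show the CDF itself, running from 0 to 1 with 1.0 at the top as the question asks, one possible variation is to rescale the cumulative share by the tallest bar instead of a fixed 100. This is only a sketch under assumptions: the data frame below is synthetic stand-in data (the real file is not reproduced here), the column names createdAt and repositories are taken from the code above, and the scaling factor k is a name introduced for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for repo.count: one row per day with a daily count.
# Replace this with the real data; only the column names are assumed.
set.seed(1)
repo.count <- data.frame(
  createdAt    = seq(as.Date("2015-01-01"), by = "day", length.out = 365),
  repositories = rpois(365, lambda = 20)
)

# Empirical CDF of creation dates: cumulative share of all repositories.
repo.count$cdf <- cumsum(repo.count$repositories) / sum(repo.count$repositories)

# Factor mapping the CDF's [0, 1] range onto the count axis, so that
# a CDF value of 1.0 lines up with the tallest bar.
k <- max(repo.count$repositories)

p <- ggplot(repo.count, aes(x = createdAt)) +
  geom_col(aes(y = repositories), fill = "grey70", colour = NA) +
  geom_step(aes(y = cdf * k)) +
  scale_x_date(date_breaks = "2 months", date_labels = "%Y-%b") +
  scale_y_continuous(
    name = "# New Repositories",
    expand = c(0, 0),  # no padding, so the top of the panel corresponds to CDF = 1
    sec.axis = sec_axis(~ . / k, name = "Cumulative distribution")
  ) +
  xlab("Date") +
  theme_classic() +
  theme(axis.text.x = element_text(angle = 90))

print(p)
```

The transformation passed to sec_axis has to be the inverse of the one applied to the line (multiply by k for the line, divide by k for the axis); otherwise the right-hand labels will not match the plotted curve.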
