R - Including two y-axis in ggplot

Time:05-13

Could anyone help me to add a second y-axis in ggplot or alternatively combine the two separate ggplots I have made? (R-code attached)

The data: the data frame Deals1 has three columns: Date (the year), N (the number of transactions each year), and total_tvalue (the total transaction value each year).

The dataset has 22 rows (years 2000-2021); the number of transactions varies from about 50 to 500, and the transaction value varies from about 100,000 to 800,000.

Thanks!

# Two separate plots
plot1 = ggplot(Deals1, aes(x=Date, y=N)) + geom_bar(stat = "identity")

plot2 = ggplot(Deals1, aes(x=Date, y=total_tvalue, group = 1)) + geom_line(stat = "identity")



# Doesn't work
ggplot(Deals1, aes(x=Date)) +
  geom_bar( aes(y=N), stat = "identity") +
  geom_line( aes(y=total_tvalue)) +
  scale_y_continuous(
    name = "Number of transactions",
    sec_axis(name = "Transaction value")) +
  ggtitle("M&A Activity")

> dput(Deals1)
structure(list(Date = c("2000", "2001", "2002", "2003", "2004", 
"2005", "2006", "2007", "2008", "2009", "2010", "2011", "2012", 
"2013", "2014", "2015", "2016", "2017", "2018", "2019", "2020", 
"2021"), N = c(428L, 337L, 222L, 243L, 220L, 228L, 230L, 215L, 
146L, 143L, 131L, 94L, 121L, 128L, 154L, 161L, 156L, 139L, 159L, 
121L, 74L, 95L), total_tvalue = c(796728L, 283487L, 124839L, 
199670L, 276307L, 412632L, 379802L, 224635L, 188737L, 292432L, 
141469L, 244239L, 126452L, 173573L, 404071L, 564486L, 400689L, 
376499L, 477247L, 591219L, 262643L, 166189L)), row.names = c(NA, 
-22L), class = "data.frame")

Answer:

A secondary axis in ggplot is just an inert annotation drawn on to the side of the plot. It does not affect what is on the actual plot panel in any way.

In your case, if you plot both the bars and the line on the same panel, you can't see the bars because the values of total_tvalue are roughly 1,000 times larger than those of N.

To use a secondary axis here, we have to divide total_tvalue by a factor on the order of 1,000 (the code below uses 1,500) so that it is on approximately the same scale as N. Of course, this means that anyone reading our chart would get the wrong numbers for total_tvalue if they looked at the primary y axis. That's where a secondary axis comes in: we specify that the secondary axis shows numbers that are larger by the same factor, so it displays the original values.
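
As a rough sketch of how such a factor could be chosen, the snippet below compares the maxima of the two columns using the values from the dput() output in the question. The names raw_ratio, scale_factor and scaled_tvalue are made up for this illustration; 1,500 is simply the value used in the plotting code, and any factor that brings both series into a similar range would do.

```r
# Values copied from the dput(Deals1) output in the question
N <- c(428L, 337L, 222L, 243L, 220L, 228L, 230L, 215L, 146L, 143L, 131L,
       94L, 121L, 128L, 154L, 161L, 156L, 139L, 159L, 121L, 74L, 95L)
total_tvalue <- c(796728L, 283487L, 124839L, 199670L, 276307L, 412632L,
                  379802L, 224635L, 188737L, 292432L, 141469L, 244239L,
                  126452L, 173573L, 404071L, 564486L, 400689L, 376499L,
                  477247L, 591219L, 262643L, 166189L)

# How many times larger the tallest line value is than the tallest bar
raw_ratio <- max(total_tvalue) / max(N)   # about 1,860 for this data

# Assumption: pick a convenient round number of similar magnitude
scale_factor <- 1500

# Rescaled values that would be drawn against the primary (left) axis
scaled_tvalue <- total_tvalue / scale_factor

# Both series now cover a comparable range
range(N)
range(scaled_tvalue)

# Multiplying by the same factor recovers the original values, which is
# what the transformation passed to sec_axis() does for the axis labels
recovered <- scaled_tvalue * scale_factor
```

A smaller factor makes the line sit higher relative to the bars, and a larger one makes it sit lower; the division in geom_line() and the multiplication in sec_axis() just have to use the same number.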

In addition, your plotting code needs a couple of other tweaks. First, at the moment it doesn't draw a line at all because the years are stored as character strings rather than numbers, so you need to either use as.numeric(Date) or add group = 1 to the aesthetic mapping. Second, geom_bar(stat = "identity") is just a long way of writing geom_col().

library(ggplot2)

ggplot(Deals1, aes(as.numeric(Date))) +
  geom_col(aes(y = N), fill = "deepskyblue4", alpha = 0.8) +
  # Rescale the line so it shares the primary axis with the bars
  # (in ggplot2 >= 3.4.0, use linewidth = 1 instead of size = 1)
  geom_line(aes(y = total_tvalue / 1500), color = "orangered3", size = 1) +
  scale_y_continuous(
    name = "Number of transactions", breaks = 0:5 * 100,
    # Undo the rescaling so the secondary axis shows the original values
    sec.axis = sec_axis(~.x * 1500, 
                        name = "Transaction value",
                        labels = function(x) {
                           paste0(scales::dollar(x/1000), "K")})) +
  ggtitle("M&A Activity") +
  theme_light(base_size = 16)
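
The Date conversion mentioned above can also be done once on the data frame instead of inside aes(). The sketch below uses a three-row stand-in for Deals1 (the name Deals1_small is made up for this example) to show the column type before and after. With a character x variable, ggplot2 treats the axis as discrete and puts each year in its own group, which is why geom_line() has nothing to connect unless group = 1 is set.

```r
# Small stand-in with the first three rows of the question's data
Deals1_small <- data.frame(
  Date = c("2000", "2001", "2002"),
  N = c(428L, 337L, 222L),
  total_tvalue = c(796728L, 283487L, 124839L),
  stringsAsFactors = FALSE
)

# The years are stored as text, as in the original dput() output
was_character <- is.character(Deals1_small$Date)

# Convert once, so every layer sees a continuous x variable
Deals1_small$Date <- as.numeric(Deals1_small$Date)

# Inspect the result: Date is now a numeric column
str(Deals1_small)
```

After this conversion, aes(x = Date) can be used directly and the x axis is no longer labelled with the as.numeric(Date) expression.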
