Secondary axis in percentage ggplot2

Time:03-18

Friends, can you please help me add a secondary axis in percentage? I tried different solutions without any results.

year <- c("2010","2011","2012","2013","2014","2015","2016","2017","2018","2019","2020","2010","2011","2012","2013","2014","2015","2016","2017","2018","2019","2020")
sex <- c("male","male","male","male","male","male","male","male","male","male","male","female","female","female","female","female","female","female","female","female","female","female")
number <- c(12,5,15,5,4,6,5,5,8,4,0,54,64,70,50,54,35,40,37,42,36,24)
perFailed <- c(0.090909091,0.057971014,0.117647059,0.163636364,0.068965517,0.048780488,0.044444444,0.071428571, 0.1,0.1,0)
data <- data.frame(year,sex,number,perFailed)
data$year <- as.factor(data$year)
library(ggplot2)
ggplot(data=data, aes(x=year, y=number, fill=sex)) +
  geom_bar(stat="identity", position=position_dodge()) +
  theme_bw()

I don't know how, but R duplicated my last column (perFailed). That data needs to be plotted as a geom_line on the secondary axis. Thanks, all.


CodePudding user response:

Try this:

ratio <- max(data$number) / max(data$perFailed)
data <- transform(data, perFailedScaled = perFailed * ratio)
head(data)
#   year  sex number  perFailed perFailedScaled
# 1 2010 male     12 0.09090909        38.88889
# 2 2011 male      5 0.05797101        24.79871
# 3 2012 male     15 0.11764706        50.32680
# 4 2013 male      5 0.16363636        70.00000
# 5 2014 male      4 0.06896552        29.50192
# 6 2015 male      6 0.04878049        20.86721

ggplot(data=data, aes(x=year, y=number, fill=sex)) +
  geom_bar(stat="identity", position=position_dodge()) +
  geom_line(aes(x = match(year, sort(unique(year))), y = perFailedScaled),
            data = ~ subset(., sex == "male")) +
  scale_y_continuous(sec.axis = sec_axis(~ . / ratio, labels = scales::label_percent())) +
  theme_bw()

[Figure: ggplot2 barplot with line overlaid]
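
To see why the secondary axis reads correctly, here is a minimal base-R sketch of the scaling round trip. It only uses the number and perFailed values from the question; the names scaled and back are made up for illustration. The line is drawn in the units of the primary axis (perFailed * ratio), and sec_axis(~ . / ratio) applies the inverse, so the labels on the right come back out as the original proportions.

```r
# Values copied from the question
number <- c(12, 5, 15, 5, 4, 6, 5, 5, 8, 4, 0,
            54, 64, 70, 50, 54, 35, 40, 37, 42, 36, 24)
perFailed <- c(0.090909091, 0.057971014, 0.117647059, 0.163636364,
               0.068965517, 0.048780488, 0.044444444, 0.071428571,
               0.1, 0.1, 0)

# One ratio maps the largest proportion onto the tallest bar
ratio <- max(number) / max(perFailed)

# Forward transform: what geom_line actually plots (primary-axis units)
scaled <- perFailed * ratio

# Inverse transform: what sec_axis(~ . / ratio) shows as labels
back <- scaled / ratio

# The round trip recovers the original proportions,
# and the highest point of the line sits at the top of the tallest bar
stopifnot(isTRUE(all.equal(back, perFailed)))
stopifnot(isTRUE(all.equal(max(scaled), max(number))))
```

Any positive ratio would keep the two axes consistent; choosing max(number) / max(perFailed) just makes the line use the full height of the plot.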

Notes:

  1. I assume that, since perFailed is identical for "male" and "female", reducing the data to just one of them is okay. Without the reduction the line is drawn twice; that is rarely noticeable, but it is unnecessary.

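  2. The use of match(year, sort(unique(year))) is a bit of a hack: it converts each year to its integer position on the discrete x axis, so that the factor no longer splits the line into one-point groups. Overlaying a line on a discrete axis can be a little problematic; this is one way that works.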
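
Both notes can be checked with a short sketch. It rebuilds the data from the question, confirms that data.frame() recycled the 11 perFailed values across both sexes (which is why the column looks duplicated), and then draws the line with group = 1 instead of the match() hack. The group = 1 variant is a commonly used alternative rather than the approach above, and the names male, female and p are made up for illustration.

```r
library(ggplot2)

year <- rep(as.character(2010:2020), times = 2)
sex <- rep(c("male", "female"), each = 11)
number <- c(12, 5, 15, 5, 4, 6, 5, 5, 8, 4, 0,
            54, 64, 70, 50, 54, 35, 40, 37, 42, 36, 24)
perFailed <- c(0.090909091, 0.057971014, 0.117647059, 0.163636364,
               0.068965517, 0.048780488, 0.044444444, 0.071428571,
               0.1, 0.1, 0)

# perFailed has 11 values but the other columns have 22,
# so data.frame() recycles it to fill all rows
data <- data.frame(year = factor(year), sex, number, perFailed)

male <- subset(data, sex == "male")
female <- subset(data, sex == "female")

# Note 1: the proportions are the same for both sexes
stopifnot(identical(male$perFailed, female$perFailed))

ratio <- max(data$number) / max(data$perFailed)

# Note 2: group = 1 puts all points in one group, so the line
# connects across the discrete x axis without converting year
p <- ggplot(data, aes(x = year, y = number, fill = sex)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  geom_line(aes(y = perFailed * ratio, group = 1), data = male) +
  scale_y_continuous(
    sec.axis = sec_axis(~ . / ratio, labels = scales::label_percent())
  ) +
  theme_bw()

print(p)
```

If perFailed were meant to differ by sex, the shorter vector would need all 22 values, and the subset on sex == "male" should be dropped in favour of grouping the line by sex.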
