R plotting a graph with confidence intervals

Time:05-22

I have a data frame that looks like this:

df = data.frame(recall=c(0.55,0.62,0.43,0.61,0.19,0.14,0,0.19,0.33,0.33,0,0.33),
                type= c("Phone numbers","Phone numbers","Phone numbers","Phone numbers","Emails","Emails","Emails","Emails","URLs","URLs","URLs","URLs"),
                model=c("Cognition","TS-SAR","TS-ABINet","TS-RobustScanner",
                        "Cognition","TS-SAR","TS-ABINet","TS-RobustScanner",
                        "Cognition","TS-SAR","TS-ABINet","TS-RobustScanner"),
                lb=c(0.47,0.55,0.35,0.53,
                     0.07,0.04,0,0.07,
                     0.14,0.14,0,0.14),
                ub=c(0.63,0.7,0.51,0.69,
                     0.30,0.24,0,0.3,
                     0.52,0.52,0,0.52))

It contains the results of 4 'text detection in image' ML models. The recall column holds the recall metric for each model, broken down by the type of text being detected (phone numbers, emails or URLs). The lb and ub columns hold the lower and upper bounds of a 95% confidence interval for recall.

Objective

I'd like to plot this in one graph using R.

Here is my attempt using ggplot2

pd <- position_dodge(width=0.2)

ggplot(df, aes(model,recall, color=type)) +
  geom_point(aes(shape=type),size=4, position=pd) +
  scale_color_manual(name="Type",values=c("coral","steelblue")) +
  scale_shape_manual(name="Type",values=c(17,19)) +
  theme_bw() +
  scale_x_continuous("Model", breaks=1:length(model), labels=model) +
  scale_y_continuous("Recall values") +
  geom_errorbar(aes(ymin=lb,ymax=ub),width=0.1,position=pd)

However this gives me an error message

Error in check_breaks_labels(breaks, labels) : object 'model' not found

Any ideas what is causing this error? I'm also open to other ways of plotting this data, if anyone has suggestions. Thanks!

Answer:

Your code needs a couple of tweaks.

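Firstly, ggplot2 only looks up column names in the data frame inside aes(). Arguments to scale functions such as scale_x_continuous() are evaluated as ordinary R code, and since there is no variable called model in your workspace, you get the "object 'model' not found" error.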

Secondly, the x axis isn't continuous. It's discrete, because model is a character column. The breaks and labels will be correct if you don't specify an x axis scale at all, so you can just take that line out and trust the defaults.
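To make the first two points concrete, here is a minimal base R sketch. It assumes a fresh session with no variable called model, and it uses a cut-down stand-in for df with only the model column; the names bare_lookup, model_names and x_is_character are made up for illustration.

```r
# Stand-in for the question's data frame (only the column needed here).
df <- data.frame(
  model = rep(c("Cognition", "TS-SAR", "TS-ABINet", "TS-RobustScanner"), times = 3),
  stringsAsFactors = FALSE
)

# Outside aes(), `model` is looked up as an ordinary variable. No such variable
# exists in a fresh session, so evaluating it fails, which is what happens
# to the arguments of scale_x_continuous() in the question.
bare_lookup <- tryCatch(length(model), error = function(e) "lookup failed")

# Referring to the column through the data frame works anywhere.
model_names <- unique(df$model)

# A character column is treated as discrete by ggplot2, which is why a
# continuous x scale is the wrong choice for it.
x_is_character <- is.character(df$model)

print(bare_lookup)
print(model_names)
print(x_is_character)
```

If you ever do need to control the order of the models on the axis, something like scale_x_discrete(limits = unique(df$model)) should do it, since that refers to the column explicitly rather than by its bare name.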

Thirdly, the type variable has three levels, so you need three values in the color and shape scales.
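One way to guard against this kind of mismatch is to count the distinct values first and to name the entries of the manual scales. The sketch below uses only base R; type_colors and type_shapes are hypothetical helper names, and the colour and shape choices are just one possible assignment.

```r
# The type column from the question.
type <- c(rep("Phone numbers", 4), rep("Emails", 4), rep("URLs", 4))

# Number of distinct types: this is how many values each manual scale needs.
n_types <- length(unique(type))

# Named vectors tie each value to a specific type, so the mapping does not
# depend on the order of the levels.
type_colors <- c("Emails" = "coral", "Phone numbers" = "steelblue", "URLs" = "green4")
type_shapes <- c("Emails" = 17, "Phone numbers" = 19, "URLs" = 18)

# Stop early if a type is missing from either vector.
stopifnot(setequal(names(type_colors), unique(type)))
stopifnot(setequal(names(type_shapes), unique(type)))

print(n_types)
```

These vectors can then be passed as values = type_colors and values = type_shapes to scale_color_manual() and scale_shape_manual(), which match named values to levels by name.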

Putting these together, we have:

library(ggplot2)

# Dodge the three types sideways so their points and error bars don't overlap
pd <- position_dodge(width = 0.2)

ggplot(df, aes(model, recall, color = type)) +
  geom_point(aes(shape = type), size = 4, position = pd) +
  geom_errorbar(aes(ymin = lb, ymax = ub), width = 0.1, position = pd) +
  scale_color_manual("Type", values = c("coral", "steelblue", "green4")) +
  scale_shape_manual("Type", values = c(17, 19, 18)) +
  scale_y_continuous("Recall values") +
  theme_bw(base_size = 16)
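Since the question also asks about other ways of plotting the data, one possible alternative is a panel per text type instead of dodging. This is only a sketch, not necessarily better than the plot above: it assumes ggplot2 is installed, rebuilds df from the question so it runs on its own, and stores the plot in a variable p, which is a name chosen here for illustration.

```r
library(ggplot2)

# Data from the question.
df <- data.frame(
  recall = c(0.55, 0.62, 0.43, 0.61, 0.19, 0.14, 0, 0.19, 0.33, 0.33, 0, 0.33),
  type   = rep(c("Phone numbers", "Emails", "URLs"), each = 4),
  model  = rep(c("Cognition", "TS-SAR", "TS-ABINet", "TS-RobustScanner"), times = 3),
  lb     = c(0.47, 0.55, 0.35, 0.53, 0.07, 0.04, 0, 0.07, 0.14, 0.14, 0, 0.14),
  ub     = c(0.63, 0.70, 0.51, 0.69, 0.30, 0.24, 0, 0.30, 0.52, 0.52, 0, 0.52)
)

# One panel per text type; each model gets a point with its interval.
p <- ggplot(df, aes(model, recall)) +
  geom_pointrange(aes(ymin = lb, ymax = ub)) +
  facet_wrap(~ type) +
  labs(x = "Model", y = "Recall values") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

print(p)
```

With facets, no dodging or manual colour and shape scales are needed, at the cost of making comparisons across types a little less direct.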
