I have a dataframe that looks like this:
df = data.frame(recall=c(0.55,0.62,0.43,0.61,0.19,0.14,0,0.19,0.33,0.33,0,0.33),
type= c("Phone numbers","Phone numbers","Phone numbers","Phone numbers","Emails","Emails","Emails","Emails","URLs","URLs","URLs","URLs"),
model=c("Cognition","TS-SAR","TS-ABINet","TS-RobustScanner",
"Cognition","TS-SAR","TS-ABINet","TS-RobustScanner",
"Cognition","TS-SAR","TS-ABINet","TS-RobustScanner"),
lb=c(0.47,0.55,0.35,0.53,
0.07,0.04,0,0.07,
0.14,0.14,0,0.14),
ub=c(0.63,0.7,0.51,0.69,
0.30,0.24,0,0.3,
0.52,0.52,0,0.52))
It consists of the results of 4 'text detection in image' ML models. The recall column has the recall metric values for each model, based on the type of text being detected (phone numbers, emails or URLs). The lb and ub columns hold the lower and upper bounds of a 95% confidence interval for recall.
Objective
I'd like to plot this in one graph using R.
Here is my attempt using ggplot2
pd <- position_dodge(width=0.2)
ggplot(df, aes(model,recall, color=type)) +
  geom_point(aes(shape=type),size=4, position=pd) +
  scale_color_manual(name="Type",values=c("coral","steelblue")) +
  scale_shape_manual(name="Type",values=c(17,19)) +
  theme_bw() +
  scale_x_continuous("Model", breaks=1:length(model), labels=model) +
  scale_y_continuous("Recall values") +
  geom_errorbar(aes(ymin=lb,ymax=ub),width=0.1,position=pd)
However this gives me an error message
Error in check_breaks_labels(breaks, labels) : object 'model' not found
Any ideas why this could be an error? Also I'm open to new ways of plotting this data, if anyone has suggestions. Thanks!
CodePudding user response:
Your code needs a couple of tweaks.
Firstly, ggplot only uses non-standard evaluation inside aes, so the use of model inside scale_x_continuous results in a "model not found" error.
Secondly, the x axis isn't continuous. It's discrete. And the breaks / labels will be correct if you don't specify an x axis scale at all. You can just take the line out and trust the defaults.
Thirdly, the type variable has three levels, so you need three values in the color and shape scales.
Putting these together, we have:
library(ggplot2)

# Shared dodge so points and error bars for the same model line up
pd <- position_dodge(width = 0.2)
ggplot(df, aes(model, recall, color = type)) +
  geom_point(aes(shape = type), size = 4, position = pd) +
  geom_errorbar(aes(ymin = lb, ymax = ub), width = 0.1, position = pd) +
  # type has three levels, so each manual scale needs three values
  scale_color_manual("Type", values = c("coral", "steelblue", "green4")) +
  scale_shape_manual("Type", values = c(17, 19, 18)) +
  scale_y_continuous("Recall values") +
  theme_bw(base_size = 16)
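If you do want to set the x axis explicitly, for example to keep the models in the order they appear in the data rather than the default alphabetical order, something like the following might work. This is only a sketch: the data frame is the one from the question rebuilt compactly, and `models`, `model_order` and `p` are names introduced here for illustration. Because there is no data masking outside aes, the column is reached through df$model.

```r
library(ggplot2)

# Data frame from the question, rebuilt compactly (same values)
models <- c("Cognition", "TS-SAR", "TS-ABINet", "TS-RobustScanner")
df <- data.frame(
  recall = c(0.55, 0.62, 0.43, 0.61, 0.19, 0.14, 0, 0.19, 0.33, 0.33, 0, 0.33),
  type   = rep(c("Phone numbers", "Emails", "URLs"), each = 4),
  model  = rep(models, times = 3),
  lb     = c(0.47, 0.55, 0.35, 0.53, 0.07, 0.04, 0, 0.07, 0.14, 0.14, 0, 0.14),
  ub     = c(0.63, 0.70, 0.51, 0.69, 0.30, 0.24, 0, 0.30, 0.52, 0.52, 0, 0.52)
)

# Outside aes() the column is not in scope, so refer to it via df$model.
# as.character() keeps this working if model was stored as a factor.
model_order <- unique(as.character(df$model))

pd <- position_dodge(width = 0.2)
p <- ggplot(df, aes(model, recall, color = type)) +
  geom_point(aes(shape = type), size = 4, position = pd) +
  geom_errorbar(aes(ymin = lb, ymax = ub), width = 0.1, position = pd) +
  # Discrete x scale; limits fixes the left-to-right order of the models
  scale_x_discrete(name = "Model", limits = model_order) +
  scale_color_manual("Type", values = c("coral", "steelblue", "green4")) +
  scale_shape_manual("Type", values = c(17, 19, 18)) +
  scale_y_continuous("Recall values") +
  theme_bw(base_size = 16)
p
```

The limits argument only changes the ordering here; dropping the scale_x_discrete line gives the same plot with the models sorted alphabetically.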