Labelling min, median, max of boxplot, using R-base

Time:01-02

I am trying to label the min, median, and max values on a boxplot that I created. However, the boxplot is built from two different data frames, so I am confused about how to label the data values.

Dummy data:

Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
class1<- data.frame(Name, Age)

boxplot(class1$Age)
Name1 <- c("Suma", "Mia", "Sam", "Jon", "Brian", "Grace", "Julia")
Age1<- c(33, 21, 56,32,65,32,89)
class2 <-data.frame(Name1, Age1)

boxplot(class1$Age, class2$Age1, names = c("class1", "class2"),ylab= "age", xlab= "class")

output

I am trying to add the data values to the boxplot (shown in the image), together with what each value indicates (e.g. min, median, max).

image

Many thanks

CodePudding user response:

You could use the text function together with fivenum: pass the five-number summary of each group to the labels argument, and place the labels with the x and y arguments like this:

Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
class1<- data.frame(Name, Age)
Name1 <- c("Suma", "Mia", "Sam", "Jon", "Brian", "Grace", "Julia")
Age1<- c(33, 21, 56,32,65,32,89)
class2 <-data.frame(Name1, Age1)

boxplot(class1$Age, class2$Age1, names = c("class1", "class2"),ylab= "age", xlab= "class")
text(y = fivenum(class1$Age), labels = fivenum(class1$Age), x=0.5)
text(y = fivenum(class2$Age1), labels = fivenum(class2$Age1), x=2.5)

Created on 2023-01-01 with reprex v2.0.2
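
To see which numbers end up as labels, it can help to print what fivenum returns for the two age vectors from the question. This is only a small sketch; the element names (stat_names) are added here for readability and are not something fivenum produces itself.

```r
Age  <- c(23, 41, 32, 58, 26)
Age1 <- c(33, 21, 56, 32, 65, 32, 89)

# fivenum() returns, in order: minimum, lower hinge, median, upper hinge, maximum
five1 <- fivenum(Age)
five2 <- fivenum(Age1)

# Illustrative names (an addition for this sketch) so the positions are easy to read
stat_names <- c("min", "lower hinge", "median", "upper hinge", "max")
names(five1) <- stat_names
names(five2) <- stat_names

print(five1)
print(five2)
```

Positions 1, 3 and 5 are therefore the minimum, median and maximum of each group.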


If you only want the min (1), median (3), and max (5), you can extract the first, third, and fifth values returned by fivenum like this:

boxplot(class1$Age, class2$Age1, names = c("class1", "class2"),ylab= "age", xlab= "class")
text(y = fivenum(class1$Age)[c(1,3,5)], labels = fivenum(class1$Age)[c(1,3,5)], x=0.5)
text(y = fivenum(class2$Age1)[c(1,3,5)], labels = fivenum(class2$Age1)[c(1,3,5)], x=2.5)
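
If there are more than two groups, the same idea can be wrapped in a loop. The following is a rough sketch, not the only way to do it: label_min_med_max is a hypothetical helper name, and the widened xlim and the 0.45 offset are assumptions chosen by eye so that the labels fit to the right of each box.

```r
# Hypothetical helper (not part of base R): draws side-by-side boxplots from a
# named list of numeric vectors and labels min, median and max beside each box.
label_min_med_max <- function(groups, ...) {
  # Extra room on the right so the labels of the last box are not cut off
  boxplot(groups, xlim = c(0.5, length(groups) + 1), ...)
  stat_names <- c("min", "median", "max")
  for (i in seq_along(groups)) {
    vals <- fivenum(groups[[i]])[c(1, 3, 5)]
    # Box i is drawn at x = i; pos = 4 puts the text to the right of the point
    text(x = i + 0.45, y = vals,
         labels = paste0(vals, " (", stat_names, ")"),
         pos = 4, cex = 0.8)
  }
  # Return the labelled values invisibly so they can be inspected
  invisible(lapply(groups, function(v) fivenum(v)[c(1, 3, 5)]))
}

groups <- list(
  class1 = c(23, 41, 32, 58, 26),
  class2 = c(33, 21, 56, 32, 65, 32, 89)
)
stats <- label_min_med_max(groups, ylab = "age", xlab = "class")
```

The returned list (stats here) holds the three labelled values per group, which is handy for checking the numbers without reading them off the plot.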


CodePudding user response:

The following code adds a new column, Class, containing the class name to both data frames. The two data frames are then combined with rbind.

Next, the boxplot is created; the at argument leaves a bit more space between the boxes.

With tapply, fivenum is calculated for each Class. These numbers are used to build a new data frame that holds the values needed for the annotations added with text.

Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
Class <- rep("Class1", 5)
class1 <- data.frame(Name, Age, Class)
Name1 <- c("Suma", "Mia", "Sam", "Jon", "Brian", "Grace", "Julia")
Age1 <- c(33, 21, 56, 32, 65, 32, 89)
Class1 <- rep("Class2", 7)
class2 <- data.frame(Name = Name1, Age = Age1, Class = Class1)


df <- rbind(class1, class2)

bp <- boxplot(df$Age ~ factor(df$Class),
  names = c("Class1", "Class2"),
  ylim = c(0, 100),
  xlim = c(0, 5),
  xlab = "", ylab = "Age",
  frame = FALSE,
  at = c(1, 3)
)
box(bty = "l")

fn <- tapply(df$Age, df$Class, fivenum)

tex <- data.frame(
  Class = c("Class1", "Class2"),
  max = c(fn$Class1[5], fn$Class2[5]),
  min = c(fn$Class1[1], fn$Class2[1]),
  median = c(fn$Class1[3], fn$Class2[3])
)


text(x = c(1, 3), y = tex$max + 2.5, paste(tex$max, "(max)", sep = ""))
text(x = c(1, 3), y = tex$min - 2.5, paste(tex$min, "(min)", sep = ""))
text(x = c(1.9, 3.9), y = tex$median, paste(tex$median, "(median)", sep = ""))
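
As a possible alternative to tapply, the object returned by boxplot (stored as bp above) already carries the plotted statistics in bp$stats. A minimal sketch, assuming the same ages as in the question: rows 1, 3 and 5 of bp$stats are the lower whisker end, the median and the upper whisker end, and the whisker ends coincide with the true min and max only when no points were drawn as outliers (bp$out is empty), which is the case for this data.

```r
Age  <- c(23, 41, 32, 58, 26)
Age1 <- c(33, 21, 56, 32, 65, 32, 89)

bp <- boxplot(list(Class1 = Age, Class2 = Age1),
              ylim = c(0, 100), ylab = "Age")

# bp$stats has one column per box; rows are lower whisker, lower hinge,
# median, upper hinge, upper whisker
low  <- bp$stats[1, ]
med  <- bp$stats[3, ]
high <- bp$stats[5, ]

# Whisker ends are the real min/max only if nothing was flagged as an outlier
no_outliers <- length(bp$out) == 0

if (no_outliers) {
  xs <- seq_along(med)  # boxes sit at x = 1, 2 by default
  text(x = xs, y = high + 2.5, labels = paste0(high, " (max)"))
  text(x = xs, y = low - 2.5,  labels = paste0(low, " (min)"))
  text(x = xs, y = med, labels = paste0(med, " (median)"), pos = 3, cex = 0.8)
}
```

With data that does contain outliers, range or fivenum on the raw values would be the safer source for the min and max labels.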
