I am trying to label the min, median, and max values on a boxplot that I created. However, the boxplot is built from two different data frames, so I am not sure how to label the values.
Dummy data:
Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
class1<- data.frame(Name, Age)
boxplot(class1$Age)
Name1 <- c("Suma", "Mia", "Sam", "Jon", "Brian", "Grace", "Julia")
Age1<- c(33, 21, 56,32,65,32,89)
class2 <-data.frame(Name1, Age1)
boxplot(class1$Age, class2$Age1, names = c("class1", "class2"),ylab= "age", xlab= "class")
I would like to show the data values on the boxplot (as in the image), together with what each one indicates (e.g. min, median, max).
Many thanks
CodePudding user response:
You could use the function text() with fivenum() to get the numbers of each boxplot, pass them to the labels argument, and place them using x and y positions like this:
Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
class1<- data.frame(Name, Age)
Name1 <- c("Suma", "Mia", "Sam", "Jon", "Brian", "Grace", "Julia")
Age1<- c(33, 21, 56,32,65,32,89)
class2 <-data.frame(Name1, Age1)
boxplot(class1$Age, class2$Age1, names = c("class1", "class2"),ylab= "age", xlab= "class")
text(y = fivenum(class1$Age), labels = fivenum(class1$Age), x=0.5)
# class2 stores its ages in Age1; class2$Age only works through partial matching of `$`
text(y = fivenum(class2$Age1), labels = fivenum(class2$Age1), x=2.5)
Created on 2023-01-01 with reprex v2.0.2
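To make clear what is being placed on the plot, here is a small check of what fivenum() returns for the two age vectors from the question. The variable names five1 and five2 are only for illustration. The five values are, in order: minimum, lower hinge, median, upper hinge, maximum.

```r
# Ages copied from the question
five1 <- fivenum(c(23, 41, 32, 58, 26))
five2 <- fivenum(c(33, 21, 56, 32, 65, 32, 89))

five1  # 23 26 32 41 58
five2  # 21.0 32.0 33.0 60.5 89.0
```

Note that the hinges are not always actual data points: for class2 the upper hinge is 60.5, the average of 56 and 65.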
If you only want the min (1), median (3) and max (5), you can simply extract the first, third and fifth values from the result of fivenum() like this:
boxplot(class1$Age, class2$Age1, names = c("class1", "class2"),ylab= "age", xlab= "class")
text(y = fivenum(class1$Age)[c(1,3,5)], labels = fivenum(class1$Age)[c(1,3,5)], x=0.5)
# Use the full column name Age1 rather than relying on partial matching
text(y = fivenum(class2$Age1)[c(1,3,5)], labels = fivenum(class2$Age1)[c(1,3,5)], x=2.5)
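Since the question also asks for an indication of what each number is, the same idea can be extended so that every label carries its name. This is only a sketch: stat_labels is a hypothetical helper, the ages are kept in a plain list instead of two data frames, and the widened xlim and the label offset are guesses that may need tuning for your plotting device.

```r
# Same ages as in the question, kept in a named list for convenience
ages <- list(class1 = c(23, 41, 32, 58, 26),
             class2 = c(33, 21, 56, 32, 65, 32, 89))

# Hypothetical helper: builds labels such as "min: 23" for one vector
stat_labels <- function(x) {
  s <- fivenum(x)[c(1, 3, 5)]                 # min, median, max
  paste0(c("min", "median", "max"), ": ", s)
}

# Extra room on the right so the labels of the last box are not clipped
boxplot(ages, ylab = "age", xlab = "class", xlim = c(0.5, 3.2))

for (i in seq_along(ages)) {
  s <- fivenum(ages[[i]])[c(1, 3, 5)]
  # Boxes are drawn at x = 1, 2; pos = 4 puts the text to the right of (x, y)
  text(x = i + 0.4, y = s, labels = stat_labels(ages[[i]]), pos = 4, cex = 0.8)
}
```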
CodePudding user response:
The following code adds a new column Class, which contains the class names, to both data frames. rbind() then binds the two data frames together.
The boxplot is then created, with the at argument leaving a bit more space between the boxes.
tapply() computes fivenum() for each Class, and these numbers go into a new data frame that holds the values needed for the annotations drawn with text().
Name <- c("Jon", "Bill", "Maria", "Ben", "Tina")
Age <- c(23, 41, 32, 58, 26)
Class <- rep("Class1", 5)
class1 <- data.frame(Name, Age, Class)
Name1 <- c("Suma", "Mia", "Sam", "Jon", "Brian", "Grace", "Julia")
Age1 <- c(33, 21, 56, 32, 65, 32, 89)
Class1 <- rep("Class2", 7)
class2 <- data.frame(Name = Name1, Age = Age1, Class = Class1)
df <- rbind(class1, class2)
bp <- boxplot(df$Age ~ factor(df$Class),
  names = c("Class1", "Class2"),
  ylim = c(0, 100),
  xlim = c(0, 5),
  xlab = "", ylab = "Age",
  frame = FALSE,
  at = c(1, 3)
)
box(bty = "l")
fn <- tapply(df$Age, df$Class, fivenum)
tex <- data.frame(
  Class = c("Class1", "Class2"),
  max = c(fn$Class1[5], fn$Class2[5]),
  min = c(fn$Class1[1], fn$Class2[1]),
  median = c(fn$Class1[3], fn$Class2[3])
)
text(x = c(1, 3), y = tex$max + 2.5, paste(tex$max, "(max)", sep = ""))
text(x = c(1, 3), y = tex$min - 2.5, paste(tex$min, "(min)", sep = ""))
text(x = c(1.9, 3.9), y = tex$median, paste(tex$median, "(median)", sep = ""))
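As a possible alternative to building tex by hand, the value returned by boxplot() (stored in bp above) already has a stats matrix with one column per box: rows 1, 3 and 5 are the lower whisker, the median and the upper whisker. The sketch below assumes there are no outliers, which holds for this data; if bp$out is not empty, the whisker ends are no longer the true min and max and fivenum() should be used instead. The explicit boxwex and the 0.45 label offset are my own choices to keep the labels clear of the boxes.

```r
# Ages and class labels from the question, as two plain vectors
age   <- c(23, 41, 32, 58, 26, 33, 21, 56, 32, 65, 32, 89)
class <- rep(c("Class1", "Class2"), times = c(5, 7))

bp <- boxplot(age ~ class, at = c(1, 3), boxwex = 0.8,
              xlim = c(0, 5), ylim = c(0, 100), xlab = "", ylab = "Age")

# Keep lower whisker, median, upper whisker; one column per class
stats <- bp$stats[c(1, 3, 5), , drop = FALSE]
rownames(stats) <- c("min", "median", "max")
colnames(stats) <- bp$names

# as.vector() walks the matrix column by column, so x repeats each position 3 times
text(x = rep(c(1, 3), each = 3) + 0.45,
     y = as.vector(stats),
     labels = paste0(as.vector(stats), " (", rownames(stats), ")"),
     pos = 4, cex = 0.8)
```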