R plot boxplot per column in dataframe and customizing parameters

Time:09-05

I have an R data frame with 20 columns, one for each model. The rows of the data frame hold the statistics for a boxplot. I want to plot a boxplot for each of those models, using the rows of the data frame as the boxplot parameters.

Below is one example:

        Model 1    Model 2   ...  Model 20
min       1           5              15
q25       2           7              16
median    3           8              20
q75       4           9              21
max       5           10             22

As can be seen, the statistics are already calculated. I just need to pass them to the boxplot, but I have no idea how to do that.

CodePudding user response:

In case you are willing to use ggplot2 you could try something like this:

First set up a placeholder data frame, since ggplot() with geom_boxplot() needs one even when the statistics are supplied directly:

df <- data.frame("Model" = "Model 1")

Then you can control the single boxplot components like this:

library(ggplot2)

ggplot(df, aes(x = Model,
           ymin=5,     #min
           lower=20,   #q25
           middle=25,  #median
           upper=50,   #q75
           ymax=100)) +  #max
  geom_boxplot(stat="identity")  # draw the box from the supplied statistics


Analogous for multiple models:

df <- data.frame("Model" = c("Model 1", "Model 2"))

ggplot(df, aes(x = Model,
           ymin=c(5, 9),
           lower=c(20,46),
           middle=c(25,55),
           upper=c(50,89),
           ymax=c(100, 111))) +
  geom_boxplot(stat="identity")
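
To start from a data frame laid out like the one in the question (statistics in rows, models in columns), one option is to transpose it so that each model becomes a row, and then map the resulting columns to the boxplot aesthetics. This is only a minimal sketch: it assumes the row names are min, q25, median, q75 and max as in the example, the names `dat` and `long` are illustrative, and the plot is drawn only if ggplot2 is installed.

```r
# Example data in the question's layout: one column per model, one row per statistic
dat <- data.frame(
  Model1  = c(1, 2, 3, 4, 5),
  Model2  = c(5, 7, 8, 9, 10),
  Model20 = c(15, 16, 20, 21, 22),
  row.names = c("min", "q25", "median", "q75", "max")
)

# Transpose so that each model is a row and each statistic is a column
long <- as.data.frame(t(dat))

# Keep the models in their original column order on the x axis
long$Model <- factor(rownames(long), levels = rownames(long))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Inside aes(), min/median/max refer to the columns of `long`, not the base functions
  p <- ggplot(long, aes(x = Model, ymin = min, lower = q25,
                        middle = median, upper = q75, ymax = max)) +
    geom_boxplot(stat = "identity")
  print(p)
}
```

This avoids typing the statistics by hand for all 20 models, since every column of the original data frame becomes one box.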

CodePudding user response:

What has not been explained so far is that you need a matrix and not a data frame (data frames are actually lists, which is why the error refers to lists). I assume you also have the sample sizes somewhere; here I rbind them as a new row.

dat <- rbind(dat, n=c(20, 14, 60))

So all you need to do is coerce the statistics with as.matrix().

bxp(list(stats=as.matrix(dat[1:5, ]), n=dat[6, ]))
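
The coercion step can be checked directly: a data frame reports as a list, while `as.matrix()` returns the numeric matrix that `bxp()` expects in `stats`. The following is a minimal sketch that rebuilds the example data inline; the sample sizes in `n` are the same assumed values as above, and the `names` component is added here only to label the boxes.

```r
# Example data: one column per model, one row per statistic
dat <- data.frame(
  Model1  = c(1, 2, 3, 4, 5),
  Model2  = c(5, 7, 8, 9, 10),
  Model20 = c(15, 16, 20, 21, 22),
  row.names = c("min", "q25", "median", "q75", "max")
)
n <- c(20, 14, 60)  # assumed sample sizes, not given in the question

is.list(dat)             # a data frame is a list of columns
stats <- as.matrix(dat)  # 5 x 3 numeric matrix, one column per box
is.matrix(stats)

# bxp() takes a list shaped like the return value of boxplot()
z <- list(stats = stats, n = n, names = colnames(dat))
bxp(z)
```

Keeping `n` as a separate vector instead of an extra row means the data frame holds only the five statistics, so no row subsetting is needed before the coercion.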



Data:

dat <- read.table(header=TRUE, text='Model1    Model2   Model20
min       1           5              15
q25       2           7              16
median    3           8              20
q75       4           9              21
max       5           10             22')