Why won't my boxplot code work for my matrix?

Time:02-14

I am having trouble plotting my data frame. A sample of it is attached below; I have confirmed that its class is data.frame.

Dataframe (10 rows, 59 columns originally)

I originally tried to create a plot using facet_grid, with poor results. (Error: At least one layer must contain all faceting variables: x.

  • Plot is missing x
  • Layer 1 is missing x)

Instead, I decided to simplify and create a box plot with everything on one graph. However, my graph looked like this: ugly graph

My simple plot code is below; does anyone know why things are plotted poorly? Any insight is helpful. This is my first post, so I hope things are formatted correctly.

ggplot(newdf, aes(x, y, fill = x)) + geom_boxplot()

The end goal is a boxplot for each gene, with the observations of the individuals shown within each box. I am following this example: example

Answer:

First of all, when following an example, make sure to replace its variable names with your own.

In ggplot(newdf, aes(x, y, fill = x)) + geom_boxplot(), the errors probably come from the fact that newdf has neither an x nor a y column. Or, if it does, they don't hold what you need for your goal.

The end goal being

a boxplot shown for each gene with the observations of the individuals within each box

your x variable needs to be a column holding the gene names, and y a column holding the observed value for each gene.

In other words, you need to reshape your data.frame into a long format (right now it is in what's called a wide format).

There are many ways to do so. A simple one in base R is to build the gene column by repeating each column name once per row, and to build the value column by unlisting your data.frame, like so:

# I needed a data.frame, so I made a fake one to illustrate:
newdf <- data.frame(gene1 = runif(10), gene2 = runif(10), gene3 = runif(10))
# this converts your data.frame from wide to long
newdf.long <- data.frame(
  gene = rep(names(newdf), each = nrow(newdf)),
  value = unlist(newdf),
  row.names = NULL
)
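As one of the other possible ways, base R's stack() does the same wide-to-long reshaping in a single call. This is only a sketch on the same kind of fake data; the name newdf.long2 is made up for illustration, and it assumes every column of the data.frame is a numeric gene column.

```r
# Fake wide data, one column per gene (illustrative only)
newdf <- data.frame(gene1 = runif(10), gene2 = runif(10), gene3 = runif(10))

# stack() returns two columns: "values" (the numbers) and "ind" (the source column name, as a factor)
newdf.long2 <- stack(newdf)

# Rename them to match the names used in the plotting code
names(newdf.long2) <- c("value", "gene")

str(newdf.long2)
```

Because the gene column comes back as a factor whose levels follow the original column order, the boxes would be drawn in that same order.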

Then, make sure to use the right variable names when building your ggplot:

ggplot(newdf.long, aes(x = gene, y = value, fill = gene)) + geom_boxplot()
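Since the stated goal is to see the individual observations within each box, one possible way is to overlay the points with geom_jitter(). The sketch below rebuilds the fake data so it runs on its own. The jitter width of 0.15 and the choice to hide the boxplot's own outlier markers (so outliers are not drawn twice) are assumptions, not something taken from the question.

```r
library(ggplot2)

set.seed(1)  # only so the fake data is reproducible
newdf <- data.frame(gene1 = runif(10), gene2 = runif(10), gene3 = runif(10))
newdf.long <- data.frame(
  gene = rep(names(newdf), each = nrow(newdf)),
  value = unlist(newdf),
  row.names = NULL
)

p <- ggplot(newdf.long, aes(x = gene, y = value, fill = gene)) +
  geom_boxplot(outlier.shape = NA) +      # hide outlier markers; the points below show every observation
  geom_jitter(width = 0.15, height = 0)   # spread points horizontally only, so y values stay exact
print(p)
```

Setting height = 0 keeps each point at its true value; only the horizontal position is randomized to reduce overlap.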