I want to produce a table similar to:
But I'm having a hard time naming the rows because the GeneName should just be a label that species the table belongs to that gene.
My code for the data frame is:
geneTable <- data.frame(presenceofvariant = c("Yes", "No"),
A = c(1, 4),
B = c(2, 4))
I want the column label above "Yes" and "No" to be empty but this doesn't seem possible either. How do I add an overall row name - I.e the gene all this data belongs to?
CodePudding user response:
Not sure if you want this. You can use the argument row.names
to supply row names, and use check.names = FALSE
to have column names with special syntax.
data.frame(" " = c("Yes", "No"),
A = c(1, 4),
B = c(2, 4),
row.names = c("GeneName", ""),
check.names = F)
A B
GeneName Yes 1 2
No 4 4
To my knowledge, it's not possible to have duplicated row names in dataframe. If you really want that, we can use matrix instead of dataframe.
matrix(c("Yes", "No", 1, 4, 2, 4),
nrow = 2,
dimnames = list(c("GeneName", "GeneName"), c("", "A", "B")))
A B
GeneName "Yes" "1" "2"
GeneName "No" "4" "4"
CodePudding user response:
When you say 'overall row name' I think the dimnames
of a matrix
would suit well here. A data.frame
is strictly rectangular and doesn't allow this type of 'meta labeling' - instead you'd have to just add another column. So to your example, you could do this:
matrix(1:4,
nrow = 2,
dimnames = list("GeneName" = c("YES", "NO"),
"categories" = c("A", "B")))
#> categories
#> GeneName A B
#> YES 1 3
#> NO 2 4
Created on 2022-04-08 by the reprex package (v2.0.1)
Alternatively, a 'tidy' way to do this in a data.frame
might be:
data.frame(gene = c("geneA", "geneA", "geneB", "geneB"),
Answer = c("YES", "NO", "YES", "NO"),
A = sample(4),
B = sample(4))
#> gene Answer A B
#> 1 geneA YES 2 3
#> 2 geneA NO 3 4
#> 3 geneB YES 4 1
#> 4 geneB NO 1 2
Created on 2022-04-08 by the reprex package (v2.0.1)