I have code that runs and produces a large list. I'm stuck on writing that output to a file: every approach I would normally use for a data frame fails with a different error.
The code and data I'm using are:
library(GeneOverlap)
library(dplyr)
library(stringr)
dataset1 <- structure(list(Gene = c("Gene1", "Gene1", "Gene2", "Gene3", "Gene3.",
"Gene3"), Gene_count = c(5L, 5L, 3L, 16L, 16L, 16L), Phenotype = c("Phenotype1",
"Phenotype2", "Phenotype1", "Phenotype6", "Phenotype2", "Phenotype1"
)), row.names = c(NA, -6L), class = c("data.table", "data.frame"
))
dataset2 <- structure(list(Gene = c("Gene1", "Gene1", "Gene4", "Gene2", "Gene6",
"Gene7"), Gene_count = c(10L, 10L, 4L, 17L, 3L, 2L), Phenotype = c("Phenotype1",
"Phenotype2", "Phenotype1", "Phenotype6", "Phenotype2", "Phenotype1"
)), row.names = c(NA, -6L), class = c("data.table", "data.frame"
))
d1_split <- split(dataset1, dataset1$Phenotype)
d2_split <- split(dataset2, dataset2$Phenotype)
# this should be TRUE in order for Map to work correctly
all(names(d1_split) == names(d2_split))
tests <- Map(function(d1, d2) {
go.obj <- newGeneOverlap(d1$Gene, d2$Gene, genome.size = 1871)
return(testGeneOverlap(go.obj))
}, d1_split, d2_split)
I then want to write the tests large list object to a file, ideally with the p-value for each Phenotype from the code above as a column. But I keep getting various errors, such as:
library(Matrix)
library(data.table)
lstData <- Map(as.data.frame, tests)
Error in as.data.frame.default(dots[[1L]][[1L]]) :
cannot coerce class ‘structure("GeneOverlap", package = "GeneOverlap")’ to a data.frame
dfrData <- rbindlist(lstData)
Error in rbindlist(lstData) : object 'lstData' not found
Error in fwrite(tests, "list.csv") :
Column 1's type is 'S4' - not yet implemented in fwrite.
library(data.table)
outputfile <- "test.csv" #output file name
sep <- "," #define the separator (related to format of the output file)
for(nam in names(tests)){
fwrite(list(nam), file=outputfile, sep=sep, append=T) #write names of the list elements
ele <- tests[[nam]]
if(is.list(ele)) fwrite(ele, file=outputfile, sep=sep, append=T, col.names=T) else fwrite(data.frame(matrix(ele, nrow=1)), file=outputfile, append=T) #write elements of the list
fwrite(list(NA), file=outputfile, append=T) #add an empty row to separate elements
}
Error in as.vector(data) :
no method for coercing this S4 class to a vector
I've been trying to understand the S4 object, but I'm a beginner R user. What functions or packages could I use to write out my tests object? Example data is included above to run all the code.
CodePudding user response:
The GeneOverlap package has several get* functions for accessing test result statistics. You can combine these with the tidyverse to create a tidy table of results:
results <- tibble(pheno = names(tests), tests = tests) %>%
  rowwise() %>%
  mutate(
    across(tests,
           .fns = list(tested = getTested, pval = getPval, OR = getOddsRatio, jaccard = getJaccard),
           .names = '{.fn}')
  ) %>%
  select(-tests) # drop test object column
  pheno      tested    pval    OR jaccard
  <chr>      <lgl>    <dbl> <dbl>   <dbl>
1 Phenotype1 TRUE   0.00481  410.   0.2
2 Phenotype2 TRUE   0.00214 1302.   0.333
3 Phenotype6 TRUE   1          0    0
You can then save this data frame with write_csv (from the readr package) or a similar method.
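If you would rather avoid the tidyverse, the same kind of table can be built in base R. The following is only a minimal sketch, assuming GeneOverlap is installed; the two small gene lists and the output file name are made up for illustration and stand in for the tests list from the question.

```r
library(GeneOverlap)

# Hypothetical stand-in for the `tests` list built with Map() in the question
tests <- list(
  Phenotype1 = testGeneOverlap(newGeneOverlap(
    c("Gene1", "Gene2", "Gene3"), c("Gene1", "Gene4", "Gene7"),
    genome.size = 1871)),
  Phenotype2 = testGeneOverlap(newGeneOverlap(
    c("Gene1", "Gene3"), c("Gene1", "Gene6"),
    genome.size = 1871))
)

# Pull one number per test with the package's accessor functions,
# giving one row per phenotype
results <- data.frame(
  Phenotype  = names(tests),
  pval       = vapply(tests, getPval, numeric(1)),
  odds_ratio = vapply(tests, getOddsRatio, numeric(1)),
  jaccard    = vapply(tests, getJaccard, numeric(1)),
  row.names  = NULL
)

# Write the plain data frame; the file name is an arbitrary choice
out_file <- file.path(tempdir(), "gene_overlap_results.csv")
write.csv(results, out_file, row.names = FALSE)
```

Because results is an ordinary data frame of strings and numbers, fwrite(results, out_file) from data.table should work on it as well.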
CodePudding user response:
The CSV format is very simple: it is a text file storing "comma-separated values", where the values are all strings. Some of the strings will be converted to numbers when the file is read, if they are in the right format.
S4 objects are very complicated things that are not easy to store as strings.
So to put an S4 object into a CSV file, you're going to need to convert it to one or more strings. You could use paste(capture.output(dput(x)), collapse="") to convert x to a string that could be restored as an S4 object later (dput() on its own only prints the object and returns it unchanged), but that won't give access to the things stored in x. You'll need to use something like @jdobres's approach to extract things before storing them as a CSV file, and then you probably won't be able to restore the object from the file.
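As a rough sketch of the string round trip described above, here is a base R example. ToyOverlap is a hypothetical S4 class invented for this illustration, not part of GeneOverlap. Since dput() prints its text rather than returning it, capture.output() is used to collect the printed lines, and they are joined with a space as a cautious choice in case the output wraps.

```r
library(methods)

# Hypothetical S4 class standing in for a real test result object
setClass("ToyOverlap", representation(pval = "numeric", genes = "character"))
x <- new("ToyOverlap", pval = 0.004, genes = c("Gene1", "Gene2"))

# Collect the text that dput() prints and join it into a single string
x_string <- paste(capture.output(dput(x)), collapse = " ")

# The string is R code that rebuilds the object, so it can be parsed and evaluated
x_restored <- eval(parse(text = x_string))

# The slots are only reachable again after restoring, not from the string itself
x_restored@genes
```

The string is opaque to spreadsheet software, which is why extracting the individual statistics first is usually the more useful route for a CSV.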
If you do need to restore the S4 objects, use saveRDS() on the list to store the complete list in an .rds file. It will be readable by R, but not by other software.
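A minimal sketch of that round trip, using only base R. ToyOverlap is a hypothetical S4 class made up for this example so that it runs without GeneOverlap; with the real data you would pass the tests list from the question to saveRDS() instead.

```r
library(methods)

# Hypothetical S4 class standing in for the GeneOverlap result objects
setClass("ToyOverlap", representation(pval = "numeric", genes = "character"))

tests <- list(
  Phenotype1 = new("ToyOverlap", pval = 0.004, genes = c("Gene1", "Gene2")),
  Phenotype2 = new("ToyOverlap", pval = 0.002, genes = "Gene1")
)

# Save the whole list, S4 objects included, to a single .rds file
rds_file <- file.path(tempdir(), "tests.rds")
saveRDS(tests, rds_file)

# Later (or in another R session where the class is defined), read it back
restored <- readRDS(rds_file)
names(restored)
```

Note that the restoring session needs the class definition available, which for the real objects means loading GeneOverlap before working with the restored list.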