Is there a way to use data.table::fwrite to write the values of a list column without any separator between them?
For example:
library("data.table")
geno <- data.table(
  IID = 1:10,
  SNP = lapply(1:10, function(i) sample(0:2, 10, replace = TRUE))
)
fwrite(geno, "Geno.txt", col.names = FALSE, sep = " ", sep2 = c("","",""))
But sep2 does not accept an empty separator, and I get the following error:
Error in fwrite(geno, "Geno.txt", col.names = FALSE, row.names = FALSE, :
is.character(sep2) && length(sep2) == 3L && nchar(sep2[2L]) == .... is not TRUE
I would like to have the following result, without having to collapse all values before writing it to a file.
1 2221210202
2 0020010221
3 1010022212
4 0120121221
5 1212211202
6 2100002010
7 1110011210
8 1212012121
9 2221121021
10 1122220101
Thank you.
CodePudding user response:
According to ?fwrite, sep2[2] must be a single character, so an empty string is rejected. You therefore have to collapse the list column yourself rather than rely on sep2.
You can use
fwrite(geno[, .(IID, SNP = sapply(SNP, paste0, collapse = ""))], "test.txt", col.names = FALSE, sep = " ")
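A minimal sketch of the same collapse, using vapply() instead of sapply() so that each row is checked to yield exactly one string. The seed and the temporary output path are arbitrary choices for illustration and are not part of the original answer.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only to make the example reproducible
geno <- data.table(
  IID = 1:10,
  SNP = lapply(1:10, function(i) sample(0:2, 10, replace = TRUE))
)

# Collapse each list element into one string; vapply() enforces that
# every element returns exactly one character value.
collapsed <- geno[, .(IID, SNP = vapply(SNP, paste0, character(1), collapse = ""))]

out <- tempfile(fileext = ".txt")  # hypothetical output path
fwrite(collapsed, out, col.names = FALSE, sep = " ")

# Read the file back to inspect the result
lines <- readLines(out)
head(lines, 2)
```

The collapse happens on a temporary copy inside the call, so geno itself keeps its list column.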
CodePudding user response:
Alternative: write the file with a separator character known not to occur in the data, then strip that character from the file afterwards. The second step can be done in R, but command-line tools are much faster at it. I'll use tr here, as it is likely to be the fastest.
fwrite(geno, "Geno.txt", col.names = FALSE, sep = " ", sep2 = c("","\037",""))
readLines("Geno.txt", n=2)
# [1] "1 1\0372\0371\0372\0372\0370\0371\0370\0372\0370" "2 1\0370\0372\0372\0371\0370\0372\0372\0371\0370"
system2("tr", c("-d", "\037"), stdin="Geno.txt", stdout="Geno2.txt")
readLines("Geno2.txt", n=2)
# [1] "1 1212201020" "2 1022102210"
tr should be available on all Unix-like OSes, including macOS. On Windows it ships with Rtools 4.0 under "c:\\rtools40\\usr\\bin\\tr.exe", or whichever path matches your install.
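If you are unsure whether tr is reachable from R, a small check like the following may help. It is only a sketch: the fallback path is the Rtools 4.0 location mentioned above and is an assumption about your installation.

```r
# Look up tr on the PATH; Sys.which() returns "" when nothing is found.
tr_path <- unname(Sys.which("tr"))

if (!nzchar(tr_path)) {
  # Assumed Rtools 4.0 default location; adjust to your own install.
  tr_path <- "c:\\rtools40\\usr\\bin\\tr.exe"
}

# TRUE if the resolved path points at an existing file
have_tr <- file.exists(tr_path)
have_tr
```

The resolved tr_path can then be passed as the first argument to system2() in place of "tr".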
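For the separator, I chose "\037", an octal escape for the ASCII unit separator (U+001F). It is a control character intended for use as a delimiter and is unlikely to appear in most datasets. Other characters work just as well, for example sep2 = c("", "|", "") with system2("tr", c("-d", shQuote("|")), ...). The shQuote() matters there because on Unix-like systems system2 runs the command through a shell, where a bare | would be read as a pipe.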
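For completeness, here is a rough sketch of doing the second step in R rather than with tr, for cases where speed is not critical or tr is unavailable. The three-row table, the delim variable, and the temporary file paths are made up for illustration; the guard at the top simply checks the assumption that the separator does not occur in the data.

```r
library(data.table)

# Small illustrative table (not the data from the question)
geno <- data.table(
  IID = 1:3,
  SNP = list(c(2L, 0L, 1L), c(0L, 0L, 2L), c(1L, 1L, 1L))
)

delim <- "\037"  # ASCII unit separator, as in the answer above

# Guard: the separator must not occur anywhere in the list column.
stopifnot(!any(grepl(delim, unlist(geno$SNP), fixed = TRUE)))

tmp <- tempfile(fileext = ".txt")  # hypothetical intermediate file
out <- tempfile(fileext = ".txt")  # hypothetical final file
fwrite(geno, tmp, col.names = FALSE, sep = " ", sep2 = c("", delim, ""))

# Strip the separator in R; fixed = TRUE treats it as a literal string.
cleaned <- gsub(delim, "", readLines(tmp), fixed = TRUE)
writeLines(cleaned, out)

result <- readLines(out)
result
```

This reads the whole intermediate file into memory, so for very large genotype files the tr approach is likely the better fit.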