Similar questions have been asked about converting a text string of data into a data frame (for example, here). However, I can't seem to adapt them to my problem.
I have a string of data that I'm trying to turn into a 4-column data frame. I managed to solve my problem using the readr::read_table function (as shown below). However, I'm trying to do this in base R. I tried using read.table (strictly speaking it is utils::read.table rather than base, but I'm referring to it as base R), but I can't seem to get it to work.
For example:
# text data
myText <- c("5 3 10\n3\n1 5 14 0.1005662213\n2 0 0 0.671371791\n3 0 0 0.3407034564\n3\n1 1 25 -0.5748688752\n2 0 0 -4.699291421\n3 0 0 -0.4393139217\n5\n1 5 35 0\n2 0 0 1.749283465\n3 0 67 0.1521562187\n6 0 0 -0.5545833321\n7 0 0 3.083556757\n1\n1 0 0 0.1563740906\n3\n1 1 25 -0.5748688752\n2 0 0 -4.352982824\n3 0 0 -0.05197710951\n5\n1 5 35 0\n2 0 0 2.425573501\n3 0 67 0.1521562187\n6 0 0 0.2505656058\n7 0 0 3.46201086\n3\n1 0 70 0.1563740906\n2 0 0 -0.8389369233\n3 0 0 -0.8127210366\n3\n1 1 25 -0.5748688752\n2 0 0 -4.125099073\n3 0 0 0.441967459\n5\n1 5 35 0\n2 0 0 1.337439399\n3 0 67 0.1521562187\n6 0 0 -0.03812773992\n7 0 0 2.488268982\n5\n1 0 70 0.1563740906\n2 0 0 -0.3505144781\n3 3 12 -0.8127210366\n6 0 0 -4.823541056\n7 0 0 1.200961188\n3\n1 1 25 -0.5748688752\n2 0 0 -4.615762984\n3 0 0 0.3397146156\n3\n1 5 35 0\n2 0 0 0.721465764\n3 0 0 0.4643481329\n5\n1 0 70 0.1563740906\n2 0 0 -1.004169113\n3 3 12 -0.8127210366\n6 0 0 -2.918580322\n7 0 0 2.114195803\n3\n1 1 25 -0.5748688752\n2 0 0 -4.894243443\n3 0 0 0.2303526511\n3\n1 5 35 0\n2 0 0 1.841081293\n3 0 0 1.204413054\n")
# turn into df using readr
df <- suppressWarnings(
readr::read_table(
file = myText,
col_names = c("idNum", "varNum", "val1", "val2"),
skip = 1,
na = c("")
)
)
> df
# A tibble: 68 × 4
idNum varNum val1 val2
<dbl> <dbl> <dbl> <dbl>
1 3 NA NA NA
2 1 5 14 0.101
3 2 0 0 0.671
4 3 0 0 0.341
5 3 NA NA NA
6 1 1 25 -0.575
7 2 0 0 -4.70
8 3 0 0 -0.439
9 5 NA NA NA
10 1 5 35 0
# … with 58 more rows
As you can see, I have converted the string into a 4 column data frame (tibble in this case). But I'm trying to avoid using any extra packages and achieve this using base R.
I tried read.table from base R, but it gives an error:
dfNew <- read.table(file = myText,
col.names = c("idNum", "varNum", "val1", "val2"),
skip = 1,
na.strings = "NA")
> dfNew
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : cannot open file '5 3 10
I'm not sure how to solve the error. The additional warning also seems to say that it is not skipping any lines before reading the data.
Any suggestions as to how I could solve this?
Answer:
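Because you are reading from a character vector, use the text argument instead of file. read.table treats a string passed to file as a file name, which is why it fails with "cannot open file"; the error occurs before skip is ever applied. Also, since not all rows contain 4 values, set fill = TRUE so that short rows are padded with NA:
df <- read.table(text = myText, skip = 1, fill = TRUE, col.names = c("idNum", "varNum", "val1", "val2"))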
head(df)
#> idNum varNum val1 val2
#> 1 3 NA NA NA
#> 2 1 5 14 0.1005662
#> 3 2 0 0 0.6713718
#> 4 3 0 0 0.3407035
#> 5 3 NA NA NA
#> 6 1 1 25 -0.5748689
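To make the difference between text and file easier to try out, here is a minimal sketch on a shortened, made-up string in the same layout as myText. The names sampleText, cols, df1, df2 and con are illustrative and not from the question. It also shows that wrapping the string in textConnection() and passing that to file should give the same result as text.

```r
# Hypothetical sample in the same layout as myText: one header line,
# then single-value group lines followed by 4-column data rows.
sampleText <- "5 3 10\n2\n1 5 14 0.1\n2 0 0 0.67\n1\n1 1 25 -0.57\n"

cols <- c("idNum", "varNum", "val1", "val2")

# text = parses the string itself; file = would treat it as a path.
# fill = TRUE pads the single-value lines with NA.
df1 <- read.table(text = sampleText, skip = 1, fill = TRUE, col.names = cols)

# Alternative: wrap the string in a connection and pass it as file.
con <- textConnection(sampleText)
df2 <- read.table(file = con, skip = 1, fill = TRUE, col.names = cols)
close(con)

# Lines that held a single value have NA in the remaining columns.
print(df1)
print(all.equal(df1, df2))
```

The text argument is the shorter of the two; the explicit connection is mainly useful when the same string has to be handed to a function that only accepts a file or connection.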