select subset of columns when using col_type() in readr

Time:10-07

I'm trying to read in a file with read_delim() and select a subset (a long run) of columns to define as a specific type.

As an example, I have a file with 6 columns. I want to select column 1 ('name') as character, but then select columns 2-6 as integer. I can do this by manually specifying the column names:

df <- read_delim(file = "data.txt", col_type = list(name = col_character(), id_1 = col_integer(), id_2 = col_integer(), id_3 = col_integer(), id_4 = col_integer(), id_5 = col_integer()), delim = " ")

But my data has hundreds of columns, and I want to assign a type to a whole run of them without writing each name out manually.

I've tried:

df <- read_delim(file = "data.txt", col_type = list(name = col_character(), id_1:id_5 = col_integer()), delim = " ")

and

df <- read_delim(file = "data.txt", col_type = list(name = col_character(), select('id_1':'id_5') = col_integer()), delim = " ")

But I get an error:

Error: unexpected '=' in:
"col_type = list(name = col_character(), select('id_1':'id_5') ="

I'm sure this is very simple but I've spent hours and hours trying to work it out!

CodePudding user response:

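The error comes from R's parser, not from readr: inside list(), the left-hand side of = has to be a single name, so an expression such as id_1:id_5 or select(...) can't appear there. One option is to build the named list programmatically with setNames() and pass it as col_types (the argument's full name; col_type is only accepted through R's partial argument matching):

df <- read_delim(file = "data.txt",
                 col_types = setNames(
                   c(list(col_character()), rep(list(col_integer()), 5)),
                   c("name", paste0("id_", 1:5))),
                 delim = " ")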
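
If every column other than name should be an integer, a shorter route may be cols() with its .default argument, which sets the type for any column not named explicitly. When the run sits in the middle of a wider file, one possible approach is to read only the header first and slice the column names by position. The sketch below writes a small temporary file so it can run on its own; the file contents and the variable names header, run and spec are made up for illustration and are not part of the original question.

```r
library(readr)

# Hypothetical sample file with the layout described in the question:
# one character column followed by five integer columns.
path <- tempfile(fileext = ".txt")
writeLines(c("name id_1 id_2 id_3 id_4 id_5",
             "a 1 2 3 4 5",
             "b 6 7 8 9 10"), path)

# Approach 1: name only the exception and let .default cover the rest.
df1 <- read_delim(path, delim = " ",
                  col_types = cols(name = col_character(),
                                   .default = col_integer()))

# Approach 2: read zero rows to get the column names, then pick the run
# from id_1 to id_5 by position and build the named list from it.
header <- names(read_delim(path, delim = " ", n_max = 0,
                           col_types = cols(.default = col_character())))
run <- header[match("id_1", header):match("id_5", header)]
spec <- setNames(rep(list(col_integer()), length(run)), run)
spec$name <- col_character()

df2 <- read_delim(path, delim = " ", col_types = spec)
```

Approach 1 is the least typing when a single type dominates the file. Approach 2 only needs the first and last name of the run, so it should still be manageable with hundreds of columns; any column left out of spec is guessed by readr as usual.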