Why does read_csv2 read as a date a value that should be a number or a character?

Time:05-27

With the following code I would like to import a thousand (small) csv files:

library(readr)

ldf <- list() # empty list that will hold one data frame per file
listcsv <- dir(pattern = "\\.csv$") # names of all csv files in the working directory (pattern is a regex)
for (k in seq_along(listcsv)){
  ldf[[k]] <- read_csv2(listcsv[k])
  ldf[[k]] <- as.data.frame(ldf[[k]])
}

The import appears to succeed: every file is loaded into the list ldf. However, a few of the files show a problem. The first two columns of all these files should be chr, but in a small number of them the second column is read as POSIXct instead. I don't know why. The values in this column are shown as:

0840-07-11
0840-07-11
0840-07-11
0840-07-11
0840-07-11
0840-07-11
0840-07-11
0840-07-11

whereas they should be:

08400711
08400711
08400711
08400711
08400711
08400711
08400711
08400711

Answer:

Specifying col_types as "c" converts only the first column to character. To force all columns to character, either set .default (useful when the number of columns is not known in advance) or repeat "c" once per column, e.g. "ccc" for three columns. With every column read as character, readr no longer guesses types, so a value such as 08400711 is not reinterpreted as a date.

library(readr)
df1 <- read_csv(readr_example("mtcars.csv"), col_types = cols(.default = "c"))
> str(df1)
spec_tbl_df [32 × 11] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ mpg : chr [1:32] "21" "21" "22.8" "21.4" ...
 $ cyl : chr [1:32] "6" "6" "4" "6" ...
 $ disp: chr [1:32] "160" "160" "108" "258" ...
 $ hp  : chr [1:32] "110" "110" "93" "110" ...
 $ drat: chr [1:32] "3.9" "3.9" "3.85" "3.08" ...
 $ wt  : chr [1:32] "2.62" "2.875" "2.32" "3.215" ...
 $ qsec: chr [1:32] "16.46" "17.02" "18.61" "19.44" ...
 $ vs  : chr [1:32] "0" "0" "1" "1" ...
 $ am  : chr [1:32] "1" "1" "1" "0" ...
 $ gear: chr [1:32] "4" "4" "4" "3" ...
 $ carb: chr [1:32] "4" "4" "1" "1" ...
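
The same idea can be applied to the import loop from the question. The following is only a minimal sketch: the temporary directory, the file names, and the column names (id, code, value) are made up for illustration so that the example runs on its own, and lapply is used in place of the for loop.

library(readr)

# Hypothetical setup: write two small semicolon-delimited files to a
# temporary directory so the example is self-contained.
tmp <- file.path(tempdir(), "csv_demo")
dir.create(tmp, showWarnings = FALSE)
writeLines(c("id;code;value", "a;08400711;1,5", "b;08400711;2,5"),
           file.path(tmp, "file1.csv"))
writeLines(c("id;code;value", "c;08400712;3,5"),
           file.path(tmp, "file2.csv"))

# List the csv files; the pattern is a regular expression, so the dot is escaped.
listcsv <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Read every file with all columns forced to character, so readr does no
# type guessing and values such as 08400711 are kept exactly as written.
ldf <- lapply(listcsv, function(f) {
  as.data.frame(read_csv2(f, col_types = cols(.default = col_character())))
})

str(ldf[[1]])

Columns that really are numeric can then be converted afterwards, one column at a time, once the identifier-like columns are safely stored as text.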