Cannot read csv file with R

Time:06-08

I have a csv file that looks like:

,,,,,,,,
,,,,a,b,c,d,e
,,,"01.01.2022, 00:00 - 01:00",82.7,57.98,0.0,0.0,0.0
,,,"01.01.2022, 01:00 - 02:00",87.6,50.05,15.0,25.570000000000004,383.55000000000007
,,,"01.01.2022, 02:00 - 03:00",87.6,41.33,0.0,0.0,0.0

I want to import the header row first, then the data, and finally assign the headers as the column names of the data table:

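file <- "path"

pnl <- read.csv(file, dec = ",")  #, skip = 1, header = TRUE)
# Read only the header row (second line of the file)
headers <- read.csv(file, skip = 1, header = FALSE, nrows = 1, as.is = TRUE)

# Read the data rows (third line onward)
df <- read.csv(file, skip = 2, header = FALSE, as.is = TRUE)

# or this
# df <- read.csv(file, skip = 2, header = FALSE, nrows = 1, dec = ".", sep = ",", quote = "\"")
# headers is a one-row data frame, so flatten it to a vector before assigning
colnames(df) <- unlist(headers)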

When I import the headers, I get multiple columns, one per header entry. However, when I import the table, all entries end up in a single column, exactly as they appear in the CSV file, instead of being split into multiple columns. How can I solve this with the read.csv() function?

CodePudding user response:

Like this?

data.table::fread(',,,,,,,,
                  ,,,,a,b,c,d,e
                  ,,,"01.01.2022, 00:00 - 01:00",82.7,57.98,0.0,0.0,0.0
                  ,,,"01.01.2022, 01:00 - 02:00",87.6,50.05,15.0,25.570000000000004,383.55000000000007
                  ,,,"01.01.2022, 02:00 - 03:00",87.6,41.33,0.0,0.0,0.0', 
                  skip = 1)

   V1 V2 V3                        V4    a     b  c     d      e
1: NA NA NA 01.01.2022, 00:00 - 01:00 82.7 57.98  0  0.00   0.00
2: NA NA NA 01.01.2022, 01:00 - 02:00 87.6 50.05 15 25.57 383.55
3: NA NA NA 01.01.2022, 02:00 - 03:00 87.6 41.33  0  0.00   0.00

CodePudding user response:

Without using any libraries:


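# One entry per column (9 in total); "NULL" drops the three empty leading columns.
# An unnamed colClasses vector is recycled, so with only 8 entries column e would be dropped.
colClasses <- c("NULL", "NULL", "NULL", "character", "numeric", "numeric", "numeric", "numeric", "numeric")

read.csv(file, header = TRUE, skip = 1, colClasses = colClasses)

#                         X.3    a     b  c     d      e
# 1 01.01.2022, 00:00 - 01:00 82.7 57.98  0  0.00   0.00
# 2 01.01.2022, 01:00 - 02:00 87.6 50.05 15 25.57 383.55
# 3 01.01.2022, 02:00 - 03:00 87.6 41.33  0  0.00   0.00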

You will want to rename the first column.
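
For completeness, here is a minimal base R sketch of the two-step approach from the question (headers first, then data), including the rename. It inlines the sample rows through the `text` argument of read.csv() so it runs without a file. The names `csv_text` and `interval` are assumptions made for illustration and do not come from the original post.

```r
# Sample data from the question, inlined so the sketch runs without a file.
csv_text <- ',,,,,,,,
,,,,a,b,c,d,e
,,,"01.01.2022, 00:00 - 01:00",82.7,57.98,0.0,0.0,0.0
,,,"01.01.2022, 01:00 - 02:00",87.6,50.05,15.0,25.570000000000004,383.55000000000007
,,,"01.01.2022, 02:00 - 03:00",87.6,41.33,0.0,0.0,0.0'

# Step 1: read only the header row (second line) as plain strings.
headers <- read.csv(text = csv_text, skip = 1, header = FALSE, nrows = 1,
                    colClasses = "character")
col_names <- unlist(headers[1, ], use.names = FALSE)

# Step 2: read the data rows (third line onward).
df <- read.csv(text = csv_text, skip = 2, header = FALSE)

# Step 3: drop columns that contain no data at all (the three leading ones here).
keep <- vapply(df, function(col) !all(is.na(col)), logical(1))
df <- df[, keep]
col_names <- col_names[keep]

# Step 4: give the unnamed date/time column a name ("interval" is hypothetical),
# then apply the header names to the data.
col_names[col_names == ""] <- "interval"
names(df) <- col_names

str(df)
```

With the sample above, this should leave a data frame with three rows and the columns interval, a, b, c, d and e. To use it on a real file, replace `text = csv_text` with the file path in both read.csv() calls.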
