R: How to read multiple CSV files with the column names in row n and select certain columns from each file

I have 100 CSV files in the same folder, let's say path="D:\Data".

For each file I want to:

Step 1. read the file from row 12 since the column names are at row 12;

Step 2. select certain columns from the file, say the columns I want to keep are "Date", "Time", "Value";

Step 3. add the file name to the data as a new column. For example, for file1, whose name is "example 1.csv", I want file1$Name="example 1.csv"; similarly, for file2, whose name is "example 2.csv", file2$Name="example 2.csv", etc.

At this point there are 100 new data frames, each with 4 columns: "Date", "Time", "Value", "Name". Finally, rbind all 100 of them together.

I have no idea how to code these steps all together in R. Can anyone help? Thanks very much for your time.

CodePudding user response：

You could try something like this:

library(dplyr)
library(purrr)

# pattern is a regular expression, not a glob, so match a literal ".csv" at the end
list_of_files <- list.files(path = "D:/Data/", pattern = "\\.csv$", full.names = TRUE)

list_of_files %>%
  set_names() %>%
  map_dfr(~ readr::read_csv(.x,
                     skip = 11,        # skip rows 1-11 so that row 12 supplies the column names
                     col_names = TRUE
            ) %>% 
            select(Date, Time, Value) %>% 
# Alternatively you could use the .id argument in map_dfr for the filename
            mutate(Name = basename(.x)))  # file name without the directory part
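
The .id alternative mentioned in the comment could look roughly like the sketch below. It assumes every file has its header in row 12 and that the three columns parse to the same types in each file; the helper name read_all_csv and its header_row argument are made up for illustration.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: read every CSV in `dir`, keep three columns, and
# record the source file name in a `Name` column via map_dfr's .id argument.
read_all_csv <- function(dir, header_row = 12) {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

  files %>%
    set_names(basename(files)) %>%   # element names become the .id values
    map_dfr(
      ~ readr::read_csv(.x, skip = header_row - 1, show_col_types = FALSE) %>%
        select(Date, Time, Value),
      .id = "Name"                   # adds each element's name as a column
    ) %>%
    relocate(Name, .after = Value)   # column order: Date, Time, Value, Name
}
```

Calling read_all_csv("D:/Data") would then return one data frame with all files stacked. Naming the elements with basename() keeps only "example 1.csv" rather than the full path.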

CodePudding user response：

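You can do this very neatly with vroom. It accepts a vector of file paths in a single call, rather than having to read each file separately, and it can add the filename column itself:

library(vroom)

files <- list.files("D:/Data", pattern = "\\.csv$", full.names = TRUE)

# skip = 11 drops rows 1-11, so row 12 is read as the header
vroom(files, skip = 11, id = 'filename', col_select = c(Date, Time, Value, filename))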
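
One thing to watch: the id column holds the path exactly as it was passed in, so with full paths you get "D:/Data/example 1.csv" rather than "example 1.csv". A small sketch that wraps the call and trims it down to the bare file name in a Name column, assuming comma-delimited files; the helper name read_all_vroom is hypothetical:

```r
library(vroom)

# Hypothetical helper: read all CSVs in `dir` with one vroom() call and
# store the bare file name (no directory) in a `Name` column.
read_all_vroom <- function(dir) {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

  out <- vroom(files, delim = ",", skip = 11, id = "filename",
               col_select = c(Date, Time, Value, filename))

  out$filename <- basename(out$filename)          # drop the directory part
  names(out)[names(out) == "filename"] <- "Name"  # rename to match the question
  out
}
```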