R: how to read multiple csv files with column name in row n and select certain columns from the file

Time:07-29

I have 100 CSV files in the same folder; let's say the path is "D:\Data".

For each file I want to:

Step 1. read the file starting from row 12, since the column names are in row 12;

Step 2. select certain columns from the file; let's say the column names I want to keep are "Date", "Time", "Value";

Step 3. add the file name to the data as a new column. For example, file1, whose name is "example 1.csv", should get file1$Name="example 1.csv", and similarly file2, whose name is "example 2.csv", should get file2$Name="example 2.csv", etc.

That gives 100 new data frames, each with 4 columns: "Date", "Time", "Value", "Name". Finally, I want to rbind all 100 of them together.

I have no idea how to code these steps all together in R. Can anyone help? Thanks very much for your time.

CodePudding user response:

You could try something like this:

# pattern is a regular expression, so match a literal ".csv" at the end of the name
list_of_files <- list.files(path = "D:/Data/", pattern = "\\.csv$", full.names = TRUE)

library(dplyr)
library(purrr)
list_of_files %>%
  set_names() %>%
  map_dfr(~ readr::read_csv(.x,
                     skip = 11,  # column names are in row 12, so skip the 11 lines above it
                     col_names = TRUE
            ) %>% 
            select(Date, Time, Value) %>% 
            # Alternatively you could use the .id argument in map_dfr for the filename
            mutate(filename = basename(.x)))
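
For comparison, the same three steps can be sketched in base R without any packages. This is only a minimal sketch under assumptions: the demo files, the temporary folder, the `Extra` column and the helper name `read_one` are all made up for illustration, and each file is assumed to have exactly 11 lines before the header row.

```r
# Hypothetical demo data: two small files in a temporary folder, each with
# 11 lines of preamble and the column names on line 12.
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:2) {
  writeLines(
    c(paste("preamble line", 1:11),
      "Date,Time,Value,Extra",
      sprintf("2021-07-2%d,10:00,%d,x", i, i),
      sprintf("2021-07-2%d,11:00,%d,y", i, i * 10)),
    file.path(demo_dir, sprintf("example %d.csv", i))
  )
}

# Hypothetical helper: read one file, keep three columns, tag it with its name
read_one <- function(path) {
  # skip = 11 drops the preamble so that line 12 is read as the header
  df <- read.csv(path, skip = 11, header = TRUE, stringsAsFactors = FALSE)
  df <- df[, c("Date", "Time", "Value")]   # keep only the wanted columns
  df$Name <- basename(path)                # file name without the folder
  df
}

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
combined <- do.call(rbind, lapply(files, read_one))
```

For the real data, `demo_dir` would be replaced by "D:/Data" and the loop that writes the demo files dropped.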

CodePudding user response:

You can do this very neatly with vroom. It accepts a vector of file paths rather than requiring each file to be read separately, and it can add the filename column itself:

library(vroom)

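# Full paths of all CSV files in the folder
files <- list.files("D:/Data", pattern = "\\.csv$", full.names = TRUE)

vroom(files, skip = 11, id = 'filename', col_select = c(Date, Time, Value, filename))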
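
If the id column ends up holding the path as it was passed in (which is the assumption here) rather than the bare file name the question asks for, it can be shortened afterwards with `basename()`. A minimal sketch, using a small hand-made data frame named `result` as a hypothetical stand-in for the combined output:

```r
# Hypothetical stand-in for the combined result, with full paths in `filename`
result <- data.frame(
  Value    = c(1, 2),
  filename = c("D:/Data/example 1.csv", "D:/Data/example 2.csv"),
  stringsAsFactors = FALSE
)

# Keep only the file name, dropping the folder part of the path
result$Name <- basename(result$filename)
```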