Extracting column name from CSV filename during import

Time:02-27

I have 24 CSV files to import, each with three columns of data: type (CHR), buffer (INT), mean (NUM). I import these using:

data <- list.files(path = "...", pattern = "*.csv", full.names = TRUE) %>%
  lapply(read_csv) %>%
  bind_rows()

This creates a 'long' data table containing three columns of data.

To enable monthly analysis, I need to extract a date from the filename of each imported CSV and create either a 'wide' table (with a column for each date) or a 'long' table (with date as another column).

Although I've found quite a few suggested ways of doing this, I can't get anything to work (I'm relatively new to R) and have become a bit confused by the unfamiliar syntax.

The CSV filename is in the format ABC_YYYY-MM-DD.CSV

CodePudding user response:

This would create a long table with the date as a fourth column:

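library(tidyverse)

# list.files() takes a regular expression, so match the .csv extension explicitly
data <- list.files(path = "...", pattern = "\\.csv$", full.names = TRUE, ignore.case = TRUE) %>%
  # simplify = FALSE returns a list named by file path
  sapply(read_csv, simplify = FALSE) %>%
  # .x is each data frame, .y is its name (the file path)
  imap_dfr(~ .x %>%
    mutate(date = sub('.*(\\d{4}-\\d{2}-\\d{2}).*', '\\1', basename(.y))))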

sapply() with simplify = FALSE returns a list whose names are the file paths. imap_dfr() then combines all the data frames into one and adds a date column, with the date extracted from each list name.
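
For the 'wide' layout mentioned in the question, here is a minimal sketch that can be run without the real files. It assumes readr 2.0 or later, and it invents two small sample files in a temporary folder (the folder name, file names and values are hypothetical, chosen only to follow the ABC_YYYY-MM-DD.csv pattern). It also converts the extracted text to a Date with as.Date(), which the answer above does not do, and uses map() with bind_rows(.id = ...) as one possible alternative to imap_dfr().

```r
library(readr)
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical sample data: two small files in a temporary folder,
# named in the ABC_YYYY-MM-DD.csv format described in the question.
tmp_dir <- file.path(tempdir(), "csv_demo")
dir.create(tmp_dir, showWarnings = FALSE)
sample_data <- tibble(type = c("a", "b"), buffer = c(10L, 20L), mean = c(1.5, 2.5))
write_csv(sample_data, file.path(tmp_dir, "ABC_2021-01-31.csv"))
write_csv(sample_data, file.path(tmp_dir, "ABC_2021-02-28.csv"))

files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE, ignore.case = TRUE)

# Long table: name each path by itself, read every file, and let
# bind_rows() copy the list names into a temporary `file` column.
long <- files %>%
  set_names() %>%
  map(read_csv, show_col_types = FALSE) %>%
  bind_rows(.id = "file") %>%
  mutate(date = as.Date(sub(".*(\\d{4}-\\d{2}-\\d{2}).*", "\\1", basename(file)))) %>%
  select(-file)

# Wide table: one column of `mean` values per date,
# keyed by the remaining columns (type and buffer).
wide <- long %>%
  pivot_wider(names_from = date, values_from = mean)
```

With a real Date column, the long table can also be grouped by month (for example with format(date, "%Y-%m")), which may suit monthly analysis better than the wide layout.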
