Add filename to a new column when using map_df in R

Time:09-07

Is there a quick and easy way, using dplyr, to add a column called 'site_id' that is filled with the number in each file's name when I use map_df from the purrr package to read several files into one data frame?

For example, my.files holds the paths of two csv files: "H:/Documents/2015.csv" and "H:/Documents/2021.csv"

# pattern is a regular expression, not a glob, so match the extension explicitly
my.files <- list.files(my.path, pattern = "\\.csv$", full.names = TRUE)

I then use map_df to bring all the data into one data frame, but I would like an additional column called 'site_id' that labels every row with the name of the file it came from, e.g. 2015 or 2021.

I currently merge the .csv files together with this code:

temp.df <- my.files %>% map_df(~read.csv(., skip = 15))

I expect mutate could help, but I am unsure how it would work:

temp.df <- my.files %>% map_df(~read.csv(., skip = 15) %>%
 mutate(site_id = ????))

Any help is much appreciated.

CodePudding user response:

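We can use imap_dfr if we want to use mutate. It passes each element as .x and its name as .y, so first name each path after its file, without the directory or extension:

library(dplyr)
library(purrr)
# names become "2015", "2021", ... rather than the full paths
setNames(my.files, tools::file_path_sans_ext(basename(my.files))) %>%
   imap_dfr(~ read.csv(.x, skip = 15) %>%
              mutate(site_id = .y))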

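Or specify .id in map_dfr, which fills the new column with the names of the input vector:

# name each path after its file so site_id is "2015", not the full path
setNames(my.files, tools::file_path_sans_ext(basename(my.files))) %>%
     map_dfr(read.csv, skip = 15, .id = "site_id")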
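To check the .id approach without the real data, here is a small self-contained sketch. The temp folder, the "metadata" filler lines and the `value` column are made up for illustration; only the 15 skipped lines and the 2015/2021 file names come from the question.

```r
library(purrr)

# Hypothetical stand-ins for the real files: two small CSVs in a temp folder,
# each with 15 filler lines before the header, mirroring skip = 15.
demo.path <- file.path(tempdir(), "site_id_demo")
dir.create(demo.path, showWarnings = FALSE)
for (site in c("2015", "2021")) {
  writeLines(
    c(rep("metadata", 15), "value", "1", "2"),
    file.path(demo.path, paste0(site, ".csv"))
  )
}

demo.files <- list.files(demo.path, pattern = "\\.csv$", full.names = TRUE)

# Name each path after its file, dropping the directory and the extension
names(demo.files) <- tools::file_path_sans_ext(basename(demo.files))

# .id turns the names of the input into the site_id column
demo.df <- map_dfr(demo.files, read.csv, skip = 15, .id = "site_id")
print(demo.df)
```

Each of the four rows should carry the site it came from in site_id, stored as character. Wrap it in as.integer() afterwards if you need a number.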

CodePudding user response:

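Using purrr & dplyr:

temp.df <- my.files %>%
  # name each path after its file (e.g. "2015") so .id yields the site number
  purrr::set_names(tools::file_path_sans_ext(basename(.))) %>% 
  purrr::map(~ read.csv(.x, skip = 15)) %>% 
  dplyr::bind_rows(.id = "site_id")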
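As far as I know, map_df/map_dfr are marked as superseded from purrr 1.0.0 onwards, and the suggested replacement is map() followed by list_rbind(). A minimal sketch of that variant, assuming purrr >= 1.0.0 is installed; the temp folder, filler lines and `value` column are again invented for the example:

```r
library(purrr)

# Hypothetical demo files: 15 filler lines, then a header and two rows
demo.path <- file.path(tempdir(), "list_rbind_demo")
dir.create(demo.path, showWarnings = FALSE)
for (site in c("2015", "2021")) {
  writeLines(
    c(rep("metadata", 15), "value", "10", "20"),
    file.path(demo.path, paste0(site, ".csv"))
  )
}

demo.files <- list.files(demo.path, pattern = "\\.csv$", full.names = TRUE)

new.df <- demo.files %>%
  # name each path after its file, without directory or extension
  set_names(tools::file_path_sans_ext(basename(.))) %>%
  # read every file into a named list of data frames
  map(~ read.csv(.x, skip = 15)) %>%
  # stack the list and copy the list names into site_id
  list_rbind(names_to = "site_id")

print(new.df)
```

The result has the same shape as with bind_rows(.id = "site_id"); the only difference is that the row-binding step comes from purrr itself.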