Using purrr::map to rename columns based on another list in R

Time:10-23

I have multiple files and I want to rename the second column of each file with a name taken from the vector samples = c("sample1", "sample2"). As I am learning the purrr::map functions, I am struggling to do the renaming inside map.

Here is an example. Any help is greatly appreciated.

library(purrr)
library(data.table)
library(dplyr)

files <- paste0("file", 1:3, ".txt") 

## Create example files in a temp dir
temp <- tempdir()

walk(files, ~ write.csv(iris[1:2], file.path(temp, .x), row.names = FALSE))

files |> 
  map(~ fread(file.path(temp, .x)) %>% rename(test = 1, samples=2))

Of course, this does not work, but this is where I am so far.

CodePudding user response:

This is one way to do it:

We use map2() to loop over files and samples in parallel. For each file we first read in the data with fread(file.path(temp, .x)) and then pipe it into rename(., test = 1, !! sym(.y) := 2).

samples contains strings. We need to turn each string into a symbol with sym() (or alternatively as.name()) and unquote it with !!. When we use this kind of syntax on the left-hand side of an argument, we also need the walrus operator := instead of =.

samples=c("sample1","sample2", "sample3")

files |> 
  map2(samples, ~ fread(file.path(temp, .x)) %>% rename(., test = 1, !! sym(.y) := 2))
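To check the renaming step on its own, here is a minimal sketch that skips the file reading and works on in-memory copies of iris[1:2]. The names dfs and renamed are made up for this illustration and are not part of the original code.

```r
library(purrr)
library(dplyr)

# Hypothetical stand-ins for the data read from the three files
dfs <- list(iris[1:2], iris[1:2], iris[1:2])
samples <- c("sample1", "sample2", "sample3")

# .x is the data frame, .y is the matching sample name.
# sym() turns the string into a symbol, !! unquotes it,
# and := is needed because the left-hand side is unquoted.
renamed <- map2(dfs, samples, ~ rename(.x, test = 1, !!sym(.y) := 2))

# Inspect the resulting column names
map(renamed, names)
```

Each element of renamed should now have test as its first column and the corresponding sample name as its second.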

If you want to rename a different column in every data.frame, it's better to construct a list of lists as below and splice each sublist into rename() with !!!. (The example below uses the second column each time, but we could change that to any column number we want.)

samples = list(
  list("sample1" = 2),
  list("sample2" = 2),
  list("sample3" = 2)
)

files |> 
  map2(samples, ~ fread(file.path(temp, .x)) %>% rename(., test = 1, !!! .y))
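As a sketch of the "different column per data.frame" case, the example below uses in-memory data frames with three columns and a different position in each sublist. The names dfs, specs and renamed are hypothetical and only serve the illustration.

```r
library(purrr)
library(dplyr)

# Hypothetical stand-ins for the data read from the files
dfs <- list(iris[1:3], iris[1:3])

# One sublist per data frame: new name = column position
specs <- list(
  list(sample1 = 2),
  list(sample2 = 3)
)

# !!! splices each sublist into rename() as named arguments,
# so the first call becomes rename(.x, sample1 = 2)
renamed <- map2(dfs, specs, ~ rename(.x, !!!.y))

map(renamed, names)
```

A sublist can hold more than one entry, for example list(test = 1, sample1 = 2), in which case all of its entries are spliced into the same rename() call.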

Since you are using data.table to read in the data, we don't need dplyr::rename() to rename the columns. The case where you want to rename the second column of every file is especially easy with data.table::setnames():

samples = c("sample1", "sample2","sample3")

files |> 
  map2(samples, ~ fread(file.path(temp, .x)) %>% setnames(., 2, .y))
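One thing to keep in mind is that setnames() changes the data.table by reference. That is harmless when the table comes straight from fread(), but if the tables already exist in your session you may want to rename a copy instead. A minimal sketch, assuming in-memory tables (the names tables and renamed are made up for this example):

```r
library(purrr)
library(data.table)

# Hypothetical tables that already exist in the session
tables <- list(as.data.table(iris[1:2]), as.data.table(iris[1:2]))
samples <- c("sample1", "sample2")

# copy() protects the originals; setnames() renames column 2
# and returns the modified table invisibly, which map2() collects
renamed <- map2(tables, samples, ~ setnames(copy(.x), 2, .y))

map(renamed, names)   # second column renamed
map(tables, names)    # originals untouched
```

Without copy(), the tables in the original list would be renamed as well.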