Home > Software engineering >  modify columns for a list of data frames
modify columns for a list of data frames

Time:06-12

I have multiple tables loaded using lapply. The last colname is not the same for all tables which makes it impossible to rbind the tables. I would like to make a new column that contains the colname in column 10 as ID, then rename the colname of column 10 to X.

file_list <- list.files(pattern = "edit")
tables <- lapply(file_list, read.csv, header = T, sep="\t")

colnames(tables[[984]])

 [1] "X.CHROM"                                      "POS"                                          "ID"                                          
 [4] "REF"                                          "ALT"                                          "QUAL"                                        
 [7] "FILTER"                                       "INFO"                                         "FORMAT"                                      
[10] "ID02020886"

My try:

tables <- map2(tables, file_list, ~cbind(.x, ID = .y)) #Does not work....

Desired out:

colnames(tables[[984]])

 [1] "X.CHROM"                                      "POS"                                          "ID"                                          
 [4] "REF"                                          "ALT"                                          "QUAL"                                        
 [7] "FILTER"                                       "INFO"                                         "FORMAT"                                      
[10] "X"                                            "ID"

CodePudding user response:

I would use a loop. I named the new column ID_new as another ID column is already there.

# sample data
tables <- lapply(c("ID1", "ID2", "ID3"), function(id) {
  cnames <- c("X.CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", 
              "FORMAT", id)
  as.data.frame(sapply(cnames, function(x) rnorm(5)))
})

# modify tables one by one
for (i in seq_along(tables)) {
  
  tables[[i]] <- cbind(tables[[i]], 
                       ID_new=colnames(tables[[i]])[10])
  colnames(tables[[i]])[10] <- 'X'
}

CodePudding user response:

Here is a solution, with built-in data set mtcars to test the code.

suppressPackageStartupMessages({
  library(dplyr)
  library(purrr)
})

file_list<- letters[1:2]
tables <- list(head(mtcars)[-1], head(mtcars)[-ncol(mtcars)])

tables %>%
  map2(file_list, ~cbind(.x, ID = .y)) %>%
  map( ~rename(.x, 'X' = names(.x)[10]))
#> [[1]]
#>                   cyl disp  hp drat    wt  qsec vs am gear X ID
#> Mazda RX4           6  160 110 3.90 2.620 16.46  0  1    4 4  a
#> Mazda RX4 Wag       6  160 110 3.90 2.875 17.02  0  1    4 4  a
#> Datsun 710          4  108  93 3.85 2.320 18.61  1  1    4 1  a
#> Hornet 4 Drive      6  258 110 3.08 3.215 19.44  1  0    3 1  a
#> Hornet Sportabout   8  360 175 3.15 3.440 17.02  0  0    3 2  a
#> Valiant             6  225 105 2.76 3.460 20.22  1  0    3 1  a
#> 
#> [[2]]
#>                    mpg cyl disp  hp drat    wt  qsec vs am X ID
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1 4  b
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1 4  b
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1 4  b
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0 3  b
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0 3  b
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0 3  b

Created on 2022-06-12 by the reprex package (v2.0.1)

  • Related