How to rename all columns to the middle part between separators in R?

I'm looking for an easy way to rename my columns so that only the middle part of each name (the text between the two underscores) is kept. Here's some mock data.

dat <- data.frame(
  subject = paste("Subject", 1:10),
  CT_tib_all = round(rnorm(10, 0.25, 0.03), 2),
  CT_lum_all = round(rnorm(10, 0.25, 0.03), 2),
  CT_tho_all = round(rnorm(10, 0.25, 0.03), 2)
)

I'd like to go from this:

    subject CT_tib_all CT_lum_all CT_tho_all
1 Subject 1       0.25       0.27       0.26
2 Subject 2       0.24       0.19       0.21

To this:

    subject        tib        lum        tho
1 Subject 1       0.25       0.27       0.26
2 Subject 2       0.24       0.19       0.21

Thanks!

CodePudding user response：

There may be more elegant solutions, but try this:

colnames(dat)[-1] <- sapply(strsplit(colnames(dat[-1]), "_"), function(x) x[2])

#> dat
#      subject  tib  lum  tho
#1   Subject 1 0.17 0.21 0.20
#2   Subject 2 0.27 0.23 0.28
# ...

This leaves the first (subject) column alone. For the remaining columns, strsplit() splits each name at the underscores and sapply() picks the second piece, i.e. the text between the two underscores.
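
To see what happens step by step, here is a minimal sketch of the same idea on a plain character vector. The names nms, parts and middle are only illustrative, and vapply() is swapped in for sapply() so the result is guaranteed to be a character vector:

```r
# Column names as in the question (without the subject column)
nms <- c("CT_tib_all", "CT_lum_all", "CT_tho_all")

# strsplit() returns a list with one character vector per name,
# e.g. c("CT", "tib", "all") for the first one
parts <- strsplit(nms, "_")

# Take the second piece of each split name
middle <- vapply(parts, function(x) x[2], character(1))

print(middle)
```

Note that a name containing no underscore would have no second piece and would come back as NA, which is why the subject column is excluded.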

CodePudding user response：

With the {tidyverse} you can use dplyr::rename_with() and supply a function that splits at the _ and takes the 2nd element.

library(tidyverse)

dat <- data.frame(
  subject = paste("Subject", 1:10),
  CT_tib_all = round(rnorm(10, 0.25, 0.03), 2),
  CT_lum_all = round(rnorm(10, 0.25, 0.03), 2),
  CT_tho_all = round(rnorm(10, 0.25, 0.03), 2)
)

# create renaming function
f <- function(x) {
  x %>% 
    str_split("_") %>% 
    map(~.x[2]) %>% 
    unlist()
}

# rename with function at specified positions
dat %>% 
  rename_with(f, -1)
#>       subject  tib  lum  tho
#> 1   Subject 1 0.28 0.25 0.23
#> 2   Subject 2 0.26 0.25 0.26
#> 3   Subject 3 0.28 0.29 0.25
#> 4   Subject 4 0.30 0.26 0.24
#> 5   Subject 5 0.22 0.24 0.23
#> 6   Subject 6 0.26 0.28 0.29
#> 7   Subject 7 0.26 0.22 0.26
#> 8   Subject 8 0.29 0.26 0.25
#> 9   Subject 9 0.24 0.32 0.26
#> 10 Subject 10 0.21 0.23 0.27

Created on 2022-05-01 by the reprex package (v2.0.1)
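
One thing to keep in mind: the pipeline above only prints the renamed data frame, and dat itself keeps its old names unless the result is assigned. A small sketch of that, assuming a made-up two-row data frame and a base R splitting helper (middle_part and renamed are hypothetical names, not part of the answer above):

```r
library(dplyr)

# Small illustrative data frame with fixed values (not the random mock data)
dat <- data.frame(
  subject = c("Subject 1", "Subject 2"),
  CT_tib_all = c(0.25, 0.24),
  CT_lum_all = c(0.27, 0.19),
  CT_tho_all = c(0.26, 0.21)
)

# Hypothetical helper: split each name at "_" and keep the 2nd piece
middle_part <- function(x) {
  vapply(strsplit(x, "_"), function(p) p[2], character(1))
}

# rename_with() returns a new data frame, so store the result
renamed <- rename_with(dat, middle_part, .cols = -1)

names(dat)      # still the original names
names(renamed)  # subject plus the shortened names
```

To overwrite the original, assign back to the same name: dat <- rename_with(dat, middle_part, .cols = -1).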

CodePudding user response：

Another option is rename_with() combined with str_extract(): the regular expression extracts the text between the two underscores, and .cols = -1 excludes the first column from renaming.

library(dplyr)
library(stringr)

rename_with(dat, ~str_extract(.x, '(?<=_).*?(?=_)'), .cols = -1)
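
For anyone unsure what the pattern does, here is a quick sketch of it applied to the raw names (nms and extracted are just illustrative names). The lookbehind (?<=_) requires an underscore before the match, the lookahead (?=_) requires one after it, and the lazy .*? keeps the match as short as possible:

```r
library(stringr)

nms <- c("subject", "CT_tib_all", "CT_lum_all", "CT_tho_all")

# Text between the first underscore and the next one;
# names without an underscore do not match and give NA
extracted <- str_extract(nms, "(?<=_).*?(?=_)")

print(extracted)
#> [1] NA    "tib" "lum" "tho"
```

The NA for "subject" shows why that column has to be excluded with .cols = -1.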

You can also use purrr::map_chr() as follows:

library(purrr)

colnames(dat)[-1] <- map_chr(colnames(dat)[-1], ~strsplit(.x,'_')[[1]][2])