Splitting columns based on a specific value in the name with R-CodePudding

I have this table

abc[1]	abc[1]	abc[2]	abc[2]	abc[3]	abc[3]	def[1]	def[1]	def[2]	def[2]	def[3]	def[3]
One	Two	Three	One	Two	Three	One	Two	Three	One	Two	Three
One	Two	Three	One	Two	Three	One	Two	Three	One	Two	Three	One	Three	Two	One	Three	Two
One	Two	Three	One	Two	Three	One	Two	Three	One	Two	Three

It contains columns that have a name and then a number in brackets, like you see: [1], [2], [3].. This is just an example with numbers from 1 to 3, but it can be bigger, like a 100. I would like to have a script who selects and put in a new df all the columns containing in the name [1], [2], [3]...[n]. I know that I can select with tidy verse with select(contains("..")) but since the numbers in the brackets can change in my initial table (they can be 1 to 3 or 1 to 100) I need something applicable to different cases. I was thinking of maybe generating a string from 1:n, but the problem is that the file I have always have the number in brackets []. And I can't just use contain with the numbers because unfortunately the names of the columns contains characters and numbers. This is the result I am looking for:

New table 1

abc[1]	abc[1]	def[1]	def[1]
One	Two	One	Two
One	Two	One	Two
One	Two	One	Two
One	Two	One	Two

New table 2

abc[2]	abc[2]	def[2]	def[2]
One	Two	One	Two
One	Two	One	Two
One	Two	One	Two
One	Two	One	Two

New table 3

abc[3]	abc[3]	def[3]	def[3]
One	Two	One	Two
One	Two	One	Two
One	Two	One	Two
One	Two	One	Two

CodePudding user response：

You can use data.table, in the following way

library(data.table)

setDT(df)

lapply(1:3, \(i)  {
  cols = (1:ncol(df))[grepl(paste0("\\[",i,"\\]"),names(df))]
  df[,.SD, .SDcols = cols]
})

Output:

[[1]]
   abc[1] abc[1] def[1] def[1]
1:    One    Two    One    Two
2:    One    Two    One    Two
3:    One    Two    One    Two

[[2]]
   abc[2] abc[2] def[2] def[2]
1:  Three    One  Three    One
2:  Three    One  Three    One
3:  Three    One  Three    One

[[3]]
   abc[3] abc[3] def[3] def[3]
1:    Two  Three    Two  Three
2:    Two  Three    Two  Three
3:    Two  Three    Two  Three

Input:

df = structure(list(`abc[1]` = c("One", "One", "One"), `abc[1]` = c("Two", 
"Two", "Two"), `abc[2]` = c("Three", "Three", "Three"), `abc[2]` = c("One", 
"One", "One"), `abc[3]` = c("Two", "Two", "Two"), `abc[3]` = c("Three", 
"Three", "Three"), `def[1]` = c("One", "One", "One"), `def[1]` = c("Two", 
"Two", "Two"), `def[2]` = c("Three", "Three", "Three"), `def[2]` = c("One", 
"One", "One"), `def[3]` = c("Two", "Two", "Two"), `def[3]` = c("Three", 
"Three", "Three")), row.names = c(NA, -3L), class = "data.frame")