I have this table
abc[1] | abc[1] | abc[2] | abc[2] | abc[3] | abc[3] | def[1] | def[1] | def[2] | def[2] | def[3] | def[3] | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
One | Two | Three | One | Two | Three | One | Two | Three | One | Two | Three | ||||||
One | Two | Three | One | Two | Three | One | Two | Three | One | Two | Three | One | Three | Two | One | Three | Two |
One | Two | Three | One | Two | Three | One | Two | Three | One | Two | Three |
It contains columns that have a name and then a number in brackets, like you see: [1], [2], [3].. This is just an example with numbers from 1 to 3, but it can be bigger, like a 100. I would like to have a script who selects and put in a new df all the columns containing in the name [1], [2], [3]...[n]. I know that I can select with tidy verse with select(contains("..")) but since the numbers in the brackets can change in my initial table (they can be 1 to 3 or 1 to 100) I need something applicable to different cases. I was thinking of maybe generating a string from 1:n, but the problem is that the file I have always have the number in brackets []. And I can't just use contain with the numbers because unfortunately the names of the columns contains characters and numbers. This is the result I am looking for:
New table 1
abc[1] | abc[1] | def[1] | def[1] |
---|---|---|---|
One | Two | One | Two |
One | Two | One | Two |
One | Two | One | Two |
One | Two | One | Two |
New table 2
abc[2] | abc[2] | def[2] | def[2] |
---|---|---|---|
One | Two | One | Two |
One | Two | One | Two |
One | Two | One | Two |
One | Two | One | Two |
New table 3
abc[3] | abc[3] | def[3] | def[3] |
---|---|---|---|
One | Two | One | Two |
One | Two | One | Two |
One | Two | One | Two |
One | Two | One | Two |
CodePudding user response:
You can use data.table
, in the following way
library(data.table)
setDT(df)
lapply(1:3, \(i) {
cols = (1:ncol(df))[grepl(paste0("\\[",i,"\\]"),names(df))]
df[,.SD, .SDcols = cols]
})
Output:
[[1]]
abc[1] abc[1] def[1] def[1]
1: One Two One Two
2: One Two One Two
3: One Two One Two
[[2]]
abc[2] abc[2] def[2] def[2]
1: Three One Three One
2: Three One Three One
3: Three One Three One
[[3]]
abc[3] abc[3] def[3] def[3]
1: Two Three Two Three
2: Two Three Two Three
3: Two Three Two Three
Input:
df = structure(list(`abc[1]` = c("One", "One", "One"), `abc[1]` = c("Two",
"Two", "Two"), `abc[2]` = c("Three", "Three", "Three"), `abc[2]` = c("One",
"One", "One"), `abc[3]` = c("Two", "Two", "Two"), `abc[3]` = c("Three",
"Three", "Three"), `def[1]` = c("One", "One", "One"), `def[1]` = c("Two",
"Two", "Two"), `def[2]` = c("Three", "Three", "Three"), `def[2]` = c("One",
"One", "One"), `def[3]` = c("Two", "Two", "Two"), `def[3]` = c("Three",
"Three", "Three")), row.names = c(NA, -3L), class = "data.frame")