Could you help me with this problem? I have a dataset whose column names are numeric values. Some of the columns are sequential. I would like to rename each sequential column so that it has the same name as the column where the sequence started.
Here is an example dataset similar to mine:
fake_dataset <- data.frame(sample = paste0("sample_", sample(1:100, replace = TRUE)),
                           "1678.47647" = runif(100, 1, 2),
                           "1679.84733" = runif(100, 1, 3),
                           "1680.87487" = runif(100, 2, 4),
                           "1800.35463" = runif(100, 1, 2),
                           "1811.47463" = runif(100, 2, 3),
                           "1823.52342" = runif(100, 2, 5)
)
colnames(fake_dataset) <- c("sample",
                            "1678.47647",
                            "1679.84733",
                            "1680.87487",
                            "1800.35463",
                            "1811.47463",
                            "1823.52342")
fake_dataset$sample <- NULL
My idea was to give each sequential column the same name as the column before it, like this:
test <- function(data){
  new_names <- c()
  counter <- 0
  for (i in as.integer(colnames(fake_dataset))){
    counter <- counter + 1
    if(as.character( as.integer( names( data[counter] ) )) == as.character( as.integer( names( data[counter] ) ) + 1) ) {
      print("same!\n")
      colname( data[, counter]) <- colnames( data[, counter + 1])
    }else{
      print("different!\n")
    }
  }
}
But I haven't managed to get it working yet. Could anyone help? Thank you for your time.
CodePudding user response:
We can convert the column names (colnames) to integer, take the difference (diff) between adjacent elements to create a grouping variable, use that in ave to select the first element of each group, and assign the result back as the column names.
v1 <- as.integer(colnames(fake_dataset))
grp <- cumsum(c(TRUE, diff(v1) != 1))
new <- ave(v1, grp, FUN = function(x) x[1])
colnames(fake_dataset) <- new
-output
> colnames(fake_dataset)
[1] "1678" "1678" "1678" "1800" "1811" "1823"
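To make the grouping step easier to follow, here is a minimal sketch that prints the intermediate values. It uses a standalone character vector, cols, which is a name introduced here for illustration and simply holds the same six column names as the example data.

```r
# Hypothetical standalone vector with the same names as fake_dataset's columns
cols <- c("1678.47647", "1679.84733", "1680.87487",
          "1800.35463", "1811.47463", "1823.52342")

v1 <- as.integer(cols)   # drops the decimals: 1678 1679 1680 1800 1811 1823
step <- diff(v1)         # gaps between neighbours: 1 1 120 11 12

# A new group starts at the first element and wherever the gap is not exactly 1
grp <- cumsum(c(TRUE, step != 1))   # 1 1 1 2 3 4

# Replace every element with the first value of its group
new <- ave(v1, grp, FUN = function(x) x[1])
print(new)               # 1678 1678 1678 1800 1811 1823
```

Note that "sequential" here means the integer parts differ by exactly 1; a different rule would need a different condition in place of step != 1.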
NOTE: data.frame, tibble, and data.table do not handle duplicate column names well. The names are typically changed to unique values in subsequent transformations via make.unique, i.e. by appending .1, .2 to the duplicates. A matrix, however, allows duplicate column names.
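As a small sketch of that difference (m, u and df are illustrative names, and whether a data frame keeps its duplicates depends on which operation is applied afterwards):

```r
# A matrix keeps duplicate column names as they are
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("1678", "1678", "1800")))
print(colnames(m))   # "1678" "1678" "1800"

# make.unique() appends .1, .2, ... to repeated names
u <- make.unique(c("1678", "1678", "1678", "1800"))
print(u)             # "1678" "1678.1" "1678.2" "1800"

# A data frame accepts duplicate names on assignment; print the names again
# after column subsetting to inspect what a later operation does with them
df <- data.frame(a = 1:2, b = 3:4, c = 5:6)
colnames(df) <- c("1678", "1678", "1800")
print(colnames(df))
print(colnames(df[, 1:2]))
```

If the renamed columns are meant to be combined afterwards (for example summed per group), working on a matrix or reshaping to long format avoids relying on duplicate names in a data frame.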