Subtract a vector from target columns for each data frame in a list

I have a list of data frames like:

totaldata <- list(structure(list(time = c(2, 3.9, 5.8, 7.8, 9.7, 11.7, 13.6, 
15.5, 17.5, 19.4), v = c(14.82, 14.804, 14.82, 14.82, 14.804, 
14.82, 14.812, 14.804, 14.8, 14.808), a = c(1.5, 1.476, 1.5, 
1.491, 1.452, 1.476, 1.478, 1.44, 1.454, 1.438), t1 = c(14.61, 
14.61, 14.61, 14.61, 14.61, 14.61, 14.61, 14.62, 14.62, 14.63
), t2 = c(14.63, 14.62, 14.62, 14.62, 14.62, 14.62, 14.62, 14.63, 
14.63, 14.64), t3 = c(14.63, 14.63, 14.63, 14.63, 14.63, 14.63, 
14.63, 14.63, 14.64, 14.65), t4 = c(14.65, 14.65, 14.65, 14.65, 
14.64, 14.64, 14.65, 14.65, 14.66, 14.67), t5 = c(14.65, 14.65, 
14.65, 14.65, 14.65, 14.65, 14.66, 14.66, 14.67, 14.69), t6 = c(14.63, 
14.63, 14.63, 14.63, 14.63, 14.63, 14.63, 14.64, 14.65, 14.66
), t7 = c(14.64, 14.64, 14.64, 14.64, 14.64, 14.64, 14.64, 14.64, 
14.65, 14.66), t8 = c(14.6, 14.6, 14.6, 14.6, 14.6, 14.6, 14.61, 
14.61, 14.62, 14.63)), row.names = c(NA, 10L), class = "data.frame"), 
    structure(list(time = c(21.4, 23.3, 25.3, 27.2, 29.2, 31.2, 
    33.1, 35.1, 37.1, 39), v = c(14.8, 14.804, 15.844, 15.848, 
    15.848, 15.852, 15.852, 15.848, 15.852, 15.852), a = c(1.442, 
    1.471, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002
    ), t1 = c(14.64, 14.65, 14.67, 14.7, 14.72, 14.75, 14.78, 
    14.82, 14.85, 14.89), t2 = c(14.65, 14.67, 14.69, 14.71, 
    14.74, 14.78, 14.82, 14.86, 14.9, 14.95), t3 = c(14.66, 14.68, 
    14.7, 14.73, 14.75, 14.79, 14.83, 14.86, 14.91, 14.95), t4 = c(14.68, 
    14.7, 14.73, 14.75, 14.79, 14.82, 14.86, 14.91, 14.95, 15
    ), t5 = c(14.7, 14.73, 14.75, 14.78, 14.81, 14.85, 14.89, 
    14.93, 14.97, 15.02), t6 = c(14.67, 14.69, 14.72, 14.74, 
    14.77, 14.8, 14.84, 14.88, 14.91, 14.95), t7 = c(14.67, 14.68, 
    14.7, 14.72, 14.75, 14.77, 14.8, 14.83, 14.86, 14.9), t8 = c(14.64, 
    14.66, 14.68, 14.71, 14.74, 14.77, 14.8, 14.84, 14.88, 14.92
    )), row.names = 11:20, class = "data.frame"))

I would like to do the following modification for each data frame in the list:

# target columns
columns = c(paste0('t', 1:8))
# calculate the mean of the given columns
colmeans = colMeans(DF[1:5, columns])
# subtract the means from these columns
DF[, columns] = t(t(DF[, columns]) - colmeans)

So I first take the mean of the first 5 rows of columns t1 to t8, then subtract that mean from the entire column.

To do this for each data frame in the list, I have tried the following:

for(i in totaldata){
  colmeans = colMeans(i[1:5,colnames])
  i = t(t(i[, paste0('t', 1:8)]) - colmeans)
}

But it doesn't seem to work, and actually, I don't really have a good idea of what I'm doing.

CodePudding user response:

## subtract the column mean (based on first 5 rows) from all target columns
demean <- function (DF, columns) {
  MEANS <- colMeans(DF[1:5, columns])
  DF[columns] <- data.frame(Map(`-`, DF[columns], MEANS))
  DF
}

## apply `demean` over `totaldata`, which is a list of data frames
newdata <- lapply(totaldata, demean, columns = paste0("t", 1:8))
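As an alternative sketch (not part of the answer above), base R's sweep() can express the same column-wise subtraction. The function name demean_sweep, the rows argument, and the toy data frame are made up for illustration only:

```r
## hypothetical variant of `demean` built on base R's sweep()
demean_sweep <- function (DF, columns, rows = 1:5) {
  ## column means computed over the reference rows only
  MEANS <- colMeans(DF[rows, columns, drop = FALSE])
  ## MARGIN = 2: subtract MEANS[j] from every row of column j
  DF[columns] <- sweep(as.matrix(DF[columns]), 2, MEANS, FUN = "-")
  DF
}

## toy data, invented for this example
toy <- data.frame(time = 1:6,
                  t1 = c(1, 2, 3, 4, 5, 10),
                  t2 = c(2, 2, 2, 2, 2, 8))
out <- demean_sweep(toy, c("t1", "t2"))

## after demeaning, the first 5 rows of each target column average to zero
colMeans(out[1:5, c("t1", "t2")])
```

Non-target columns such as time are left untouched, so it should drop into the lapply call in the same way as demean.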

Remark

You can also rely on your original code to define demean:

demean <- function (DF, columns) {
  colmeans = colMeans(DF[1:5, columns])
  DF[, columns] = t(t(DF[, columns]) - colmeans)
  DF  ## don't forget to return it
}
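The double transpose in this version is needed because R recycles a vector down the columns of a matrix. After t(), each target column becomes a row, so element j of colmeans lines up with column j. A tiny illustration with made-up numbers (m and mu are invented for this sketch):

```r
## two columns: (1, 2, 3) and (10, 20, 30)
m  <- matrix(c(1, 2, 3, 10, 20, 30), ncol = 2)
## one value to subtract per column
mu <- c(1, 10)

## plain subtraction recycles mu down each column, so values are mismatched
wrong <- m - mu

## transpose, subtract, transpose back: mu[j] is subtracted from column j
right <- t(t(m) - mu)

right[, 1]   ## 0 1 2
right[, 2]   ## 0 10 20
```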

Using a for-loop is also convenient here:

for (i in seq_along(totaldata)) {
  totaldata[[i]] <- demean(totaldata[[i]], paste0("t", 1:8))
}

Note that totaldata is overwritten by the end of the loop. By contrast, the lapply solution does not modify totaldata; it creates a new list, newdata.
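As for why the loop in the question appeared to do nothing: in for (i in totaldata), i holds a copy of each element, so assigning to i never changes totaldata. That loop also indexes with colnames, which is a base R function rather than the vector of target column names. A small sketch of the copy behaviour, using an invented list lst:

```r
## invented toy list
lst <- list(a = 1:3, b = 4:6)

## assigning to the loop variable leaves `lst` untouched
for (x in lst) {
  x <- x * 10
}
unchanged <- identical(lst, list(a = 1:3, b = 4:6))   ## TRUE

## indexing by position writes the result back into the list
for (i in seq_along(lst)) {
  lst[[i]] <- lst[[i]] * 10
}
lst$a   ## 10 20 30
```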