I have a list of data frames like:
totaldata <- list(structure(list(time = c(2, 3.9, 5.8, 7.8, 9.7, 11.7, 13.6,
15.5, 17.5, 19.4), v = c(14.82, 14.804, 14.82, 14.82, 14.804,
14.82, 14.812, 14.804, 14.8, 14.808), a = c(1.5, 1.476, 1.5,
1.491, 1.452, 1.476, 1.478, 1.44, 1.454, 1.438), t1 = c(14.61,
14.61, 14.61, 14.61, 14.61, 14.61, 14.61, 14.62, 14.62, 14.63
), t2 = c(14.63, 14.62, 14.62, 14.62, 14.62, 14.62, 14.62, 14.63,
14.63, 14.64), t3 = c(14.63, 14.63, 14.63, 14.63, 14.63, 14.63,
14.63, 14.63, 14.64, 14.65), t4 = c(14.65, 14.65, 14.65, 14.65,
14.64, 14.64, 14.65, 14.65, 14.66, 14.67), t5 = c(14.65, 14.65,
14.65, 14.65, 14.65, 14.65, 14.66, 14.66, 14.67, 14.69), t6 = c(14.63,
14.63, 14.63, 14.63, 14.63, 14.63, 14.63, 14.64, 14.65, 14.66
), t7 = c(14.64, 14.64, 14.64, 14.64, 14.64, 14.64, 14.64, 14.64,
14.65, 14.66), t8 = c(14.6, 14.6, 14.6, 14.6, 14.6, 14.6, 14.61,
14.61, 14.62, 14.63)), row.names = c(NA, 10L), class = "data.frame"),
structure(list(time = c(21.4, 23.3, 25.3, 27.2, 29.2, 31.2,
33.1, 35.1, 37.1, 39), v = c(14.8, 14.804, 15.844, 15.848,
15.848, 15.852, 15.852, 15.848, 15.852, 15.852), a = c(1.442,
1.471, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002
), t1 = c(14.64, 14.65, 14.67, 14.7, 14.72, 14.75, 14.78,
14.82, 14.85, 14.89), t2 = c(14.65, 14.67, 14.69, 14.71,
14.74, 14.78, 14.82, 14.86, 14.9, 14.95), t3 = c(14.66, 14.68,
14.7, 14.73, 14.75, 14.79, 14.83, 14.86, 14.91, 14.95), t4 = c(14.68,
14.7, 14.73, 14.75, 14.79, 14.82, 14.86, 14.91, 14.95, 15
), t5 = c(14.7, 14.73, 14.75, 14.78, 14.81, 14.85, 14.89,
14.93, 14.97, 15.02), t6 = c(14.67, 14.69, 14.72, 14.74,
14.77, 14.8, 14.84, 14.88, 14.91, 14.95), t7 = c(14.67, 14.68,
14.7, 14.72, 14.75, 14.77, 14.8, 14.83, 14.86, 14.9), t8 = c(14.64,
14.66, 14.68, 14.71, 14.74, 14.77, 14.8, 14.84, 14.88, 14.92
)), row.names = 11:20, class = "data.frame"))
I would like to do the following modification for each data frame in the list:
# target columns
columns = c(paste0('t', 1:8))
# calculate the mean of the given columns
colmeans = colMeans(DF[1:5, columns])
# subtract the means from these columns
DF[, columns] = t(t(DF[, columns]) - colmeans)
So I first take the mean of the first 5 rows of each of the columns t1 to t8, then subtract that mean from the entire column.
To do this for each data frame in the list, I have tried the following:
for(i in totaldata){
colmeans = colMeans(i[1:5,colnames])
i = t(t(i[, paste0('t', 1:8)]) - colmeans)
}
But it doesn't seem to work, and I don't really have a good idea of what I'm doing.
CodePudding user response:
## subtract the column mean (based on first 5 rows) from all target columns
demean <- function (DF, columns) {
MEANS <- colMeans(DF[1:5, columns])
DF[columns] <- data.frame(Map(`-`, DF[columns], MEANS))
DF
}
## apply `demean` over `totaldata`, which is a list of data frames
newdata <- lapply(totaldata, demean, columns = paste0("t", 1:8))
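As a quick sanity check, here is a minimal sketch on made-up data. The list toy and its values are hypothetical and only there to keep the arithmetic easy to follow; demean is repeated so the snippet runs on its own. After the call, the first five rows of every target column should average to roughly zero, and columns that were not targeted should be unchanged.

```r
## hypothetical toy data: two small data frames with target columns t1 and t2
toy <- list(
  data.frame(time = 1:6,  t1 = c(1, 2, 3, 4, 5, 9), t2 = c(10, 10, 10, 10, 10, 16)),
  data.frame(time = 7:12, t1 = c(2, 2, 2, 2, 2, 5), t2 = c(0, 1, 2, 3, 4, 5))
)

## same definition as above, repeated so this snippet is self-contained
demean <- function (DF, columns) {
  MEANS <- colMeans(DF[1:5, columns])
  DF[columns] <- data.frame(Map(`-`, DF[columns], MEANS))
  DF
}

toy_new <- lapply(toy, demean, columns = c("t1", "t2"))

## means of the first five rows, one column of results per data frame
check <- sapply(toy_new, function(DF) colMeans(DF[1:5, c("t1", "t2")]))

## every entry should be zero up to floating-point error
stopifnot(all(abs(check) < 1e-12))

## the untouched column is left as it was
stopifnot(identical(toy_new[[1]]$time, toy[[1]]$time))
```

The same kind of check can be run on newdata with columns = paste0("t", 1:8).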
Remark
You can also rely on your original code to define demean:
demean <- function (DF, columns) {
colmeans = colMeans(DF[1:5, columns])
DF[, columns] = t(t(DF[, columns]) - colmeans)
DF ## don't forget to return it
}
Using a for-loop is also convenient here:
for (i in seq_along(totaldata)) {
totaldata[[i]] <- demean(totaldata[[i]], paste0("t", 1:8))
}
Note that totaldata is overwritten by the end of the loop. By contrast, the lapply solution does not modify totaldata and creates a new list, newdata.
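A likely reason the loop in the question had no visible effect: in for (i in totaldata), the variable i holds a copy of each element, so assigning to i never changes totaldata itself. (That loop also indexes with colnames, which is a base R function and not the vector of column names.) The sketch below illustrates the copy behaviour on a hypothetical toy list x; it is not the original data.

```r
## hypothetical toy list, used only to illustrate the copy behaviour
x <- list(a = 1:3, b = 4:6)

## assigning to the loop variable changes a local copy, not the list
for (i in x) {
  i <- i * 10
}
stopifnot(identical(x$a, 1:3))            # x is unchanged

## indexing into the list and assigning back does change it
for (k in seq_along(x)) {
  x[[k]] <- x[[k]] * 10
}
stopifnot(identical(x$a, c(10, 20, 30)))  # x now holds the new values
```

This is why the working loop assigns to totaldata[[i]] instead of to the loop variable.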