I want to loop through the datasets in a list using lapply and, within each dataset, through only those columns whose names are stored in a vector called vector_test. I would like to center these variables, i.e. subtract from each selected variable its weighted mean within that dataset.
Let's assume I have the following 3 datasets saved in a list:
df1 <- data.frame(v1 = c(1,2,3,4,5,6,7),
                  v2 = c(9,8,7,6,5,4,3),
                  v3 = c(4,5,6,7,4,4,3),
                  v4 = c(5,6,4,5,6,5,6))
df2 <- data.frame(v1 = c(1,5,3,4,9,6,7),
                  diff_var = c(1,3,4,6,2,3,4),
                  v2 = c(9,8,2,6,3,4,3),
                  v3 = c(4,5,6,7,3,4,3),
                  v4 = c(5,2,4,4,6,1,6))
df3 <- data.frame(v1 = c(1,5,8,4,2,6,1),
                  v2 = c(1,8,1,6,2,4,7),
                  v3 = c(1,5,2,5,3,4,3),
                  v4 = c(5,9,4,5,6,2,6))
test_liste <- list(df1, df2, df3)
Further, I have names of variables saved in a vector:
vector_test<-c("v3","v4")
I tried a for loop/sapply embedded in lapply, but I cannot figure out how to pick, in each dataset, only the variables whose names match those in the vector.
If any clarification or additional code is needed, please let me know!
Thanks in advance!
CodePudding user response:
Using lapply, you could do the following (note that this subtracts the plain, unweighted mean of each column):
lapply(test_liste, function(x) {
  # Center only the columns named in vector_test; other columns stay as they are
  x[vector_test] <- lapply(x[vector_test], function(col) col - mean(col))
  x
})
#> [[1]]
#> v1 v2 v3 v4
#> 1 1 9 -0.7142857 -0.2857143
#> 2 2 8 0.2857143 0.7142857
#> 3 3 7 1.2857143 -1.2857143
#> 4 4 6 2.2857143 -0.2857143
#> 5 5 5 -0.7142857 0.7142857
#> 6 6 4 -0.7142857 -0.2857143
#> 7 7 3 -1.7142857 0.7142857
#>
#> [[2]]
#> v1 diff_var v2 v3 v4
#> 1 1 1 9 -0.5714286 1
#> 2 5 3 8 0.4285714 -2
#> 3 3 4 2 1.4285714 0
#> 4 4 6 6 2.4285714 0
#> 5 9 2 3 -1.5714286 2
#> 6 6 3 4 -0.5714286 -3
#> 7 7 4 3 -1.5714286 2
#>
#> [[3]]
#> v1 v2 v3 v4
#> 1 1 1 -2.2857143 -0.2857143
#> 2 5 8 1.7142857 3.7142857
#> 3 8 1 -1.2857143 -1.2857143
#> 4 4 6 1.7142857 -0.2857143
#> 5 2 2 -0.2857143 0.7142857
#> 6 6 4 0.7142857 -3.2857143
#> 7 1 7 -0.2857143 0.7142857
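Since the question asks about a weighted mean and about matching names, here is a minimal sketch of how the same idea might be extended. The helper name center_cols, its argument w (one weight per row), and the extra name "not_there" are assumptions made for illustration; they do not appear in the question. intersect() keeps only the names that actually exist in a given dataset, and weighted.mean() from base R replaces mean().

```r
# Sample data from the question (df1 only, for brevity)
df1 <- data.frame(v1 = c(1,2,3,4,5,6,7),
                  v2 = c(9,8,7,6,5,4,3),
                  v3 = c(4,5,6,7,4,4,3),
                  v4 = c(5,6,4,5,6,5,6))
test_liste <- list(df1)

# "not_there" is a hypothetical name, added to show that missing columns are skipped
vector_test <- c("v3", "v4", "not_there")

# Hypothetical helper: centers the columns in `cols` that exist in `df`.
# `w` is an assumed weight vector with one weight per row; NULL means equal weights.
center_cols <- function(df, cols, w = NULL) {
  cols <- intersect(cols, names(df))      # keep only names present in this dataset
  if (length(cols) == 0) return(df)       # nothing to center
  if (is.null(w)) w <- rep(1, nrow(df))   # equal weights reproduce the plain mean
  df[cols] <- lapply(df[cols], function(col) col - weighted.mean(col, w))
  df
}

# Equal weights: same result as subtracting mean() for v3 and v4
res <- lapply(test_liste, center_cols, cols = vector_test)

# Illustrative weights only: v1 is used as the weight vector here as an assumption
res_w <- center_cols(df1, vector_test, w = df1$v1)
```

With equal weights this should match the output shown above for the first dataset. If every dataset has its own weights, one option would be to store them as a column in each data frame and pass that column as w inside the function given to lapply.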