I have a data set with 1000 variables. The variables follow the naming pattern shown in the figure below ("SCORE.1", "SCORE.2", and so on).
Now I want to use a loop to standardize each of these 1000 variables while keeping their original names. That is, the new "SCORE.1" should be the standardized "SCORE.1", the new "SCORE.2" the standardized "SCORE.2", and so on.
How can I do this? Many thanks!
CodePudding user response:
Perhaps it would be better to keep the 'original' data (e.g. "df_1") and create a new dataframe (e.g. "df_2") with the transformed values, i.e.
library(tidyverse)
# Create some fake data
set.seed(123)
names <- paste("SCORE", 1:1000, sep = ".")
IDs <- 1:100
m <- matrix(sample(1:20, 10000, replace = TRUE), ncol = 1000, nrow = 100,
            dimnames = list(IDs, names))
df_1 <- as.data.frame(m)
head(df_1)
#> SCORE.1 SCORE.2 SCORE.3 SCORE.4 SCORE.5 SCORE.6 SCORE.7 SCORE.8 SCORE.9
#> 1 15 6 9 15 11 7 9 8 6
#> 2 19 16 16 19 15 4 16 20 4
#> 3 14 11 17 6 20 10 9 11 3
#> 4 3 4 13 16 2 17 2 18 14
#> 5 10 12 8 15 16 16 9 14 19
#> 6 18 14 7 19 19 8 11 3 14
# Transform the 'original' fake data into 'new' fake data:
# subtract each column's mean first, then divide by its standard deviation.
# The parentheses around (.x - mean(.x)) matter, because `/` binds tighter than `-`.
df_2 <- df_1 %>%
  mutate(across(everything(), ~ (.x - mean(.x)) / sd(.x)))
head(df_2)
# Sanity check: every standardized column should have mean ~0 and sd 1
summary(colMeans(df_2))
summary(sapply(df_2, sd))
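If overwriting the columns under their original names is what you want, as asked in the question, a plain base R loop is one option. The following is only a minimal sketch, assuming every column is numeric, has no missing values, and is not constant (a constant column has sd 0 and would give NaN). The small data frame `df` and the names `df_loop` and `df_apply` are made up for illustration:

```r
# Small stand-in for the real data (hypothetical: 10 rows, 5 SCORE columns)
set.seed(123)
df <- as.data.frame(matrix(sample(1:20, 50, replace = TRUE), nrow = 10,
                           dimnames = list(NULL, paste("SCORE", 1:5, sep = "."))))

# Option 1: explicit loop, overwriting each column under its original name
df_loop <- df
for (col in names(df_loop)) {
  x <- df_loop[[col]]
  df_loop[[col]] <- (x - mean(x)) / sd(x)
}

# Option 2: lapply() over the columns; assigning into `df_apply[]`
# keeps the data frame structure and the column names
df_apply <- df
df_apply[] <- lapply(df_apply, function(x) (x - mean(x)) / sd(x))

# Both approaches should agree
all.equal(df_loop, df_apply)
```

Either version keeps the names "SCORE.1", "SCORE.2", and so on. Working on a copy, as here, leaves the original values available. If the data contain missing values, `mean()` and `sd()` would need `na.rm = TRUE`.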
Does this answer your question?