I'm scaling one column in a dataset with the intention of fitting a linear model. However, when I try to write the dataframe (with scaled column) to a csv, it doesn't work because the scaled column became complex with center and scale attributes.
Can someone please indicate how to convert the scaled column to something that can write to a csv? (and maybe why scale() needs to do it this way.)
# make a data frame
testDF <- data.frame(x1 = c(1,2,2,3,2,4,4,5,6,15,36,42,11,12,23,24,25,66,77,18,9),
x2 = c(1,4,5,9,4,15,17,25,35,200,1297,1764,120,150,500,500,640,4200,6000,365,78))
# scale the x1 attribute
testDF <- testDF %>%
mutate(x1_scaled = scale(x1, center = TRUE, scale = TRUE))
# write to csv doesn't work
write_csv(as.matrix(testDF), "testDF.csv")
# but plotting and lm do work
ggplot(testDF, aes(x1_scaled))
geom_histogram(aes(y = ..density..),binwidth = 1)
Lm_scaled <- lm(x2 ~ x1_scaled, data = testDF)
plot(Lm_scaled)
CodePudding user response:
scale
returns a matrix
output. We could extract the column or use as.vector
to remove the dim
attribute
testDF <- testDF %>%
mutate(x1_scaled = as.vector(scale(x1, center = TRUE, scale = TRUE)))
Check the str
ucture of the output without as.vector
and with as.vector
> testDF %>%
mutate(x1_scaled = scale(x1, center = TRUE, scale = TRUE)) %>% str
'data.frame': 21 obs. of 3 variables:
$ x1 : num 1 2 2 3 2 4 4 5 6 15 ...
$ x2 : num 1 4 5 9 4 15 17 25 35 200 ...
$ x1_scaled: num [1:21, 1] -0.824 -0.776 -0.776 -0.729 -0.776 ...
..- attr(*, "scaled:center")= num 18.4
..- attr(*, "scaled:scale")= num 21.2
> testDF %>%
mutate(x1_scaled = as.vector(scale(x1, center = TRUE, scale = TRUE))) %>% str
'data.frame': 21 obs. of 3 variables:
$ x1 : num 1 2 2 3 2 4 4 5 6 15 ...
$ x2 : num 1 4 5 9 4 15 17 25 35 200 ...
$ x1_scaled: num -0.824 -0.776 -0.776 -0.729 -0.776 ...
CodePudding user response:
You can simply convert the scale column to numeric in base R and write out the dataframe:
testDF$x1_scaled <- as.numeric(testDF$x1_scaled)
write_csv(testDF, "testDF.csv")