How can I write a dataframe to a csv after running scale() in R?

Time:09-17

I'm scaling one column in a dataset so that I can fit a linear model. However, when I try to write the data frame (with the scaled column) to a csv, it fails, because the scaled column has become a more complex object that carries center and scale attributes.

Can someone please show how to convert the scaled column to something that can be written to a csv? (And maybe explain why scale() needs to do it this way.)

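# load the packages used below
library(dplyr)    # %>% and mutate()
library(readr)    # write_csv()
library(ggplot2)  # ggplot()

# make a data frame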
testDF <- data.frame(x1 = c(1,2,2,3,2,4,4,5,6,15,36,42,11,12,23,24,25,66,77,18,9),
                     x2 = c(1,4,5,9,4,15,17,25,35,200,1297,1764,120,150,500,500,640,4200,6000,365,78))

# scale the x1 column
testDF <- testDF %>%
  mutate(x1_scaled = scale(x1, center = TRUE, scale = TRUE))

# write to csv doesn't work
write_csv(as.matrix(testDF), "testDF.csv")

# but plotting and lm do work
ggplot(testDF, aes(x1_scaled)) +
  geom_histogram(aes(y = ..density..), binwidth = 1)

Lm_scaled <- lm(x2 ~ x1_scaled, data = testDF)
plot(Lm_scaled)

CodePudding user response:

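scale() returns a one-column matrix, not a plain vector, which is why the new column cannot be written out directly. You can either extract the column from that matrix (for example with [, 1]) or wrap the call in as.vector() to drop the dim attribute: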

testDF <- testDF %>%
  mutate(x1_scaled = as.vector(scale(x1, center = TRUE, scale = TRUE)))

Compare the structure of the output without as.vector() and with as.vector():

> testDF %>%
    mutate(x1_scaled = scale(x1, center = TRUE, scale = TRUE)) %>% str
'data.frame':   21 obs. of  3 variables:
 $ x1       : num  1 2 2 3 2 4 4 5 6 15 ...
 $ x2       : num  1 4 5 9 4 15 17 25 35 200 ...
 $ x1_scaled: num [1:21, 1] -0.824 -0.776 -0.776 -0.729 -0.776 ...
  ..- attr(*, "scaled:center")= num 18.4
  ..- attr(*, "scaled:scale")= num 21.2
> testDF %>%
    mutate(x1_scaled = as.vector(scale(x1, center = TRUE, scale = TRUE))) %>% str
'data.frame':   21 obs. of  3 variables:
 $ x1       : num  1 2 2 3 2 4 4 5 6 15 ...
 $ x2       : num  1 4 5 9 4 15 17 25 35 200 ...
 $ x1_scaled: num  -0.824 -0.776 -0.776 -0.729 -0.776 ...
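
As for why scale() does it this way: it is written for matrices, so even a single vector comes back as a one-column matrix, and the centering and scaling values are stored as attributes so the transformation can be undone later. Since as.vector() drops those attributes, you may want to save them first. A minimal base R sketch (the variable names and the short example vector are made up for illustration):

```r
x <- c(1, 2, 2, 3, 2, 4)

s <- scale(x, center = TRUE, scale = TRUE)

# scale() returns a one-column matrix, not a plain vector
is.matrix(s)
dim(s)

# save the attributes before as.vector() drops them
x_center <- attr(s, "scaled:center")  # mean of x
x_scale  <- attr(s, "scaled:scale")   # standard deviation of x

# plain numeric vector, safe to store as a data frame column
x_scaled <- as.vector(s)

# the saved values let you convert back to the original units
x_back <- x_scaled * x_scale + x_center
all.equal(x_back, x)
```

If you never need to go back to the original units, you can skip saving the attributes and just keep the as.vector() call.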

CodePudding user response:

You can simply convert the scaled column to a plain numeric vector with base R's as.numeric() and then write out the data frame:

testDF$x1_scaled <- as.numeric(testDF$x1_scaled)
write_csv(testDF, "testDF.csv")
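
To check that the converted column really survives a round trip to disk, here is a minimal sketch that uses only base R (write.csv() and read.csv() instead of readr's write_csv()). The small data frame and the temporary file path are assumptions made for illustration, not part of the original question:

```r
# small example data frame (illustrative values)
df <- data.frame(x1 = c(1, 2, 2, 3, 2, 4))

# as.numeric() strips the dim and scaled:* attributes from the scale() result
df$x1_scaled <- as.numeric(scale(df$x1))

# write to a temporary file (hypothetical location) and read it back
path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE)
df_back <- read.csv(path)

# the scaled column comes back as an ordinary numeric column
str(df_back)
all.equal(df$x1_scaled, df_back$x1_scaled)
```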