Yeo-Johnson transformation in R

Time:11-24

I am trying to run this code.

library(bestNormalize)

# Load data
data("cars", package = "datasets")

# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)

# Transform dependent variable using a maximum likelihood approach
yeojohnson(object = lm_cars, plotit = FALSE)

For the yeojohnson call I get this error message:

Error in stopifnot(is.numeric(x)) : 
  argument "x" is missing, with no default

The variables in the lm function should both be numeric so I am unsure what I need to change. What do I have to do for the code to work?

Answer:

You appear to be trying to pass a linear regression model to yeojohnson. That doesn't work because yeojohnson operates on a numeric vector, not on a fitted model. From help("yeojohnson"):

Perform a Yeo-Johnson Transformation and center/scale a vector to attempt normalization

Instead, you need to pass each column of the original data:

trans.cars <- sapply(cars, \(x) yeojohnson(x)$x.t)
trans.cars
#           speed        dist
# [1,] -2.12619875 -2.37119773
# [2,] -2.12619875 -1.53126570
# [3,] -1.57908522 -2.10053113
# [4,] -1.57908522 -0.76888016
# [5,] -1.39459473 -1.11361537
#...

From the Value section of help("yeojohnson"):

Value
A list of class yeojohnson with elements
x.t transformed original data

So you see that we will need to access the $x.t element of the list returned by yeojohnson.

Note that, as @zx8754 mentioned in the comments, the first argument is named x, not object. However, you need not name the argument at all; passing the vector positionally works.
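To see why the original call complains about a missing "x" rather than about the model object, here is a minimal base R sketch. The function yj_like is a hypothetical stand-in, not part of bestNormalize; it assumes only that yeojohnson takes x as its first argument and accepts extra arguments through `...`, which is consistent with the error shown in the question.

```r
# Hypothetical stand-in with the same argument shape assumed for yeojohnson():
# a first argument named x, plus `...` that swallows unknown named arguments.
yj_like <- function(x, ...) {
  stopifnot(is.numeric(x))
  x
}

# `object =` does not match `x`, so the value lands in `...` and x stays missing.
msg <- tryCatch(
  yj_like(object = c(1, 2, 3)),
  error = function(e) conditionMessage(e)
)
print(msg)

# Passing the vector positionally (or as x =) binds it to x as intended.
ok <- yj_like(c(1, 2, 3))
print(ok)
```

Even with the correct argument name, an lm object would still fail the is.numeric check, so the fix is to pass the numeric column itself.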

Now you can run the linear model on the transformed data:

lm_cars <- lm(dist ~ speed, data = as.data.frame(trans.cars))
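The comment in the question mentions transforming only the dependent variable. If that is the goal, a minimal sketch could look like the following. The names yj_dist, cars_t, dist_t, and fitted_dist are hypothetical, and the back-transformation assumes the predict method with inverse = TRUE that bestNormalize documents for yeojohnson objects.

```r
library(bestNormalize)

# Load data
data("cars", package = "datasets")

# Transform only the response; keep the fitted transformation object
yj_dist <- yeojohnson(cars$dist)

# Store the transformed response next to the untouched predictor
cars_t <- cars
cars_t$dist_t <- yj_dist$x.t

# Fit the linear model on the transformed response
lm_cars_t <- lm(dist_t ~ speed, data = cars_t)

# Fitted values live on the transformed scale; map them back to the
# original scale of dist (assumes predict(..., inverse = TRUE) is available)
fitted_t <- unname(fitted(lm_cars_t))
fitted_dist <- predict(yj_dist, newdata = fitted_t, inverse = TRUE)

head(fitted_dist)
```

Keeping yj_dist around, rather than only its $x.t element, is what makes it possible to return to the original units later.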