I am trying to run this code.
library(bestNormalize)
# Load data
data("cars", package = "datasets")
# Fit linear model
lm_cars <- lm(dist ~ speed, data = cars)
# Transform dependent variable using a maximum likelihood approach
yeojohnson(object = lm_cars, plotit = FALSE)
For the yeojohnson call I get this error message:
Error in stopifnot(is.numeric(x)) :
argument "x" is missing, with no default
The variables in the lm function should both be numeric so I am unsure what I need to change. What do I have to do for the code to work?
Answer:
You appear to be trying to pass a linear regression model to yeojohnson. That doesn't work because yeojohnson expects a numeric vector, not a model object. From help("yeojohnson"):
Perform a Yeo-Johnson Transformation and center/scale a vector to attempt normalization
Instead, you need to pass each column of the original data:
# \(x) is the R >= 4.1 shorthand for function(x)
trans.cars <- sapply(cars, \(x) yeojohnson(x)$x.t)
trans.cars
# speed dist
# [1,] -2.12619875 -2.37119773
# [2,] -2.12619875 -1.53126570
# [3,] -1.57908522 -2.10053113
# [4,] -1.57908522 -0.76888016
# [5,] -1.39459473 -1.11361537
#...
From the Value section of help("yeojohnson"):
Value
A list of class yeojohnson with elements
x.t transformed original data
So you see that we will need to access the $x.t element of the list returned by yeojohnson.
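To make the structure of the returned object concrete, here is a minimal sketch that fits the transformation on a single column and looks inside the result. The name yj_dist is only illustrative, and the exact set of elements may vary between bestNormalize versions, so treat names(yj_dist) as the authoritative list on your machine.

```r
library(bestNormalize)

data("cars", package = "datasets")

# Fit the transformation on one numeric vector and keep the whole object
yj_dist <- yeojohnson(cars$dist)

class(yj_dist)      # S3 class of the returned list
names(yj_dist)      # element names, including "x.t"
yj_dist$lambda      # estimated Yeo-Johnson parameter
head(yj_dist$x.t)   # transformed (and, by default, standardized) values

# predict() without newdata returns the transformed training data
all.equal(predict(yj_dist), yj_dist$x.t)
```

Keeping the whole object rather than only $x.t is useful if you later want to transform new values or undo the transformation.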
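Note that, as @zx8754 mentioned in the comments, the first argument is x=, not object=. Because object= does not match any named argument, it is absorbed by ... and x is left missing, which is exactly what the error message reports. However, you need not name the argument at all.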
Now you can run the linear model on the transformed data:
lm_cars <- lm(dist ~ speed, data = as.data.frame(trans.cars))
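If you want predictions in the original units, one possible approach is to keep the fitted transformation objects and use their predict() methods, which accept newdata and inverse arguments. The sketch below assumes that workflow. The names yj_speed, yj_dist, trans_cars, lm_trans and new_speed are illustrative, and the new speed values are made up for the example.

```r
library(bestNormalize)

data("cars", package = "datasets")

# Keep one transformation object per column so each can be reused later
yj_speed <- yeojohnson(cars$speed)
yj_dist  <- yeojohnson(cars$dist)

trans_cars <- data.frame(speed = yj_speed$x.t, dist = yj_dist$x.t)
lm_trans <- lm(dist ~ speed, data = trans_cars)

# Hypothetical new speeds on the original scale
new_speed <- c(10, 20)

# Transform the new predictor values with the fitted transformation
new_speed_t <- predict(yj_speed, newdata = new_speed)

# Predict on the transformed scale, then map back to the original units
pred_t <- predict(lm_trans, newdata = data.frame(speed = new_speed_t))
pred_dist <- predict(yj_dist, newdata = pred_t, inverse = TRUE)
pred_dist
```

Because the back-transformation is nonlinear, the values in pred_dist are not the conditional mean of dist on the original scale, so interpret them with that in mind.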