Advice needed in how to implement the scale() function in R

Time:04-23

Context

As part of my studies in ecology, I am trying to follow the quantile regression method used in Karlsson et al. (2022) (source: https://arxiv.org/pdf/2202.02206.pdf). As I am facing some unexpected challenges (more details here if interested: https://stats.stackexchange.com/questions/572277/theory-understanding-behind-quantile-regression), I am trying to reproduce as much of their methodology as I can, in order to find solutions to my problems.

Here is how I assume their dataset is structured (it also has some covariate columns that I will not take into account):

individual year julian_day
1 y(1) j(1)
... ... ...
n y(n) j(n)

Basically, 1 row = 1 bird, the year it was recorded and the day it was recorded (julian day format).

I am trying to reproduce a step described in a sentence that I first ignored because I thought it was not important:

For all fitted models, year was centered around 2001.

It is the only information that they provide in the paper about centering the year column.

Question

The scale() function, by default, subtracts the mean from each individual observation and then divides by the standard deviation. By specifying scale=FALSE, we tell R not to divide by the standard deviation.
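To make that concrete, here is a minimal sketch comparing the two behaviours with their manual equivalents. The year values are made up for illustration and are not from the paper's data.

```r
# Hypothetical year values, for illustration only (not from the paper)
year <- c(1999, 2000, 2001, 2002, 2008)

# Default: centre on the mean, then divide by the standard deviation
z_default <- scale(year)

# scale = FALSE: centre on the mean only
z_centred <- scale(year, scale = FALSE)

# Manual equivalents of the two calls above
manual_default <- (year - mean(year)) / sd(year)
manual_centred <- year - mean(year)

# scale() returns a one-column matrix, so flatten it before comparing
all.equal(as.numeric(z_default), manual_default)
all.equal(as.numeric(z_centred), manual_centred)

# The value that was subtracted is stored as an attribute (here the mean, 2002)
attr(z_centred, "scaled:center")
```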

My question is straightforward: as nothing else is specified, what is the "common use" of the scale() function? Regarding the quoted sentence, what do you think the scale argument should be, FALSE or TRUE, and why?

Thank you very much for your help.

CodePudding user response:

I think you want to use I(year - 2001) in your model. That will centre the year around 2001. scale() centres around the mean, which may or may not be 2001, depending on the data. With scale=FALSE, only centring is done. With scale=TRUE, the centred variable is then divided by its standard deviation.
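A rough sketch of the difference, using simulated data and lm() purely as a stand-in so that no extra package is needed. The data frame dat, its values and the column year_c are all hypothetical. The same formula term should carry over to a quantile regression call such as quantreg::rq(), but that is not shown here.

```r
# Simulated data, for illustration only (not the paper's dataset)
set.seed(1)
dat <- data.frame(year = rep(1995:2010, each = 5))
dat$julian_day <- 120 - 0.3 * (dat$year - 2001) + rnorm(nrow(dat))

# Centre on a fixed reference year directly in the formula
fit_fixed <- lm(julian_day ~ I(year - 2001), data = dat)

# scale() centres on the sample mean, which here is 2002.5, not 2001
dat$year_c <- as.numeric(scale(dat$year, scale = FALSE))
fit_mean <- lm(julian_day ~ year_c, data = dat)

mean(dat$year)

# The slopes agree; only the intercepts differ, because each intercept
# is the fitted value at the respective centring point
coef(fit_fixed)
coef(fit_mean)

# scale() can also centre on a fixed value through its center argument
all.equal(as.numeric(scale(dat$year, center = 2001, scale = FALSE)),
          dat$year - 2001)
```

Centring on a fixed year only shifts the intercept, so the intercept reads as the fitted value in 2001, while dividing by the standard deviation would also change the units of the slope.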
