Let's say I have the data below:
library(zoo)
Dates = seq(as.Date('2000-01-01'), as.Date('2005-12-31'), by = '6 months')
Data = rbind(data.frame(time = Dates, y = rnorm(length(Dates), 0, 10), month = as.factor(format(Dates, '%m')), type = 'A', M = log(12 + 0:11)),
data.frame(time = Dates, y = rnorm(length(Dates), 0, 10), month = as.factor(format(Dates, '%m')), type = 'B', M = log(3 + 0:11)),
data.frame(time = Dates, y = rnorm(length(Dates), 0, 10), month = as.factor(format(Dates, '%m')), type = 'C', M = log(2 + 0:11)),
data.frame(time = Dates[3:10], y = rnorm(8, 0, 10), month = as.factor(format(Dates[3:10], '%m')), type = 'D', M = log(10 + 0:7)))
XX = zoo(rt(length(Dates), 2, 0), Dates)
And a hypothetical model:
y[t, type] = Beta[0] + Beta[1] * xx[t] + Beta[2] * type + Beta[3] * month + Beta[4] * M[t, type] + error
I am trying to use the lm() function to estimate the parameters of the above model from this data, but I am not sure how to express the equation in lm().
Is it possible to use lm() for this model? What are the alternatives?
CodePudding user response:
This doesn't seem like a particularly unusual model specification. You want:
y[t, type] = Beta[0] + Beta[1] * xx[t] + Beta[2] * type +
Beta[3] * month + Beta[4] * M[t, type] + error
Given the way your data are set up, you can think of this as indexing by the row number i:
y[t[i], type[i]] = ... + Beta[1] * xx[t[i]] + Beta[2] * type[i] + ...
+ Beta[4] * M[t[i], type[i]] + ...
This corresponds to the following formula in lm (the 1 stands for the intercept/Beta[0] term, which will be added by default in any case unless you add 0 or -1 to your formula):
y ~ 1 + xx + type + month + M
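As a rough sketch of how that formula could be fitted: the question stores xx in a separate zoo series (XX), so it first has to become a column of the data frame. The column name xx, the match() step, the make_block() helper, and the fixed seed below are assumptions for illustration, not part of the question's code.

```r
library(zoo)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
Dates <- seq(as.Date("2000-01-01"), as.Date("2005-12-31"), by = "6 months")

# Hypothetical helper that builds one per-type block of the question's data
make_block <- function(d, type, M) {
  data.frame(time = d, y = rnorm(length(d), 0, 10),
             month = factor(format(d, "%m")), type = type, M = M)
}

# M uses log(k + 0:n), the assumed reading of the question's data code
Data <- rbind(
  make_block(Dates, "A", log(12 + 0:11)),
  make_block(Dates, "B", log(3 + 0:11)),
  make_block(Dates, "C", log(2 + 0:11)),
  make_block(Dates[3:10], "D", log(10 + 0:7))
)
XX <- zoo(rt(length(Dates), 2, 0), Dates)

# Attach xx[t] to every row by matching the row's date against the zoo index
Data$xx <- coredata(XX)[match(Data$time, index(XX))]
Data$type <- factor(Data$type)

fit <- lm(y ~ 1 + xx + type + month + M, data = Data)
coef(fit)
```

Calling summary(fit) on the result gives standard errors and tests for each coefficient.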
The one thing that doesn't match your desired specification is that, because type is a categorical variable (factor) with more than two levels, there won't be a single parameter Beta[2]: instead, R will internally convert type to a series of (n_level-1) dummy variables (search for questions/material about "contrasts" to understand this process better).
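To see the dummy-variable expansion directly, here is a small sketch on a toy factor (the vector below is made up for illustration and assumes R's default treatment contrasts are in effect):

```r
# Toy factor with four levels, mirroring type = A, B, C, D
type <- factor(c("A", "B", "C", "D", "A", "B"))

# With default treatment contrasts, level "A" is the baseline and
# each remaining level gets its own 0/1 indicator column
X <- model.matrix(~ type)
colnames(X)      # "(Intercept)" "typeB" "typeC" "typeD"

# The coding table that maps each level to its (n_level - 1) indicators
contrasts(type)
```

Each type coefficient reported by lm is then the difference between that level and the baseline level, so four levels yield three parameters rather than one.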