I was doing some analysis of my power bill and was trying to extract how much the air conditioning unit is using. By assuming that "normal" power use is fairly constant over the year and that the A/C is only on during the hotter months, I was able to estimate some separation.
In the figure there are two lines:
- the linear showing "normal use", cumulative
- the sigmoidal showing AC use, cumulative
In trying to fit a good model through this total cumulative use, the nls function in R gave the dreaded singular gradient matrix at initial parameter estimates. Using different software (JMP) I could get good initial parameters, which are reflected in the purple fit curve.
Any idea how I can use R to get parameter estimates without the error?
nls model
nls(elec_use_mean ~ b0 b1 * month a1 / (a2 a3 * exp(a4 * month)),
start = list(b0 = 100, b1 = 310, a1 = 1300, a2 = 0.3, a3 = 600, a4 = -1.2),
data = cumulative
)
Error in nlsModel(formula, mf, start, wts, scaleOffset = scOff, nDcentral = nDcntr) :
singular gradient matrix at initial parameter estimates
Data:
month,elec_use_mean
1,461.46
2,839.46
3,1197.92
4,1553.59
5,2093.34
6,3096.42
7,4353.67
8,5652.51
9,6729.84
10,7296.92
11,7634.34
12,8071.84
CodePudding user response:
An alternative way to parameterize a sigmoid curve is to use tanh
, which can produce a nice converging model with only 4 parameters rather than 6. The formula can also be written in such a way as to allow the parameters to have clear meanings that lend themselves to close initial estimates:
elec_use_mean ~ b1 * month 6 * b0 * (1 tanh(a1 * (month - a0)))
where:
- b0 is the average monthly air conditioner spend over the year
- b1 is the average monthly background electricity spend over the year
- a0 is the month of the year when air conditioner use is highest
- a1 is a measure of how "seasonal" air con use is, where 0 is not seasonal at all (each month uses 1/12 of the air con total), and 1 means about 76% of air con use is in the two hottest months.
model <- nls(elec_use_mean ~ b1 * month 6 * b0 * (1 tanh(a1 * (month - a0))),
start = list(b0 = 330, b1 = 300, a0 = 7, a1 = 0.5),
data = cumulative
)
summary(model)
#>
#> Formula: elec_use_mean ~ b1 * month 6 * b0 * (1 tanh(a1 * (month -
#> a0)))
#>
#> Parameters:
#> Estimate Std. Error t value Pr(>|t|)
#> b0 294.99540 16.88446 17.47 1.18e-07 ***
#> b1 379.93397 16.34690 23.24 1.25e-08 ***
#> a0 7.07014 0.05877 120.31 2.55e-14 ***
#> a1 0.61313 0.05616 10.92 4.39e-06 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 70.82 on 8 degrees of freedom
#>
#> Number of iterations to convergence: 8
#> Achieved convergence tolerance: 2.64e-06
and we can see this fits well by generating a smooth set of predictions:
library(ggplot2)
new_df <- data.frame(month = seq(1, 12, 0.1))
new_df$elec_use_mean <- predict(model, new_df)
ggplot(cumulative, aes(month, elec_use_mean))
geom_point()
geom_line(data = new_df, linetype = 2)
scale_x_continuous(breaks = 1:12, labels = month.abb)