Home > Statistics > Nonlinear multilevel mixed-effects models

Nonlinear multilevel mixed-effects models

You have a model that is nonlinear in the parameters. Perhaps it is a model of tree growth and therefore asymptotes to a maximum value. Perhaps it is a model of serum concentrations of a drug that rise rapidly to a peak concentration and then decay exponentially. Easy enough, use nonlinear regression ([R] nl) to fit your model. But … what if you have repeated measures for each tree or repeated blood serum levels for each patient? You might want to account for the correlation within tree or patient. You might even believe that each tree has its own asymptotic growth. You need nonlinear mixed-effects models—also called nonlinear hierarchical models or nonlinear multilevel models.

The menl command, introduced in Stata 15, fits NLME models. Classical nonlinear models assume there is one observation per subject and that subjects are independent. You can think of NLME models as an extension of nonlinear models to the case where multiple measurements may be taken over a subject and these within-subject observations are usually correlated. You may also think of NLME models as a generalization of linear mixed-effects models where some or all random effects enter the model in a nonlinear fashion. Regardless of how you think of them, NLME models are used to describe a response variable as a (nonlinear) function of covariates, accounting for the correlation among observations on the same subject.

In this blog, I will walk you through several examples of fitting nonlinear mixed-effects models using menl. I will first fit a nonlinear model to the data on the growth of trees, ignoring the correlation among measurements on the same tree. I will then demonstrate different ways of accounting for this correlation and how to incorporate random effects into the model parameters to give the parameters tree-specific interpretation.

We have data (Draper and Smith 1998) on trunk circumference of five different orange trees where trunk circumference (in mm) was measured on seven different occasions, over roughly a four-year period of growth. We want to model the growth of orange trees. Let’s plot the data first.

. webuse orange
. twoway scatter circumf age, connect(L) ylabel(#6)

graph1

There is some variation in the growth curves, but the same overall shape is observed for all trees. In particular, the growth tends to level off toward the end. Pinheiro and Bates (2000) suggested the following nonlinear growth model for these data:

\[
{\mathsf{circumf}}_{ij}=\frac{\phi_{1}}{1+\exp\left\{ -\left( {\mathsf{age}}_{ij}-\phi_{2}\right)/\phi_{3}\right\}} + \epsilon_{ij}, \quad j=1,\dots,5; i=1,\dots,7
\]

Parameters of NLME models often have scientifically meaningful interpretations and form the basis of research questions. In this model, \(\phi_{1}\) is the average asymptotic trunk circumference of the trees as \(\mathsf{age}_{ij} \to \infty\), \(\phi_{2}\) is the average age at which the tree attains half the asymptotic trunk circumference \(\phi_{1}\) (half-life), and \(\phi_{3}\) is a scale parameter.

For now, let’s ignore the repercussions (less accurate estimates of uncertainty in parameter estimates, potentially less powerful tests of hypotheses, and less accurate interval estimates) of not accounting for the correlation and potential heterogeneity among the observations. Instead, we will treat all observations as being i.i.d.. We can use nl to fit the above model, but here we will use menl. (As of 15.1, menl no longer requires that you include random effects in the model specification.)

. menl circumf = {phi1}/(1+exp(-(age-{phi2})/{phi3})), stddeviations

Obtaining starting values:

NLS algorithm:

Iteration 0:   residual SS =  17480.234
Iteration 1:   residual SS =  17480.234

Computing standard errors:

Mixed-effects ML nonlinear regression           Number of obs     =         35
Log Likelihood = -158.39871
------------------------------------------------------------------------------
     circumf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       /phi1 |   192.6876   20.24411     9.52   0.000     153.0099    232.3653
       /phi2 |   728.7564   107.2984     6.79   0.000     518.4555    939.0573
       /phi3 |   353.5337   81.47184     4.34   0.000     193.8518    513.2156
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
                sd(Residual) |   22.34805   2.671102      17.68079    28.24734
------------------------------------------------------------------------------

The stddeviations option instructs menl to report the standard deviation of the error term instead of variance. Because one growth curve is used for all trees, the individual differences as depicted in the above graph are incorporated in the residuals, thus inflating the residual standard deviation.

We have multiple observations from the same tree that are likely to be correlated. One way to account for the dependence between observations within the same tree is by including a random effect, \(u_j\), shared by all the observations within the \(j\)th tree. Thus, our model becomes

$$
\mathsf{circumf}_{ij}=\frac{\phi_{1}}{1+\exp\left\{ -\left( \mathsf{age}_{ij}-\phi_{2}\right)/\phi_{3}\right\}} + u_j + \epsilon_{ij}, \quad j=1,\dots,5; i=1,\dots,7
$$

This model may be fit by typing

. menl circumf = {phi1}/(1+exp(-(age-{phi2})/{phi3}))+{U[tree]}, stddev

Obtaining starting values by EM:

Alternating PNLS/LME algorithm:

Iteration 1:    linearization log likelihood = -147.631786
Iteration 2:    linearization log likelihood = -147.631786

Computing standard errors:

Mixed-effects ML nonlinear regression           Number of obs     =         35
Group variable: tree                            Number of groups  =          5

                                                Obs per group:
                                                              min =          7
                                                              avg =        7.0
                                                              max =          7
Linearization log likelihood = -147.63179
------------------------------------------------------------------------------
     circumf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       /phi1 |   192.2526   17.06127    11.27   0.000     158.8131    225.6921
       /phi2 |   729.3642   68.05493    10.72   0.000      595.979    862.7494
       /phi3 |    352.405   58.25042     6.05   0.000     238.2363    466.5738
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
tree: Identity               |
                       sd(U) |   17.65093   6.065958      8.999985    34.61732
-----------------------------+------------------------------------------------
                sd(Residual) |    13.7099    1.76994      10.64497    17.65728
------------------------------------------------------------------------------

The fixed-effects estimates are similar in both models, but their standard errors are smaller in the above model. The residual standard deviation is also smaller because some of the individual differences are now being accounted for by the random effect. Let \(y_{sj}\) and \(y_{tj}\) be two observations on the \(j\)th tree. The inclusion of the random effect \(u_j\) implies that \({\rm Cov}(y_{sj},y_{tj}) > 0\) (in fact, it is equal to \({\rm Var}(u_j)\)). Therefore, \(u_j\) induces a positive correlation among any two observations within the same tree.

There is another way to account for the correlation among observations within a group. You can model the residual covariance structure explicitly using menl‘s rescovariance() option. For example, the above random-intercept nonlinear model is equivalent to a nonlinear marginal model with an exchangeable covariance matrix. Technically, the two models are equivalent only when the correlation of the within-group observations is positive.

. menl circumf = {phi1}/(1+exp(-(age-{phi2})/{phi3})),
    rescovariance(exchangeable, group(tree)) stddev

Obtaining starting values:

Alternating GNLS/ML algorithm:

Iteration 1:    log likelihood = -147.632441
Iteration 2:    log likelihood = -147.631786
Iteration 3:    log likelihood = -147.631786
Iteration 4:    log likelihood = -147.631786

Computing standard errors:

Mixed-effects ML nonlinear regression           Number of obs     =         35
Group variable: tree                            Number of groups  =          5

                                                Obs per group:
                                                              min =          7
                                                              avg =        7.0
                                                              max =          7
Log Likelihood = -147.63179
------------------------------------------------------------------------------
     circumf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       /phi1 |   192.2526   17.06127    11.27   0.000     158.8131    225.6921
       /phi2 |   729.3642   68.05493    10.72   0.000      595.979    862.7494
       /phi3 |    352.405   58.25042     6.05   0.000     238.2363    466.5738
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
Residual: Exchangeable       |
                          sd |   22.34987    4.87771      14.57155    34.28026
                        corr |   .6237137   .1741451      .1707327    .8590478
------------------------------------------------------------------------------

The fixed-effects portion of the output is identical in the two models. These two models also produce the same estimates of the marginal covariance matrix. You can verify this by running estat wcorrelation after each model. Although the two models are equivalent, by including a random effect on trees, we can not only account for the dependence of observations within trees, but also estimate between-tree variability and predict tree-specific effects after estimation.

The previous two models imply that the growth curves for all trees have the same shape and differ only by a vertical shift that equals \(u_j\). This may be too strong an assumption in practice. If we look back at the graph, we will notice there is an increasing variability in the trunk circumferences of trees as they approach their limiting height. Therefore, it may be more reasonable to allow \(\phi_1\) to vary between trees. This will also give our parameters of interest a tree-specific interpretation. That is, we assume

$$
\mathsf{circumf}_{ij}=\frac{\phi_{1}}{1+\exp\left\{ -\left( \mathsf{age}_{ij}-\phi_{2}\right)/\phi_{3}\right\}} + \epsilon_{ij}
$$

with

$$
\phi_1 = \phi_{1j} = \beta_1 + u_j
$$

You may interpret \(\beta_1\) as the average asymptotic height and \(u_j\) as a random deviation from the average that is specific to the \(j\)th tree. The above model may be fit as follows:

. menl circumf = ({b1}+{U1[tree]})/(1+exp(-(age-{phi2})/{phi3}))

Obtaining starting values by EM:

Alternating PNLS/LME algorithm:

Iteration 1:    linearization log likelihood = -131.584579

Computing standard errors:

Mixed-effects ML nonlinear regression           Number of obs     =         35
Group variable: tree                            Number of groups  =          5

                                                Obs per group:
                                                              min =          7
                                                              avg =        7.0
                                                              max =          7
Linearization log likelihood = -131.58458
------------------------------------------------------------------------------
     circumf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /b1 |    191.049   16.15403    11.83   0.000     159.3877    222.7103
       /phi2 |    722.556   35.15082    20.56   0.000     653.6616    791.4503
       /phi3 |   344.1624   27.14739    12.68   0.000     290.9545    397.3703
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
tree: Identity               |
                     var(U1) |   991.1514   639.4636      279.8776    3510.038
-----------------------------+------------------------------------------------
               var(Residual) |   61.56371   15.89568      37.11466    102.1184
------------------------------------------------------------------------------

If desired, we can additionally model the within-tree error covariance structure using, for example, an exchangeable structure:

. menl circumf = ({b1}+{U1[tree]})/(1+exp(-(age-{phi2})/{phi3})),
    rescovariance(exchangeable)

Obtaining starting values by EM:

Alternating PNLS/LME algorithm:

Iteration 1:    linearization log likelihood = -131.468559
Iteration 2:    linearization log likelihood = -131.470388
Iteration 3:    linearization log likelihood = -131.470791
Iteration 4:    linearization log likelihood = -131.470813
Iteration 5:    linearization log likelihood = -131.470813

Computing standard errors:

Mixed-effects ML nonlinear regression           Number of obs     =         35
Group variable: tree                            Number of groups  =          5

                                                Obs per group:
                                                              min =          7
                                                              avg =        7.0
                                                              max =          7
Linearization log likelihood = -131.47081
------------------------------------------------------------------------------
     circumf |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         /b1 |   191.2005   15.59015    12.26   0.000     160.6444    221.7566
       /phi2 |   721.5232   35.66132    20.23   0.000     651.6283    791.4182
       /phi3 |   344.3675   27.20839    12.66   0.000     291.0401     397.695
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
tree: Identity               |
                     var(U1) |   921.3895    582.735      266.7465    3182.641
-----------------------------+------------------------------------------------
Residual: Exchangeable       |
                         var |   54.85736   14.16704      33.06817    91.00381
                         cov |  -9.142893   2.378124     -13.80393   -4.481856
------------------------------------------------------------------------------

We did not specify option group() in the above because, in the presence of random effects, rescovariance() automatically determines the lowest-level group based on the specified random effects.

The NLME models we used so far are all linear in the random effect. In NLME models, random effects can enter the model nonlinearly, just like the fixed effects, and they often do. For instance, in addition to \(\phi_1\), we can let other parameters vary between trees and have their own random effects:

. menl circumf = {phi1:}/(1+exp(-(age-{phi2:})/{phi3:})),
    define(phi1:{b1}+{U1[tree]})
    define(phi2:{b2}+{U2[tree]})
    define(phi3:{b3}+{U3[tree]})
(output omitted)

The above model is too complicated for our data containing only five trees and is included purely for demonstration purposes. But from this example, you can see that the define() option is useful when you have a complicated nonlinear expression, and you would like to break it down into smaller pieces. Parameters specified within define() may also be predicted for each tree after estimation. For example, after we fit our model, we may wish to predict the asymptotic height, \(\phi_{1j}\), for each tree \(j\). Below, we are requesting to construct a variable named phi1 that contains the predicted values for the expression {phi1:}:

. predict (phi1 = {phi1:})
(output omitted)

If we had more trees, we could have also specified a covariance structure for these random effects as follows:

. menl circumf = {phi1:}/(1+exp(-(age-{phi2:})/{phi3:})),
    define(phi1:{b1}+{U1[tree]})
    define(phi2:{b2}+{U2[tree]})
    define(phi3:{b3}+{U3[tree]})
    covariance(U1 U2 U3, unstructured)
(output omitted)

I demonstrated only a few models in this article, but menl can do much more. For example, menl can incorporate higher levels of nesting such as three-level models (for example, repeated observations nested within trees and trees nested within planting zones). See [ME] menl for more examples.

References

Draper, N., and H. Smith. 1998. Applied Regression Analysis. 3rd ed. New York: Wiley.

Pinheiro, J. C., and D. M. Bates. 2000. Mixed-Effects Models in S and S-PLUS. New York: Springer.