Multiple equation models: Estimation and marginal effects using mlexp
We continue with the series of posts where we illustrate how to obtain correct standard errors and marginal effects for models with multiple steps. In this post, we estimate the marginal effects and standard errors for a hurdle model with two hurdles and a lognormal outcome using mlexp. mlexp allows us to estimate parameters for multiequation models using maximum likelihood. In the last post (Multiple equation models: Estimation and marginal effects using gsem), we used gsem to estimate marginal effects and standard errors for a hurdle model with two hurdles and an exponential mean outcome.
We exploit the fact that the hurdle-model likelihood is separable and the joint log likelihood is the sum of the individual hurdle and outcome log likelihoods. We estimate the parameters of each hurdle and the outcome separately to get initial values. Then, we use mlexp to estimate the parameters of the model and margins to obtain marginal effects.
Starting points: model and initial values
We model the amount spent on dental care. The first hurdle is whether an individual spends or does not spend on dental care. The second hurdle is the individual level of insurance coverage. Different levels of coverage lead to different spending on dental care.
We assume probit and ordered probit models for the two hurdles. In contrast to the previous post, we use a lognormal distribution to model the amount spent. With these distributional assumptions, we use maximum likelihood rather than quasi-likelihood, as in the previous post, to estimate the parameters of the model.
We obtain initial values by mlexp for each hurdle and the lognormal outcome and store the log-likelihood expression for each step in a local macro. These local macros are summed together in the final use of mlexp to get parameter estimates and standard errors.
spend is a binary outcome for whether an individual spends money on dental care. This is the first hurdle variable. We store the log-likelihood expression in the local macro probit. See Appendix for more information. Then, we use mlexp to estimate the parameters of the hurdle. The point estimates are stored in the matrix binit.
. local probit ln(cond(1.spend,normal({spend: x1 x2 x4 _cons}),   
>                 1-normal({spend:})))
. mlexp (`probit')
initial:       log likelihood = -6931.4718
alternative:   log likelihood = -5926.3721
rescale:       log likelihood = -5926.3721
Iteration 0:   log likelihood = -5926.3721
Iteration 1:   log likelihood = -4789.4607
Iteration 2:   log likelihood = -4776.9361
Iteration 3:   log likelihood = -4776.9332
Iteration 4:   log likelihood = -4776.9332
Maximum likelihood estimation
Log likelihood = -4776.9332                   Number of obs     =     10,000
----------------------------------------------------------------------------
           |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------+----------------------------------------------------------------
spend      |
        x1 |   .5138169   .0160374    32.04   0.000     .4823841    .5452498
        x2 |  -.5276993   .0224236   -23.53   0.000    -.5716488   -.4837498
        x4 |   .4822064   .0185374    26.01   0.000     .4458736    .5185391
     _cons |   .5568866     .02899    19.21   0.000     .5000672     .613706
----------------------------------------------------------------------------
. matrix binit = e(b)
insurance is a three-level ordered outcome indicating insurance level. This is the variable for the second hurdle. We store the log-likelihood expression in the local macro oprobit and use mlexp as before.
. local oprobit ln(cond(insurance==1,normal(-{insurance: x3 x4}+{cut1}),  
>                 cond(insurance==2,normal({cut2}-{insurance:})-          
>                         normal({cut1}-{insurance:}),                    
>                         1-normal({cut2}-{insurance:}))))
. mlexp (`oprobit')
initial:       log likelihood =     -  (could not be evaluated)
feasible:      log likelihood = -23924.936
rescale:       log likelihood = -19788.939
rescale eq:    log likelihood = -11884.962
Iteration 0:   log likelihood = -11884.962
Iteration 1:   log likelihood = -10261.611
Iteration 2:   log likelihood = -10227.115
Iteration 3:   log likelihood = -10226.993
Iteration 4:   log likelihood = -10226.993
Maximum likelihood estimation
Log likelihood = -10226.993                   Number of obs     =     10,000
----------------------------------------------------------------------------
           |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------+----------------------------------------------------------------
insurance  |
        x3 |   .2926129   .0100288    29.18   0.000     .2729569     .312269
        x4 |  -.2754986   .0144941   -19.01   0.000    -.3039066   -.2470906
-----------+----------------------------------------------------------------
     /cut1 |  -1.270488   .0255784   -49.67   0.000    -1.320621   -1.220355
     /cut2 |  -.2612825   .0235687   -11.09   0.000    -.3074763   -.2150887
----------------------------------------------------------------------------
. matrix binit = binit, e(b)
 
Now, we use mlexp to estimate the parameters of the lognormal outcome. spent corresponds to the amount spent on dental care. We use factor-variable notation to use a different intercept for each level of insurance, \(\beta_{ins,1}\ldots,\beta_{ins,3}\). The covariates are specified in equation spent, and the constant intercepts are specified in spent_int. The log-likelihood expression is saved in the local macro lognormal. We restrict estimation to the positive sample and use mlexp to estimate the outcome parameters.
. local lognormal -.5*((ln(spent)-{spent: x4 x5 x6} -     
>         {spent_int: ibn.insurance})/                    
>         (exp({lnsigma})))^2                             
>         -ln((spent*exp({lnsigma})*sqrt(2*_pi)))
. mlexp (`lognormal') if spend
initial:       log likelihood = -16596.787
alternative:   log likelihood = -16544.473
rescale:       log likelihood = -15515.652
rescale eq:    log likelihood = -14206.308
Iteration 0:   log likelihood = -14206.308
Iteration 1:   log likelihood =  -13818.45
Iteration 2:   log likelihood = -13520.664
Iteration 3:   log likelihood = -13519.085
Iteration 4:   log likelihood = -13519.084
Maximum likelihood estimation
Log likelihood = -13519.084                   Number of obs     =      7,228
----------------------------------------------------------------------------
           |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------+----------------------------------------------------------------
spent      |
        x4 |   .2953067   .0121253    24.35   0.000     .2715415    .3190718
        x5 |  -.2917276   .0088451   -32.98   0.000    -.3090637   -.2743915
        x6 |   .2891539   .0227041    12.74   0.000     .2446548    .3336531
-----------+----------------------------------------------------------------
spent_int  |
 insurance |
        1  |   .1924971   .0239883     8.02   0.000     .1454809    .2395134
        2  |   .2950143   .0214191    13.77   0.000     .2530336    .3369951
        3  |    .454055   .0202381    22.44   0.000      .414389    .4937209
-----------+----------------------------------------------------------------
  /lnsigma |  -.2953913   .0083172   -35.52   0.000    -.3116926   -.2790899
----------------------------------------------------------------------------
. matrix binit = binit, e(b)
Joint Estimation and marginal effects
Now, we use mlexp to estimate the parameters of the joint model. The joint log likelihood is specified as the sum of the individual log likelihoods. We merely add up the local macros that we created in the last section. The matrix binit contains the point estimates from the individual steps. We specify this matrix in from() to give mlexp good starting values. We specify vce(robust) so that we can use margins to estimate the marginal effects over the population of covariates using margins and vce(unconditional).
 mlexp (`probit' + `oprobit' + cond(spend,(`lognormal'),0)), 
>         vce(robust) from(binit)
Iteration 0:   log pseudolikelihood =  -28523.01
Iteration 1:   log pseudolikelihood =  -28523.01
Maximum likelihood estimation
Log pseudolikelihood =  -28523.01             Number of obs     =     10,000
----------------------------------------------------------------------------
           |               Robust
           |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------+----------------------------------------------------------------
spend      |
        x1 |    .513817   .0159257    32.26   0.000     .4826032    .5450307
        x2 |  -.5276994   .0224874   -23.47   0.000    -.5717739   -.4836249
        x4 |   .4822064   .0185804    25.95   0.000     .4457894    .5186234
     _cons |   .5568866   .0289611    19.23   0.000     .5001239    .6136494
-----------+----------------------------------------------------------------
insurance  |
        x3 |   .2926129   .0100343    29.16   0.000     .2729461    .3122798
        x4 |  -.2754986   .0144074   -19.12   0.000    -.3037366   -.2472605
-----------+----------------------------------------------------------------
cut1       |
     _cons |  -1.270488   .0254071   -50.01   0.000    -1.320285   -1.220691
-----------+----------------------------------------------------------------
cut2       |
     _cons |  -.2612825   .0236404   -11.05   0.000    -.3076167   -.2149482
-----------+----------------------------------------------------------------
spent      |
        x4 |   .2953067       .012    24.61   0.000      .271787    .3188263
        x5 |  -.2917276   .0088412   -33.00   0.000     -.309056   -.2743991
        x6 |   .2891539   .0224396    12.89   0.000     .2451731    .3331347
-----------+----------------------------------------------------------------
spent_int  |
 insurance |
        1  |   .1924972   .0242748     7.93   0.000     .1449194     .240075
        2  |   .2950143    .021299    13.85   0.000      .253269    .3367597
        3  |   .4540549   .0202593    22.41   0.000     .4143473    .4937625
-----------+----------------------------------------------------------------
  /lnsigma |  -.2953912   .0084227   -35.07   0.000    -.3118993    -.278883
----------------------------------------------------------------------------
The point estimates from the individual steps match the joint estimates. Now, we obtain marginal effects of the outcome covariates on the conditional mean of the outcome.
The conditional mean of expenditure on the independent variables is given by
\begin{eqnarray*}
E\left(\text{spent}|X\right) = \Phi(X_p{\boldsymbol \beta}_p)
\left[\begin{matrix}\Phi(\kappa_1-X_o{\boldsymbol \beta}_o)\exp{\beta}_{{ins},1} + \cr
\left\{\Phi(\kappa_2-X_o{\boldsymbol \beta}_o) – \Phi(\kappa_1-X_o{\boldsymbol \beta}_o)\right\} \exp{ \beta}_{{ins},2} + \cr \left\{1-\Phi(\kappa_2-X_o{\boldsymbol \beta}_o)\right\} \exp{ \beta}_{{ins},3}\end{matrix}\right] \exp\left(X_e{\boldsymbol \beta}_e + .5\sigma^2\right)\cr
\end{eqnarray*}
where \(\kappa_{1}\) and \(\kappa_{2}\) are the cutoff points from the insurance model, which represents insurance level.  We use the subscripts p, o, and e to emphasize that the covariates and coefficients related to the probit, ordered probit, and exponential mean are different.
Now, we use margins to estimate the marginal effects. We use the expression() option to write an expression for the expected value of amount spent.
. margins, expression(normal(xb(spend))*(                 
>         exp(_b[spent_int:1.insurance])*                 
>                 normal(_b[/cut1]-xb(insurance))+        
>         exp(_b[spent_int:2.insurance])*                 
>                 (normal(_b[/cut2]-xb(insurance)) -      
>                 normal(_b[/cut1]-xb(insurance)))+      
>         exp(_b[spent_int:3.insurance])*(                
>                 1-normal(_b[/cut2]-xb(insurance))))*    
>         exp(xb(spent)+.5*exp(_b[/lnsigma])^2))         
>         dydx(x4 x5 x6) vce(unconditional)
Average marginal effects                     Number of obs     =     10,000
Expression   : normal(xb(spend))*( exp(_b[spent_int:1.insurance])*
               normal(_b[/cut1]-xb(insurance))+
               exp(_b[spent_int:2.insurance])*
               (normal(_b[/cut2]-xb(insurance)) -
               normal(_b[/cut1]-xb(insurance)))+
               exp(_b[spent_int:3.insurance])*(
               1-normal(_b[/cut2]-xb(insurance))))*
               exp(xb(spent)+.5*exp(_b[/lnsigma])^2)
dy/dx w.r.t. : x4 x5 x6
----------------------------------------------------------------------------
           |            Unconditional
           |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------+----------------------------------------------------------------
        x4 |   .9525996   .0311351    30.60   0.000     .8915759    1.013623
        x5 |  -.6329119   .0224351   -28.21   0.000    -.6768838     -.58894
        x6 |   .6273282   .0499249    12.57   0.000     .5294772    .7251792
----------------------------------------------------------------------------
Final considerations
We illustrated how to use mlexp to obtain the estimates and standard errors for a multiple hurdle model and its marginal effects. In subsequent posts, we obtain these results for other multistep models using other Stata tools.
Appendix 1
Below is the code used to produce the data.
clear set seed 11134 set obs 10000 // Generating exogenous variables generate x1 = rnormal() generate x2 = int(3*rbeta(2,3)) generate x3 = rchi2(1)-2 generate x4 = ln(rchi2(4)) generate x5 = rnormal() generate x6 = rbeta(2,3)>.6 // Generating unobservables generate ep = rnormal() // for probit generate eo = rnormal() // for ordered probit generate e = rnormal()*.75 // for lognormal equation // Generating linear predictions generate xbp = .5*(1 + x1 - x2 + x4) generate xbo = .3*(1 + x3 - x4) generate yotemp = xbo + eo generate insurance = yotemp replace insurance = 1 if yotemp < -1 replace insurance = 2 if yotemp> -1 & yotemp<0 replace insurance = 3 if yotemp> 0 generate xbe = .3*(- x5 + x6 + x4 + .5*insurance) // Generating outcomes generate spend = xbp + ep > 0 generate yexp = exp(xbe + e) generate spent = spend*yexp
Appendix 2
Macros are a programming feature provided by Stata. They can be used to make complicated expressions easy to write. Everywhere a punctuated macro name appears in a command, the macro contents are substituted for the macro name. So a complicated expression can be stored in a macro with a short name, and the expression can be used repeatedly by only typing the punctuated short name. See Programming an estimation command in Stata: Global macros versus local macros for a discussion of macros.
In this blog, we use local macros to store expressions used in calculating the log likelihood of our model. For the probit model, we stored the log-likelihood expression in the local macro probit.
. local probit 
> ln(cond(1.spend,normal({spend: x1 x2 x4 _cons}),1-normal({spend:})))
When we type the punctuated `probit’ later, Stata uses the expression stored inside probit. We can see the expression by displaying it.
. display "`probit'"
ln(cond(1.spend,normal({spend: x1 x2 x4 _cons}),1-normal({spend:})))