Estimating covariate effects after gmm
In Stata 14.2, we added the ability to use margins to estimate covariate effects after gmm. In this post, I illustrate how to use margins and marginsplot after gmm to estimate covariate effects for a probit model.
Margins are statistics calculated from the predictions of a previously fit model, evaluated at fixed values of some covariates while averaging or otherwise integrating over the remaining covariates. They can be used to estimate population-averaged parameters such as the marginal mean, the average treatment effect, or the average effect of a covariate on the conditional mean. I will demonstrate how margins is useful after estimating a model with the generalized method of moments.
Probit model
For binary outcome \(y_i\) and regressors \({\bf x}_i\), the probit model assumes
\begin{equation}
y_i = {\bf 1}({\bf x}_i{\boldsymbol \beta} + \epsilon_i > 0) \nonumber
\end{equation}
where the error \(\epsilon_i\) is standard normal. The indicator function \({\bf 1}(\cdot)\) outputs 1 when its input is true and outputs 0 otherwise.
The conditional mean of \(y_i\) is
\begin{equation}
E(y_i\vert{\bf x}_i) = \Phi({\bf x}_i{\boldsymbol \beta}) \nonumber
\end{equation}
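This expression follows from the model because the error is standard normal and symmetric about zero:
\begin{equation}
E(y_i\vert{\bf x}_i) = \Pr({\bf x}_i{\boldsymbol \beta} + \epsilon_i > 0\vert{\bf x}_i) = \Pr(\epsilon_i > -{\bf x}_i{\boldsymbol \beta}\vert{\bf x}_i) = \Phi({\bf x}_i{\boldsymbol \beta}) \nonumber
\end{equation}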
We can use the generalized method of moments (GMM) to estimate \({\boldsymbol \beta}\), with sample moment conditions
\begin{equation}
\sum_{i=1}^N \left[\left\{ y_i \frac{\phi\left({\bf x}_i{\boldsymbol \beta}\right)}{\Phi\left({\bf x}_i{\boldsymbol \beta}\right)} - (1-y_i)
\frac{\phi\left({\bf x}_i{\boldsymbol \beta}\right)}{\Phi\left(-{\bf x}_i{\boldsymbol \beta}\right)}\right\} {\bf x}_i\right] ={\bf 0} \nonumber
\end{equation}
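These are the score equations of the probit log likelihood; the summand is the derivative, with respect to \({\boldsymbol \beta}\), of the log likelihood of observation \(i\),
\begin{equation}
\ell_i({\boldsymbol \beta}) = y_i \ln \Phi\left({\bf x}_i{\boldsymbol \beta}\right) + (1-y_i)\ln \Phi\left(-{\bf x}_i{\boldsymbol \beta}\right) \nonumber
\end{equation}
Because we have exactly as many moments as parameters, the GMM point estimates coincide with the maximum-likelihood probit estimates.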
In addition to the model parameters \({\boldsymbol \beta}\), we may also be interested in how the mean of \(y_i\) changes as we change one of the covariates in \({\bf x}_i\). How do individuals that differ only in the value of one of the regressors compare?
Suppose we want to compare differences in the regressor \(x_{ij}\). The vector \({\bf x}_{i}^{\star}\) is \({\bf x}_{i}\) with the \(j\)th regressor \(x_{ij}\) replaced by \(x_{ij}+1\).
The effect of a unit change in \(x_{ij}\) on \(y_i\) at \({\bf x}_i\) is
\begin{eqnarray*}
E(y_i\vert {\bf x}_i^{\star}) - E(y_i\vert {\bf x}_i) = \Phi\left({\bf x}_i^{\star}{\boldsymbol \beta}\right) - \Phi\left({\bf x}_i{\boldsymbol \beta}\right)
\end{eqnarray*}
If we wanted to estimate the mean of this effect over the population, we could add the following sample moment condition to our GMM estimation
\begin{equation}
\sum_{i=1}^N \left\{\delta - \left[ \Phi\left({\bf x}_i^{\star}{\boldsymbol \beta}\right) - \Phi\left({\bf x}_i{\boldsymbol \beta}\right)\right]\right\} = 0 \nonumber
\end{equation}
This condition implies
\begin{equation}
\delta = \frac{1}{N}\sum_{i=1}^N \left[\Phi\left({\bf x}_i^{\star}{\boldsymbol \beta}\right) - \Phi\left({\bf x}_i{\boldsymbol \beta}\right)\right] \nonumber
\end{equation}
Rather than using the condition for \(\delta\) in the GMM estimation, we can directly calculate the sample average of the effect after estimation.
\begin{equation}
\hat{\delta} = \frac{1}{N} \sum_{i=1}^N \left[\Phi\left({\bf x}_i^{\star}\widehat{\boldsymbol \beta}\right) - \Phi\left({\bf x}_i\widehat{\boldsymbol \beta}\right)\right] \nonumber
\end{equation}
The standard error of this mean effect must account for the fact that \({\boldsymbol \beta}\) is estimated. We can use gmm to estimate \({\boldsymbol \beta}\) and then use margins to estimate \(\delta\) with a properly adjusted standard error. This approach is flexible: we estimate the model once, using only the moment conditions for \({\boldsymbol \beta}\), and can then estimate as many margins as we like.
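As a sketch of why the adjustment is needed, treating the covariates as fixed, the delta method gives
\begin{equation}
\widehat{\rm Var}\left(\hat{\delta}\right) \approx \widehat{\bf G}\, \widehat{\rm Var}\left(\widehat{\boldsymbol \beta}\right) \widehat{\bf G}', \qquad
\widehat{\bf G} = \frac{1}{N}\sum_{i=1}^N \left[\phi\left({\bf x}_i^{\star}\widehat{\boldsymbol \beta}\right){\bf x}_i^{\star} - \phi\left({\bf x}_i\widehat{\boldsymbol \beta}\right){\bf x}_i\right] \nonumber
\end{equation}
With vce(unconditional), margins additionally accounts for the sampling variation that arises because the covariates over which we average are themselves drawn from the population.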
Covariate effects
Using simulated data, we estimate the mean covariate effects for a probit model with gmm and margins. We regress the binary \(y_i\) on the binary \(d_i\) and the continuous \(x_i\) and \(z_i\). A quadratic term for \(x_i\) is included in the model, and both powers of \(x_i\) and \(z_i\) are interacted with \(d_i\).
First, we use gmm to estimate \({\boldsymbol \beta}\). Factor-variable notation is used to specify the quadratic power of \(x_i\) and the interactions of the powers of \(x_i\) and \(z_i\) with \(d_i\).
. gmm (cond(y,normalden({y: i.d##(c.x c.x#c.x c.z) i.d _cons})/
>     normal({y:}),-normalden({y:})/normal(-{y:}))),
>     instruments(i.d##(c.x c.x#c.x c.z) i.d) onestep

Step 1
Iteration 0:   GMM criterion Q(b) =  .26129294
Iteration 1:   GMM criterion Q(b) =  .01621062
Iteration 2:   GMM criterion Q(b) =  .00206357
Iteration 3:   GMM criterion Q(b) =  .00033537
Iteration 4:   GMM criterion Q(b) =  4.916e-06
Iteration 5:   GMM criterion Q(b) =  1.539e-08
Iteration 6:   GMM criterion Q(b) =  3.361e-13
note: model is exactly identified

GMM estimation

Number of parameters =   8
Number of moments    =   8
Initial weight matrix: Unadjusted               Number of obs     =      5,000

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         1.d |   1.752056   .0987097    17.75   0.000     1.558588    1.945523
           x |   .2209241   .0311227     7.10   0.000     .1599247    .2819235
             |
     c.x#c.x |  -.2864622   .0199842   -14.33   0.000    -.3256305   -.2472939
             |
           z |  -.6813765   .0558371   -12.20   0.000    -.7908152   -.5719379
             |
       d#c.x |
          1  |    .311213   .0543018     5.73   0.000     .2047835    .4176426
             |
   d#c.x#c.x |
          1  |  -.7297855   .0513903   -14.20   0.000    -.8305086   -.6290624
             |
       d#c.z |
          1  |  -.4272026   .0807842    -5.29   0.000    -.5855368   -.2688684
             |
       _cons |   .1180114   .0520303     2.27   0.023     .0160339    .2199888
------------------------------------------------------------------------------
Instruments for equation 1: 0b.d 1.d x c.x#c.x z 0b.d#co.x 1.d#c.x
    0b.d#co.x#co.x 1.d#c.x#c.x 0b.d#co.z 1.d#c.z _cons
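Because the moment conditions are the probit score equations and the model is exactly identified, the point estimates should match those from maximum-likelihood probit. As a quick check, the same fit could be obtained by maximum likelihood with

. probit y i.d##(c.x c.x#c.x c.z), vce(robust)

which should reproduce the coefficients above, with essentially the same robust standard errors.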
Now, we use margins to estimate the mean effect of changing \(x_i\) to \(x_i+1\). We specify vce(unconditional) to estimate the mean effect over the population of \(x_i\), \(z_i\), and \(d_i\). The normal probability expression is specified in the expression() option, where the expression function xb() returns the linear prediction. We specify the at(generate()) option and atcontrast(r) within the contrast() option so that the expression at \(x_i\) is subtracted from the expression at \(x_i+1\). nowald is specified to suppress the Wald test of the contrast.
. margins, at(x=generate(x)) at(x=generate(x+1)) vce(unconditional)
>     expression(normal(xb())) contrast(atcontrast(r) nowald)

Contrasts of predictive margins

Expression   : normal(xb())
1._at        : x               = x
2._at        : x               = x+1

--------------------------------------------------------------
             |            Unconditional
             |   Contrast   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
         _at |
   (2 vs 1)  |  -.0108121   .0040241     -.0186993    -.002925
--------------------------------------------------------------
Unit changes are particularly useful for evaluating the effect of discrete covariates. When a discrete covariate is specified using factor-variable notation, we can use contrast notation in margins to estimate the covariate effect.
We estimate the mean effect of changing from \(d_i=0\) to \(d_i=1\) over the population of covariates with margins. We specify the contrast r.d and the conditional mean in the expression() option. The expression will be evaluated at \(d_i=0\) and then subtracted from the expression evaluated at \(d_i=1\). We specify contrast(nowald) to suppress the Wald test of the contrast.
. margins r.d, expression(normal(xb())) vce(unconditional) contrast(nowald)

Contrasts of predictive margins

Expression   : normal(xb())

--------------------------------------------------------------
             |            Unconditional
             |   Contrast   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
           d |
   (1 vs 0)  |   .1370625   .0093206      .1187945    .1553305
--------------------------------------------------------------
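In the notation used earlier, writing \({\bf x}_i^{(d=1)}\) and \({\bf x}_i^{(d=0)}\) for the covariate vector with \(d_i\) set to 1 and to 0 (with the interaction terms updated accordingly), this contrast estimates
\begin{equation}
\hat{\delta}_d = \frac{1}{N}\sum_{i=1}^N \left[\Phi\left({\bf x}_i^{(d=1)}\widehat{\boldsymbol \beta}\right) - \Phi\left({\bf x}_i^{(d=0)}\widehat{\boldsymbol \beta}\right)\right] \nonumber
\end{equation}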
So on average over the population, changing from \(d_i=0\) to \(d_i=1\) and keeping other covariates constant will increase the probability of success by 0.14.
Graphing covariate effects
We have used margins to estimate the mean covariate effect over the population of covariates. We can also use margins to estimate covariate effects at fixed values of the other covariates or to average the covariate effect over certain covariates while fixing others. We may examine multiple effects to find a pattern. The marginsplot command graphs effects estimated by margins and can be helpful in these situations.
Suppose we wanted to see how the effect of a unit change in \(d_i\) varied over \(x_i\). We can use margins with the at() option to estimate the effect at different values of \(x_i\), averaged over the other covariates. We suppress the legend of fixed covariate values by specifying noatlegend.
. margins r.d, at(x = (-1 -.5 0 .5 1 1.5 2))
>     expression(normal(xb())) noatlegend
>     vce(unconditional) contrast(nowald)

Contrasts of predictive margins

Expression   : normal(xb())

--------------------------------------------------------------
             |            Unconditional
             |   Contrast   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       d@_at |
 (1 vs 0) 1  |   .1536608   .0230032      .1085753    .1987462
 (1 vs 0) 2  |   .3446265   .0184594      .3084468    .3808062
 (1 vs 0) 3  |   .3907978    .017575      .3563515    .4252441
 (1 vs 0) 4  |   .3802466    .017735      .3454866    .4150066
 (1 vs 0) 5  |   .3166307   .0189175      .2795531    .3537083
 (1 vs 0) 6  |   .1182164   .0252829      .0686628      .16777
 (1 vs 0) 7  |  -.1053685   .0193225     -.1432399   -.0674971
--------------------------------------------------------------
The marginsplot command will graph these results for us.
. marginsplot

Variables that uniquely identify margins: x
So the effect increases over small \(x_i\) and decreases as \(x_i\) grows large. We can use margins and marginsplot again to examine the conditional means at different values of \(x_i\). This time, we specify the over() option so that separate predictions are made for \(d_i=1\) and \(d_i=0\). We expect to see the lines cross at a certain point, as the covariate effect crossed zero in the previous plot.
. margins, at(x = (-1 -.5 0 .5 1 1.5 2)) over(d)
>     expression(normal(xb())) noatlegend
>     vce(unconditional)

Predictive margins                              Number of obs     =      5,000

Expression   : normal(xb())
over         : d

------------------------------------------------------------------------------
             |            Unconditional
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _at#d |
        1 0  |   .2117609    .013226    16.01   0.000     .1858384    .2376834
        1 1  |   .3602933   .0194334    18.54   0.000     .3222046    .3983819
        2 0  |   .3042454   .0135516    22.45   0.000     .2776847    .3308061
        2 1  |   .6421908   .0142493    45.07   0.000     .6142627    .6701188
        3 0  |   .3616899   .0145202    24.91   0.000     .3332308     .390149
        3 1  |   .7455187   .0121556    61.33   0.000     .7216941    .7693432
        4 0  |      .3743    .014849    25.21   0.000     .3451965    .4034035
        4 1  |   .7475205   .0120208    62.19   0.000     .7239601    .7710809
        5 0  |   .3406649    .014334    23.77   0.000     .3125709     .368759
        5 1  |   .6504099   .0141534    45.95   0.000     .6226699      .67815
        6 0  |   .2651493   .0141303    18.76   0.000     .2374545    .2928442
        6 1  |   .3777989   .0215394    17.54   0.000     .3355825    .4200153
        7 0  |   .1642177    .014551    11.29   0.000     .1356982    .1927372
        7 1  |   .0565962   .0128524     4.40   0.000      .031406    .0817864
------------------------------------------------------------------------------

. marginsplot

Variables that uniquely identify margins: x d
We see that the conditional means for \(d_{i}=0\) rise above the means for \(d_{i}=1\) at roughly \(x_i = 1.75\).
Differential effects
Instead of a unit change, we may be interested in the differential effect. This is the effect on the mean of a small change in the covariate, normalized by the size of the change: the derivative of the mean with respect to the covariate \(x_{ij}\). This is called the marginal or partial effect of \(x_{ij}\) on \(E(y_i\vert {\bf x}_i)\). See section 2.2.5 of Wooldridge (2010), section 5.2.4 of Cameron and Trivedi (2005), or section 10.6 of Cameron and Trivedi (2010) for more details. With margins, we can estimate the partial effect at fixed values of the regressors or the mean partial effect over the population or sample.
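For the probit conditional mean, the chain rule gives the partial effect of a continuous regressor as the derivative of the linear index scaled by the normal density,
\begin{equation}
\frac{\partial E(y_i\vert{\bf x}_i)}{\partial x_{ij}} = \phi\left({\bf x}_i{\boldsymbol \beta}\right)\frac{\partial \left({\bf x}_i{\boldsymbol \beta}\right)}{\partial x_{ij}} \nonumber
\end{equation}
When \(x_{ij}\) enters the index linearly, the second factor is simply \(\beta_j\); in our model, the derivative of the index with respect to \(x_i\) also involves the quadratic and interaction coefficients.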
We will use margins to estimate the mean marginal effects for the continuous covariates over the population of covariates. margins will take the derivatives for us if we specify dydx(). We only need to specify the form of the prediction. We again use the expression() option for this purpose.
. margins, expression(normal(xb())) vce(unconditional) dydx(x z)

Average marginal effects                        Number of obs     =      5,000

Expression   : normal(xb())
dy/dx w.r.t. : x z

------------------------------------------------------------------------------
             |            Unconditional
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .0121859   .0045472     2.68   0.007     .0032736    .0210983
           z |  -.1439682   .0062519   -23.03   0.000    -.1562217   -.1317146
------------------------------------------------------------------------------
Conclusion
In this post, I have demonstrated how to use margins after gmm to estimate covariate effects for probit models. I also demonstrated how marginsplot can be used to graph covariate effects.
In future posts, we will use margins and marginsplot after gmm freely. This lets us estimate the effects we care about without complicating the GMM estimation with additional moment conditions.
Appendix 1
The following code was used to generate the probit regression data.
. set seed 34

. quietly set obs 5000

. generate double x = 2*rnormal() + .1

. generate byte d = runiform() > .5

. generate double z = rchi2(1)

. generate double y = .2*x +.3*d*x - .3*(x^2) -.7*d*(x^2)
>     -.8*z -.2*z*d + .2 + 1.5*d + rnormal() > 0
References
Cameron, A. C., and P. K. Trivedi. 2005. Microeconometrics: Methods and Applications. New York: Cambridge University Press.
——. 2010. Microeconometrics Using Stata. Rev. ed. College Station, TX: Stata Press.
Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data. 2nd ed. Cambridge, MA: MIT Press.