Estimating covariate effects after gmm
In Stata 14.2, we added the ability to use margins to estimate covariate effects after gmm. In this post, I illustrate how to use margins and marginsplot after gmm to estimate covariate effects for a probit model.
Margins are statistics calculated from the predictions of a previously fit model, evaluated at fixed values of some covariates while averaging or otherwise integrating over the remaining covariates. They can be used to estimate population-averaged parameters such as the marginal mean, the average treatment effect, or the average effect of a covariate on the conditional mean. I will demonstrate how margins is useful after estimating a model with the generalized method of moments.
Probit model
For binary outcome \(y_i\) and regressors \({\bf x}_i\), the probit model assumes
\begin{equation}
y_i = {\bf 1}({\bf x}_i{\boldsymbol \beta} + \epsilon_i > 0) \nonumber
\end{equation}
where the error \(\epsilon_i\) is standard normal. The indicator function \({\bf 1}(\cdot)\) outputs 1 when its input is true and outputs 0 otherwise.
The conditional mean of \(y_i\) is
\begin{equation}
E(y_i\vert{\bf x}_i) = \Phi({\bf x}_i{\boldsymbol \beta}) \nonumber
\end{equation}
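This expression follows from the model because the error is standard normal and symmetric about zero:
\begin{equation}
E(y_i\vert{\bf x}_i) = \Pr({\bf x}_i{\boldsymbol \beta} + \epsilon_i > 0\vert{\bf x}_i) = \Pr(\epsilon_i > -{\bf x}_i{\boldsymbol \beta}\vert{\bf x}_i) = \Phi({\bf x}_i{\boldsymbol \beta}) \nonumber
\end{equation}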
We can use the generalized method of moments (GMM) to estimate \({\boldsymbol \beta}\), with sample moment conditions
\begin{equation}
\sum_{i=1}^N \left[\left\{ y_i \frac{\phi\left({\bf x}_i{\boldsymbol \beta}\right)}{\Phi\left({\bf x}_i{\boldsymbol \beta}\right)} - (1-y_i)
\frac{\phi\left({\bf x}_i{\boldsymbol \beta}\right)}{\Phi\left(-{\bf x}_i{\boldsymbol \beta}\right)}\right\} {\bf x}_i\right] ={\bf 0} \nonumber
\end{equation}
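These are the score equations of the probit log likelihood; the summand is the derivative, with respect to \({\boldsymbol \beta}\), of the log likelihood of observation \(i\),
\begin{equation}
\ell_i({\boldsymbol \beta}) = y_i \ln \Phi\left({\bf x}_i{\boldsymbol \beta}\right) + (1-y_i)\ln \Phi\left(-{\bf x}_i{\boldsymbol \beta}\right) \nonumber
\end{equation}
Because we have exactly as many moments as parameters, the GMM point estimates coincide with the maximum-likelihood probit estimates.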
In addition to the model parameters \({\boldsymbol \beta}\), we may also be interested in how the mean of \(y_i\) changes as we change one of the covariates in \({\bf x}_i\). How do individuals that differ only in the value of one of the regressors compare?
Suppose we want to compare differences in the regressor \(x_{ij}\). The vector \({\bf x}_{i}^{\star}\) is \({\bf x}_{i}\) with the \(j\)th regressor \(x_{ij}\) replaced by \(x_{ij}+1\).
The effect of a unit change in \(x_{ij}\) on \(y_i\) at \({\bf x}_i\) is
\begin{eqnarray*}
E(y_i\vert {\bf x}_i^{\star}) - E(y_i\vert {\bf x}_i) = \Phi\left({\bf x}_i^{\star}{\boldsymbol \beta}\right) - \Phi\left({\bf x}_i{\boldsymbol \beta}\right)
\end{eqnarray*}
If we wanted to estimate the mean of this effect over the population, we could add the following sample moment condition to our GMM estimation
\begin{equation}
\sum_{i=1}^N \left\{\delta - \left[ \Phi\left({\bf x}_i^{\star}{\boldsymbol \beta}\right) - \Phi\left({\bf x}_i{\boldsymbol \beta}\right)\right]\right\} = 0 \nonumber
\end{equation}
This condition implies
\begin{equation}
\delta = \frac{1}{N}\sum_{i=1}^N \left[\Phi\left({\bf x}_i^{\star}{\boldsymbol \beta}\right) - \Phi\left({\bf x}_i{\boldsymbol \beta}\right)\right] \nonumber
\end{equation}
Rather than using the condition for \(\delta\) in the GMM estimation, we can directly calculate the sample average of the effect after estimation.
\begin{equation}
\hat{\delta} = \frac{1}{N} \sum_{i=1}^N \left[\Phi\left({\bf x}_i^{\star}\widehat{\boldsymbol \beta}\right) - \Phi\left({\bf x}_i\widehat{\boldsymbol \beta}\right)\right] \nonumber
\end{equation}
The standard error of this mean effect must account for the fact that \({\boldsymbol \beta}\) is estimated. We can use gmm to estimate \({\boldsymbol \beta}\) and then use margins to estimate \(\delta\) with a properly adjusted standard error. This approach is flexible: we estimate the model once, using only the moment conditions for \({\boldsymbol \beta}\), and can then estimate as many margins as we like.
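As a sketch of why the adjustment is needed, treating the covariates as fixed, the delta method gives
\begin{equation}
\widehat{\rm Var}\left(\hat{\delta}\right) \approx \widehat{\bf G}\, \widehat{\rm Var}\left(\widehat{\boldsymbol \beta}\right) \widehat{\bf G}', \qquad
\widehat{\bf G} = \frac{1}{N}\sum_{i=1}^N \left[\phi\left({\bf x}_i^{\star}\widehat{\boldsymbol \beta}\right){\bf x}_i^{\star} - \phi\left({\bf x}_i\widehat{\boldsymbol \beta}\right){\bf x}_i\right] \nonumber
\end{equation}
With vce(unconditional), margins additionally accounts for the sampling variation that arises because the covariates over which we average are themselves drawn from the population.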
Covariate effects
Using simulated data, we estimate the mean covariate effects for a probit model with gmm and margins. We regress the binary \(y_i\) on the binary \(d_i\) and the continuous \(x_i\) and \(z_i\). A quadratic term for \(x_i\) is included in the model, and both powers of \(x_i\) and \(z_i\) are interacted with \(d_i\).
First, we use gmm to estimate \({\boldsymbol \beta}\). Factor-variable notation is used to specify the quadratic power of \(x_i\) and the interactions of the powers of \(x_i\) and \(z_i\) with \(d_i\).
. gmm (cond(y,normalden({y: i.d##(c.x c.x#c.x c.z) i.d _cons})/
>     normal({y:}),-normalden({y:})/normal(-{y:}))),
>     instruments(i.d##(c.x c.x#c.x c.z) i.d) onestep

Step 1
Iteration 0:   GMM criterion Q(b) =  .26129294
Iteration 1:   GMM criterion Q(b) =  .01621062
Iteration 2:   GMM criterion Q(b) =  .00206357
Iteration 3:   GMM criterion Q(b) =  .00033537
Iteration 4:   GMM criterion Q(b) =  4.916e-06
Iteration 5:   GMM criterion Q(b) =  1.539e-08
Iteration 6:   GMM criterion Q(b) =  3.361e-13
note: model is exactly identified

GMM estimation

Number of parameters =   8
Number of moments    =   8
Initial weight matrix: Unadjusted               Number of obs     =      5,000

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         1.d |   1.752056   .0987097    17.75   0.000     1.558588    1.945523
           x |   .2209241   .0311227     7.10   0.000     .1599247    .2819235
             |
     c.x#c.x |  -.2864622   .0199842   -14.33   0.000    -.3256305   -.2472939
             |
           z |  -.6813765   .0558371   -12.20   0.000    -.7908152   -.5719379
             |
       d#c.x |
          1  |    .311213   .0543018     5.73   0.000     .2047835    .4176426
             |
   d#c.x#c.x |
          1  |  -.7297855   .0513903   -14.20   0.000    -.8305086   -.6290624
             |
       d#c.z |
          1  |  -.4272026   .0807842    -5.29   0.000    -.5855368   -.2688684
             |
       _cons |   .1180114   .0520303     2.27   0.023     .0160339    .2199888
------------------------------------------------------------------------------
Instruments for equation 1: 0b.d 1.d x c.x#c.x z 0b.d#co.x 1.d#c.x
    0b.d#co.x#co.x 1.d#c.x#c.x 0b.d#co.z 1.d#c.z _cons
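Because the moment conditions are the probit score equations and the model is exactly identified, the point estimates should match those from maximum-likelihood probit. As a quick check, the same fit could be obtained by maximum likelihood with

. probit y i.d##(c.x c.x#c.x c.z), vce(robust)

which should reproduce the coefficients above, with essentially the same robust standard errors.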
Now, we use margins to estimate the mean effect of changing \(x_i\) to \(x_i+1\). We specify vce(unconditional) to estimate the mean effect over the population of \(x_i\), \(z_i\), and \(d_i\). The normal probability expression is specified in the expression() option, where the expression function xb() returns the linear prediction. We specify the at(generate()) option and atcontrast(r) within the contrast() option so that the expression at \(x_i\) is subtracted from the expression at \(x_i+1\). nowald is specified to suppress the Wald test of the contrast.
. margins, at(x=generate(x)) at(x=generate(x+1)) vce(unconditional)
>     expression(normal(xb())) contrast(atcontrast(r) nowald)

Contrasts of predictive margins

Expression   : normal(xb())
1._at        : x               = x
2._at        : x               = x+1

--------------------------------------------------------------
             |            Unconditional
             |   Contrast   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
         _at |
   (2 vs 1)  |  -.0108121   .0040241     -.0186993    -.002925
--------------------------------------------------------------
Unit changes are particularly useful for evaluating the effect of discrete covariates. When a discrete covariate is specified using factor-variable notation, we can use contrast notation in margins to estimate the covariate effect.
We estimate the mean effect of changing from \(d_i=0\) to \(d_i=1\) over the population of covariates with margins. We specify the contrast r.d and the conditional mean in the expression() option. The expression will be evaluated at \(d_i=0\) and then subtracted from the expression evaluated at \(d_i=1\). We specify contrast(nowald) to suppress the Wald test of the contrast.
. margins r.d, expression(normal(xb())) vce(unconditional) contrast(nowald)

Contrasts of predictive margins

Expression   : normal(xb())

--------------------------------------------------------------
             |            Unconditional
             |   Contrast   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
           d |
   (1 vs 0)  |   .1370625   .0093206      .1187945    .1553305
--------------------------------------------------------------
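In the notation used earlier, writing \({\bf x}_i^{(d=1)}\) and \({\bf x}_i^{(d=0)}\) for the covariate vector with \(d_i\) set to 1 and to 0 (with the interaction terms updated accordingly), this contrast estimates
\begin{equation}
\hat{\delta}_d = \frac{1}{N}\sum_{i=1}^N \left[\Phi\left({\bf x}_i^{(d=1)}\widehat{\boldsymbol \beta}\right) - \Phi\left({\bf x}_i^{(d=0)}\widehat{\boldsymbol \beta}\right)\right] \nonumber
\end{equation}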
So on average over the population, changing from \(d_i=0\) to \(d_i=1\) and keeping other covariates constant will increase the probability of success by 0.14.
Graphing covariate effects
We have used margins to estimate the mean covariate effect over the population of covariates. We can also use margins to estimate covariate effects at fixed values of the other covariates or to average the covariate effect over certain covariates while fixing others. We may examine multiple effects to find a pattern. The marginsplot command graphs effects estimated by margins and can be helpful in these situations.
Suppose we wanted to see how the effect of a unit change in \(d_i\) varied over \(x_i\). We can use margins with the at() option to estimate the effect at different values of \(x_i\), averaged over the other covariates. We suppress the legend of fixed covariate values by specifying noatlegend.
. margins r.d, at(x = (-1 -.5 0 .5 1 1.5 2))
>     expression(normal(xb())) noatlegend
>     vce(unconditional) contrast(nowald)

Contrasts of predictive margins

Expression   : normal(xb())

--------------------------------------------------------------
             |            Unconditional
             |   Contrast   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       d@_at |
 (1 vs 0) 1  |   .1536608   .0230032      .1085753    .1987462
 (1 vs 0) 2  |   .3446265   .0184594      .3084468    .3808062
 (1 vs 0) 3  |   .3907978    .017575      .3563515    .4252441
 (1 vs 0) 4  |   .3802466    .017735      .3454866    .4150066
 (1 vs 0) 5  |   .3166307   .0189175      .2795531    .3537083
 (1 vs 0) 6  |   .1182164   .0252829      .0686628      .16777
 (1 vs 0) 7  |  -.1053685   .0193225     -.1432399   -.0674971
--------------------------------------------------------------
The marginsplot command will graph these results for us.
. marginsplot

Variables that uniquely identify margins: x
So the effect increases over small \(x_i\) and decreases as \(x_i\) grows large. We can use margins and marginsplot again to examine the conditional means at different values of \(x_i\). This time, we specify the over() option so that separate predictions are made for \(d_i=1\) and \(d_i=0\). We expect to see the lines cross at a certain point, as the covariate effect crossed zero in the previous plot.
. margins, at(x = (-1 -.5 0 .5 1 1.5 2)) over(d)
>     expression(normal(xb())) noatlegend
>     vce(unconditional)

Predictive margins                              Number of obs     =      5,000

Expression   : normal(xb())
over         : d

------------------------------------------------------------------------------
             |            Unconditional
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _at#d |
        1 0  |   .2117609    .013226    16.01   0.000     .1858384    .2376834
        1 1  |   .3602933   .0194334    18.54   0.000     .3222046    .3983819
        2 0  |   .3042454   .0135516    22.45   0.000     .2776847    .3308061
        2 1  |   .6421908   .0142493    45.07   0.000     .6142627    .6701188
        3 0  |   .3616899   .0145202    24.91   0.000     .3332308     .390149
        3 1  |   .7455187   .0121556    61.33   0.000     .7216941    .7693432
        4 0  |      .3743    .014849    25.21   0.000     .3451965    .4034035
        4 1  |   .7475205   .0120208    62.19   0.000     .7239601    .7710809
        5 0  |   .3406649    .014334    23.77   0.000     .3125709     .368759
        5 1  |   .6504099   .0141534    45.95   0.000     .6226699      .67815
        6 0  |   .2651493   .0141303    18.76   0.000     .2374545    .2928442
        6 1  |   .3777989   .0215394    17.54   0.000     .3355825    .4200153
        7 0  |   .1642177    .014551    11.29   0.000     .1356982    .1927372
        7 1  |   .0565962   .0128524     4.40   0.000      .031406    .0817864
------------------------------------------------------------------------------

. marginsplot

Variables that uniquely identify margins: x d
We see that the conditional means for \(d_{i}=0\) rise above the means for \(d_{i}=1\) at roughly \(x_i = 1.75\).
Differential effects
Instead of a unit change, we may be interested in the differential effect. This is the effect on the mean of a small change in the covariate, normalized by the size of the change: the derivative of the mean with respect to the covariate \(x_{ij}\). This is called the marginal or partial effect of \(x_{ij}\) on \(E(y_i\vert {\bf x}_i)\). See section 2.2.5 of Wooldridge (2010), section 5.2.4 of Cameron and Trivedi (2005), or section 10.6 of Cameron and Trivedi (2010) for more details. With margins, we can estimate the partial effect at fixed values of the regressors or the mean partial effect over the population or sample.
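For the probit conditional mean, the chain rule gives the partial effect of a continuous regressor as the derivative of the linear index scaled by the normal density,
\begin{equation}
\frac{\partial E(y_i\vert{\bf x}_i)}{\partial x_{ij}} = \phi\left({\bf x}_i{\boldsymbol \beta}\right)\frac{\partial \left({\bf x}_i{\boldsymbol \beta}\right)}{\partial x_{ij}} \nonumber
\end{equation}
When \(x_{ij}\) enters the index linearly, the second factor is simply \(\beta_j\); in our model, the derivative of the index with respect to \(x_i\) also involves the quadratic and interaction coefficients.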
We will use margins to estimate the mean marginal effects for the continuous covariates over the population of covariates. margins will take the derivatives for us if we specify dydx(). We only need to specify the form of the prediction. We again use the expression() option for this purpose.
. margins, expression(normal(xb())) vce(unconditional) dydx(x z)

Average marginal effects                        Number of obs     =      5,000

Expression   : normal(xb())
dy/dx w.r.t. : x z

------------------------------------------------------------------------------
             |            Unconditional
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .0121859   .0045472     2.68   0.007     .0032736    .0210983
           z |  -.1439682   .0062519   -23.03   0.000    -.1562217   -.1317146
------------------------------------------------------------------------------
Conclusion
In this post, I have demonstrated how to use margins after gmm to estimate covariate effects for probit models. I also demonstrated how marginsplot can be used to graph covariate effects.
In future posts, we will use margins and marginsplot after gmm freely. This lets us estimate the effects we care about without complicating the GMM estimation with additional moment conditions.
Appendix 1
The following code was used to generate the probit regression data.
. set seed 34

. quietly set obs 5000

. generate double x = 2*rnormal() + .1

. generate byte d = runiform() > .5

. generate double z = rchi2(1)

. generate double y = .2*x +.3*d*x - .3*(x^2) -.7*d*(x^2)
>     -.8*z -.2*z*d + .2 + 1.5*d + rnormal() > 0
References
Cameron, A. C., and P. K. Trivedi. 2005. Microeconometrics: Methods and Applications. New York: Cambridge University Press.
——. 2010. Microeconometrics Using Stata. Rev. ed. College Station, TX: Stata Press.
Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data. 2nd ed. Cambridge, MA: MIT Press.