### Archive

Posts Tagged ‘treatment effects’

## Using gmm to solve two-step estimation problems

Two-step estimation problems can be solved using the gmm command.

When a two-step estimator produces consistent point estimates but inconsistent standard errors, it is known as the two-step-estimation problem. For instance, inverse-probability weighted (IPW) estimators are a weighted average in which the weights are estimated in the first step. Two-step estimators use first-step estimates to estimate the parameters of interest in a second step. The two-step-estimation problem arises because the second step ignores the estimation error in the first step.

One solution is to convert the two-step estimator into a one-step estimator. My favorite way to do this conversion is to stack the equations solved by each of the two estimators and solve them jointly. This one-step approach produces consistent point estimates and consistent standard errors. There is no two-step problem because all the computations are performed jointly. Newey (1984) derives and justifies this approach.

I’m going to illustrate this approach with the IPW example, but it can be used with any two-step problem as long as each step is continuous.

IPW estimators are frequently used to estimate the mean that would be observed if everyone in a population received a specified treatment, a quantity known as a potential-outcome mean (POM). A difference of POMs is called the average treatment effect (ATE). Aside from all that, it is the mechanics of the two-step IPW estimator that interest me here. IPW estimators are weighted averages of the outcome, and the weights are estimated in a first step. The weights used in the second step are the inverse of the estimated probability of treatment.

Let’s imagine we are analyzing an extract of the birthweight data used by Cattaneo (2010). In this dataset, bweight is the baby’s weight at birth, mbsmoke is 1 if the mother smoked while pregnant (and 0 otherwise), mmarried is 1 if the mother is married, and prenatal1 is 1 if the mother had a prenatal visit in the first trimester.

Let’s imagine we want to estimate the mean when all pregnant women smoked, which is to say, the POM for smoking. If we were doing substantive research, we would also estimate the POM when no pregnant women smoked. The difference between these estimated POMs would then estimate the ATE of smoking.

In the IPW estimator, we begin by estimating the probability weights for smoking. We fit a probit model of mbsmoke as a function of mmarried and prenatal1.

. use cattaneo2
(Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138-154)

. probit mbsmoke mmarried prenatal1, vce(robust)

Iteration 0:   log pseudolikelihood = -2230.7484
Iteration 1:   log pseudolikelihood = -2102.6994
Iteration 2:   log pseudolikelihood = -2102.1437
Iteration 3:   log pseudolikelihood = -2102.1436

Probit regression                                 Number of obs   =       4642
Wald chi2(2)    =     259.42
Prob > chi2     =     0.0000
Log pseudolikelihood = -2102.1436                 Pseudo R2       =     0.0577

------------------------------------------------------------------------------
|               Robust
mbsmoke |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
mmarried |  -.6365472   .0478037   -13.32   0.000    -.7302407   -.5428537
prenatal1 |  -.2144569   .0547583    -3.92   0.000    -.3217811   -.1071327
_cons |  -.3226297   .0471906    -6.84   0.000    -.4151215   -.2301379
------------------------------------------------------------------------------


The results indicate that both mmarried and prenatal1 significantly predict whether the mother smoked while pregnant.

We want to calculate the inverse probabilities. We begin by getting the probabilities:

. predict double pr, pr


Now, we can obtain the inverse probabilities by typing

. generate double ipw = (mbsmoke==1)/pr


We can now perform the second step: calculate the mean for smokers by using the IPWs.

. mean bweight [pw=ipw]

Mean estimation                     Number of obs    =     864

--------------------------------------------------------------
|       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
bweight |   3162.868   21.71397      3120.249    3205.486
--------------------------------------------------------------
. mean bweight [pw=ipw] if mbsmoke


The point estimate reported by mean is consistent; the reported standard error Read more…

Categories: Statistics Tags:

## Stata 13 ships June 24

There’s a new release of Stata. You can order it now, it starts shipping on June 24, and you can find out about it at www.stata.com/stata13/.

Well, we sure haven’t made that sound exciting when, in fact, Stata 13 is a big — we mean really BIG — release, and we really do want to tell you about it.

Rather than summarizing, however, we’ll send you to the website, which in addition to the standard marketing materials, has technical sheets, demonstrations, and even videos of the new features.

And all 11,000 pages of the manuals are now online.