An ordered-probit inverse probability weighted (IPW) estimator
teffects ipw uses multinomial logit to estimate the weights needed to estimate the potential-outcome means (POMs) from a multivalued treatment. I show how to estimate the POMs when the weights come from an ordered probit model. Moment conditions define the ordered probit estimator and the subsequent weighted average used to estimate the POMs. I use gmm to obtain consistent standard errors by stacking the ordered-probit moment conditions and the weighted mean moment conditions.
An ordered-probit IPW estimator
I have some simulated data in which the observed outcome y is the potential outcome corresponding to treatment state 0, 1, or 2. The treatment level t was generated from an ordered probit model with covariates x1 and x2. You can download the data by clicking on opt.dta.
An ordered probit is the first of several steps required to estimate the treatment-level 0 POM.
Example 1: Ordered probit outcome
. use opt . oprobit t x1 x2 Iteration 0: log likelihood = -5168.1477 Iteration 1: log likelihood = -4332.0156 Iteration 2: log likelihood = -4316.7593 Iteration 3: log likelihood = -4316.7225 Iteration 4: log likelihood = -4316.7225 Ordered probit regression Number of obs = 10,000 LR chi2(2) = 1702.85 Prob > chi2 = 0.0000 Log likelihood = -4316.7225 Pseudo R2 = 0.1647 ------------------------------------------------------------------------------ t | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- x1 | .7878772 .0435013 18.11 0.000 .7026162 .8731382 x2 | 1.017705 .0438282 23.22 0.000 .9318036 1.103607 -------------+---------------------------------------------------------------- /cut1 | 2.084122 .0335056 2.018452 2.149792 /cut2 | 2.824316 .0404692 2.744997 2.903634 ------------------------------------------------------------------------------
Now, I estimate the probabilities of each treatment level and use them to obtain the weights needed by the IPW estimator for each POM.
Example 2: Predicted probabilities and weights
. predict double pr0 pr1 pr2, pr . generate double ipw0 = (t==0)/pr0 . generate double ipw1 = (t==1)/pr1 . generate double ipw2 = (t==2)/pr2
I use the ipw0 weights to estimate the POM for treatment level 0.
Example 3: Estimating POM for treatment 0
. regress y [pw=ipw0] (sum of wgt is 9.9798e+03) Linear regression Number of obs = 8,511 F(0, 8510) = 0.00 Prob > F = . R-squared = 0.0000 Root MSE = 1.5629 ------------------------------------------------------------------------------ | Robust y | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- _cons | 1.105473 .0191744 57.65 0.000 1.067887 1.143059 ------------------------------------------------------------------------------
The treatment-level-0 POM is estimated to be 1.11. The standard error reported by regress is not consistent because regress does not know that estimated coefficients were used to compute the weights ipw0.
I now estimate the other two POMs using regress.
Example 4: Estimating POMs for treatment levels 1 and 2
. regress y [pw=ipw1] (sum of wgt is 9.9065e+03) Linear regression Number of obs = 974 F(0, 973) = 0.00 Prob > F = . R-squared = 0.0000 Root MSE = 1.6007 ------------------------------------------------------------------------------ | Robust y | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- _cons | 1.524924 .0647766 23.54 0.000 1.397806 1.652042 ------------------------------------------------------------------------------ . regress y [pw=ipw2] (sum of wgt is 9.9707e+03) Linear regression Number of obs = 515 F(0, 514) = 0.00 Prob > F = . R-squared = 0.0000 Root MSE = 1.6389 ------------------------------------------------------------------------------ | Robust y | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- _cons | 1.920994 .1265199 15.18 0.000 1.672434 2.169554 ------------------------------------------------------------------------------
These weighted means are consistent for the treatment-level-1 and treatment-level-2 POMs, but the standard errors are not consistent, because regress does not know that the weights were estimated.
Using gmm to solve the multistep estimation problem
Each step of this IPW estimator is defined by moment conditions. Solving all the moment conditions simultaneously removes the multistep estimation problem. In this section, I use gmm to solve all the moment conditions simultaneously.
I begin by using gmm to replicate the oprobit results. The score equations solved by oprobit are essentially moment conditions.
The score equations for the ordered probit model can be expressed as three generalized functions that are multiplied by instrumental variables to obtain the moment conditions. These generalized error functions are
\begin{align*}
e_1 &=
(y==0)\frac{-\phi(a_1-xb)}{F(a_1-xb)}
+ (y==1)\frac{-(\phi(a_2-xb)-\phi(a_1-xb))}{(F(a_2-xb)-F(a_1-xb))}\\
&\quad + (y==2)\frac{-\phi(a_2-xb)}{(1-F(a_2-xb))} \\
%
e_2 &=
(y==0)\frac{\phi(a_1-xb)}{F(a_1-xb)}
+ (y==1)\frac{-\phi(a_1-xb)}{(F(a_2-xb)-F(a_1-xb))}
+ (y==2)0 \\
%
e_3 &=
(y==0)0
+ (y==1)\frac{\phi(a_2-xb)}{F(a_2-xb)-F(a_1-xb)}
+ (y==2)\frac{-\phi(a_2-xb)}{(1-F(a_2-xb))}
\end{align*}
Multiplying \(e_1\) respectively by \(x_1\) and \(x_2\) creates the two score equations that I view as moment equations that define the coefficients on \(x_1\) and \(x_2\). In other words, I form the moment conditions for the coefficients on \(x_1\) and \(x_2\) by multiplying \(e_1\) by the instrumental variables \(x_1\) and \(x_2\), respectively. Multiplying \(e_2\) by 1 creates the score equation that defines the \(a_1\) cutoff. Multiplying \(e_3\) by 1 creates the score equation that defines the \(a_2\) cutoff.
Below, I use gmm to solve these moment conditions.
Example 5: Ordered probit by gmm
. matrix b0 = (.1, .2, .1, .2) . gmm (e1: > (t==0)*(-normalden(-{xb:x1 x2}+{a1})/normal({a1}-{xb:})) > +(t==1)*(-(normalden({a2}-{xb:})-normalden({a1}-{xb:}))/ > (normal({a2}-{xb:})-normal({a1}-{xb:}))) > +(t==2)*(normalden({a2}-{xb:})/(1-normal({a2}-{xb:}))) > ) > (e2: > (t==0)*(normalden({a1}-{xb:})/normal({a1}-{xb:})) > +(t==1)*(-normalden({a1}-{xb:})/(normal({a2}-{xb:})-normal({a1}-{xb:}))) > +(t==2)*0 > ) > (e3: > (t==0)*0 > +(t==1)*(normalden({a2}-{xb:})/(normal({a2}-{xb:})-normal({a1}-{xb:}))) > +(t==2)*(-normalden({a2}-{xb:})/(1-normal({a2}-{xb:}))) > ) > , > onestep winitial(identity) > instruments(e1: x1 x2, noconstant) > instruments(e2: ) > instruments(e3: ) from(b0) Step 1 Iteration 0: GMM criterion Q(b) = 1.1148682 Iteration 1: GMM criterion Q(b) = .19813694 Iteration 2: GMM criterion Q(b) = .01214783 Iteration 3: GMM criterion Q(b) = .000558 Iteration 4: GMM criterion Q(b) = 1.254e-06 Iteration 5: GMM criterion Q(b) = 1.962e-12 Iteration 6: GMM criterion Q(b) = 8.527e-23 note: model is exactly identified GMM estimation Number of parameters = 4 Number of moments = 4 Initial weight matrix: Identity Number of obs = 10,000 ------------------------------------------------------------------------------ | Robust | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- x1 | .7878772 .0434813 18.12 0.000 .7026554 .873099 x2 | 1.017705 .0427286 23.82 0.000 .9339587 1.101452 -------------+---------------------------------------------------------------- /a1 | 2.084122 .0332366 62.71 0.000 2.018979 2.149264 /a2 | 2.824316 .0405148 69.71 0.000 2.744908 2.903723 ------------------------------------------------------------------------------ Instruments for equation e1: x1 x2 Instruments for equation e2: _cons Instruments for equation e3: _cons . matrix b0 = e(b)
In the code above, the error function e1: and its instruments x1 and x2 define the moment conditions that define the coefficients on x1 and x2. The error function e2: defines the moment condition for the cutoff a1. The error function e3: defines the moment condition for the cutoff a2.
The observations on the generalized error functions are missing when all the parameters are assigned the same value, because \(F(a_2-xb)-F(a_1-xb)= 0\) in this case. I specified starting values using option from() because gmm uses zero as a starting value for each parameter, which makes the generalized error functions zero in this case.
I stored the point estimates in the matrix b0 to use these as starting values for the ordered probit parameters in example 7.
More details about the syntax of gmm are provided in Understanding the generalized method of moments (GMM): A simple example, Using gmm to solve two-step estimation problems, and Estimating parameters by maximum likelihood and method of moments using mlexp and gmm.
Below is the gmm syntax for estimating the three weighted means, when taking the oprobit parameters as given.
Example 6: Weighted means by gmm
. gmm (e4: ((t==0)/pr0)*(y - {POM0})) > (e5: ((t==1)/pr1)*(y - {POM1})) > (e6: ((t==2)/pr2)*(y - {POM2})) > , > onestep > winitial(identity) > instruments(e4: ) > instruments(e5: ) > instruments(e6: ) Step 1 Iteration 0: GMM criterion Q(b) = 7.1678846 Iteration 1: GMM criterion Q(b) = 5.602e-27 Iteration 2: GMM criterion Q(b) = 2.221e-32 note: model is exactly identified GMM estimation Number of parameters = 3 Number of moments = 3 Initial weight matrix: Identity Number of obs = 10,000 ------------------------------------------------------------------------------ | Robust | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- /POM0 | 1.105473 .0191732 57.66 0.000 1.067894 1.143052 /POM1 | 1.524924 .0647433 23.55 0.000 1.398029 1.651819 /POM2 | 1.920994 .126397 15.20 0.000 1.67326 2.168727 ------------------------------------------------------------------------------ Instruments for equation e4: _cons Instruments for equation e5: _cons Instruments for equation e6: _cons
Below, I used the ordered probit estimates stored in example 5 as starting values for the ordered probit parameters and 0.1 for the POM parameters. Combining the oprobit conditions and the weighted mean conditions yields
Example 7: Ordered probit IPW using gmm
. matrix b0 = (b0, .1, .1, .1 ) . gmm (e1: > (t==0)*(-normalden(-{xb:x1 x2}+{a1})/normal({a1}-{xb:})) > +(t==1)*(-(normalden({a2}-{xb:})-normalden({a1}-{xb:}))/ > (normal({a2}-{xb:})-normal({a1}-{xb:}))) > +(t==2)*(normalden({a2}-{xb:})/(1-normal({a2}-{xb:}))) > ) > (e2: > (t==0)*(normalden({a1}-{xb:})/normal({a1}-{xb:})) > +(t==1)*(-normalden({a1}-{xb:})/(normal({a2}-{xb:})-normal({a1}-{xb:}))) > +(t==2)*0 > ) > (e3: > (t==0)*0 > +(t==1)*(normalden({a2}-{xb:})/(normal({a2}-{xb:})-normal({a1}-{xb:}))) > +(t==2)*(-normalden({a2}-{xb:})/(1-normal({a2}-{xb:}))) > ) > (e4: > ((t==0)/normal({a1}-{xb:}))*(y - {POM0})) > (e5: > ((t==1)/(normal({a2}-{xb:})-normal({a1}-{xb:})))*(y - {POM1})) > (e6: > ((t==2)/(1-normal({a2}-{xb:})))*(y - {POM2})) > , > onestep winitial(identity) > instruments(e1: x1 x2, noconstant) > instruments(e2: ) > instruments(e3: ) > instruments(e4: ) > instruments(e5: ) > instruments(e6: ) > from(b0) Step 1 Iteration 0: GMM criterion Q(b) = 6.2961378 Iteration 1: GMM criterion Q(b) = 1.668e-20 Iteration 2: GMM criterion Q(b) = 1.736e-31 note: model is exactly identified GMM estimation Number of parameters = 7 Number of moments = 7 Initial weight matrix: Identity Number of obs = 10,000 ------------------------------------------------------------------------------ | Robust | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- x1 | .7878772 .0434813 18.12 0.000 .7026554 .873099 x2 | 1.017705 .0427286 23.82 0.000 .9339587 1.101452 -------------+---------------------------------------------------------------- /a1 | 2.084122 .0332366 62.71 0.000 2.018979 2.149264 /a2 | 2.824316 .0405148 69.71 0.000 2.744908 2.903723 /POM0 | 1.105473 .0181701 60.84 0.000 1.06986 1.141086 /POM1 | 1.524924 .0615369 24.78 0.000 1.404314 1.645534 /POM2 | 1.920994 .123474 15.56 0.000 1.678989 2.162998 ------------------------------------------------------------------------------ Instruments for equation e1: x1 x2 Instruments for equation e2: _cons Instruments for equation e3: _cons Instruments for equation e4: _cons Instruments for equation e5: _cons Instruments for equation e6: _cons
The point estimates and the standard errors reported by gmm are consistent.
Done and undone
I showed how to estimate the POMs when the weights come from an ordered probit model. Moment conditions define the ordered probit estimator and the subsequent weighted average used to estimate the POMs. I used gmm to obtain consistent standard errors by stacking the ordered-probit moment conditions and the weighted mean moment conditions.