Home > Statistics > An ordered-probit inverse probability weighted (IPW) estimator

An ordered-probit inverse probability weighted (IPW) estimator

teffects ipw uses multinomial logit to estimate the weights needed to estimate the potential-outcome means (POMs) from a multivalued treatment. I show how to estimate the POMs when the weights come from an ordered probit model. Moment conditions define the ordered probit estimator and the subsequent weighted average used to estimate the POMs. I use gmm to obtain consistent standard errors by stacking the ordered-probit moment conditions and the weighted mean moment conditions.

An ordered-probit IPW estimator

I have some simulated data in which the observed outcome y is the potential outcome corresponding to treatment state 0, 1, or 2. The treatment level t was generated from an ordered probit model with covariates x1 and x2. You can download the data by clicking on opt.dta.

An ordered probit is the first of several steps required to estimate the treatment-level 0 POM.

Example 1: Ordered probit outcome

. use opt

. oprobit t x1 x2

Iteration 0:   log likelihood = -5168.1477
Iteration 1:   log likelihood = -4332.0156
Iteration 2:   log likelihood = -4316.7593
Iteration 3:   log likelihood = -4316.7225
Iteration 4:   log likelihood = -4316.7225

Ordered probit regression                       Number of obs     =     10,000
                                                LR chi2(2)        =    1702.85
                                                Prob > chi2       =     0.0000
Log likelihood = -4316.7225                     Pseudo R2         =     0.1647

------------------------------------------------------------------------------
           t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .7878772   .0435013    18.11   0.000     .7026162    .8731382
          x2 |   1.017705   .0438282    23.22   0.000     .9318036    1.103607
-------------+----------------------------------------------------------------
       /cut1 |   2.084122   .0335056                      2.018452    2.149792
       /cut2 |   2.824316   .0404692                      2.744997    2.903634
------------------------------------------------------------------------------

Now, I estimate the probabilities of each treatment level and use them to obtain the weights needed by the IPW estimator for each POM.

Example 2: Predicted probabilities and weights

. predict double pr0 pr1 pr2, pr

. generate double ipw0 = (t==0)/pr0

. generate double ipw1 = (t==1)/pr1

. generate double ipw2 = (t==2)/pr2

I use the ipw0 weights to estimate the POM for treatment level 0.

Example 3: Estimating POM for treatment 0

. regress y [pw=ipw0]
(sum of wgt is   9.9798e+03)

Linear regression                               Number of obs     =      8,511
                                                F(0, 8510)        =       0.00
                                                Prob > F          =          .
                                                R-squared         =     0.0000
                                                Root MSE          =     1.5629

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   1.105473   .0191744    57.65   0.000     1.067887    1.143059
------------------------------------------------------------------------------

The treatment-level-0 POM is estimated to be 1.11. The standard error reported by regress is not consistent because regress does not know that estimated coefficients were used to compute the weights ipw0.

I now estimate the other two POMs using regress.

Example 4: Estimating POMs for treatment levels 1 and 2

. regress y [pw=ipw1]
(sum of wgt is   9.9065e+03)

Linear regression                               Number of obs     =        974
                                                F(0, 973)         =       0.00
                                                Prob > F          =          .
                                                R-squared         =     0.0000
                                                Root MSE          =     1.6007

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   1.524924   .0647766    23.54   0.000     1.397806    1.652042
------------------------------------------------------------------------------

. regress y [pw=ipw2]
(sum of wgt is   9.9707e+03)

Linear regression                               Number of obs     =        515
                                                F(0, 514)         =       0.00
                                                Prob > F          =          .
                                                R-squared         =     0.0000
                                                Root MSE          =     1.6389

------------------------------------------------------------------------------
             |               Robust
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _cons |   1.920994   .1265199    15.18   0.000     1.672434    2.169554
------------------------------------------------------------------------------

These weighted means are consistent for the treatment-level-1 and treatment-level-2 POMs, but the standard errors are not consistent, because regress does not know that the weights were estimated.

Using gmm to solve the multistep estimation problem

Each step of this IPW estimator is defined by moment conditions. Solving all the moment conditions simultaneously removes the multistep estimation problem. In this section, I use gmm to solve all the moment conditions simultaneously.

I begin by using gmm to replicate the oprobit results. The score equations solved by oprobit are essentially moment conditions.

The score equations for the ordered probit model can be expressed as three generalized functions that are multiplied by instrumental variables to obtain the moment conditions. These generalized error functions are

\begin{align*}
e_1 &=
(y==0)\frac{-\phi(a_1-xb)}{F(a_1-xb)}
+ (y==1)\frac{-(\phi(a_2-xb)-\phi(a_1-xb))}{(F(a_2-xb)-F(a_1-xb))}\\
&\quad + (y==2)\frac{-\phi(a_2-xb)}{(1-F(a_2-xb))} \\
%
e_2 &=
(y==0)\frac{\phi(a_1-xb)}{F(a_1-xb)}
+ (y==1)\frac{-\phi(a_1-xb)}{(F(a_2-xb)-F(a_1-xb))}
+ (y==2)0 \\
%
e_3 &=
(y==0)0
+ (y==1)\frac{\phi(a_2-xb)}{F(a_2-xb)-F(a_1-xb)}
+ (y==2)\frac{-\phi(a_2-xb)}{(1-F(a_2-xb))}
\end{align*}

Multiplying \(e_1\) respectively by \(x_1\) and \(x_2\) creates the two score equations that I view as moment equations that define the coefficients on \(x_1\) and \(x_2\). In other words, I form the moment conditions for the coefficients on \(x_1\) and \(x_2\) by multiplying \(e_1\) by the instrumental variables \(x_1\) and \(x_2\), respectively. Multiplying \(e_2\) by 1 creates the score equation that defines the \(a_1\) cutoff. Multiplying \(e_3\) by 1 creates the score equation that defines the \(a_2\) cutoff.

Below, I use gmm to solve these moment conditions.

Example 5: Ordered probit by gmm

. matrix b0 = (.1, .2, .1, .2)

. gmm (e1:                                                                 
>  (t==0)*(-normalden(-{xb:x1 x2}+{a1})/normal({a1}-{xb:}))                
> +(t==1)*(-(normalden({a2}-{xb:})-normalden({a1}-{xb:}))/                 
>           (normal({a2}-{xb:})-normal({a1}-{xb:})))                       
> +(t==2)*(normalden({a2}-{xb:})/(1-normal({a2}-{xb:})))                   
>  )                                                                       
>  (e2:                                                                    
>  (t==0)*(normalden({a1}-{xb:})/normal({a1}-{xb:}))                       
> +(t==1)*(-normalden({a1}-{xb:})/(normal({a2}-{xb:})-normal({a1}-{xb:}))) 
> +(t==2)*0                                                                
>  )                                                                       
>  (e3:                                                                    
>  (t==0)*0                                                                
> +(t==1)*(normalden({a2}-{xb:})/(normal({a2}-{xb:})-normal({a1}-{xb:})))  
> +(t==2)*(-normalden({a2}-{xb:})/(1-normal({a2}-{xb:})))                  
>  )                                                                       
>  ,                                                                       
>  onestep winitial(identity)                                              
>  instruments(e1: x1 x2, noconstant)                                      
>  instruments(e2: )                                                       
>  instruments(e3: ) from(b0)

Step 1
Iteration 0:   GMM criterion Q(b) =  1.1148682
Iteration 1:   GMM criterion Q(b) =  .19813694
Iteration 2:   GMM criterion Q(b) =  .01214783
Iteration 3:   GMM criterion Q(b) =    .000558
Iteration 4:   GMM criterion Q(b) =  1.254e-06
Iteration 5:   GMM criterion Q(b) =  1.962e-12
Iteration 6:   GMM criterion Q(b) =  8.527e-23

note: model is exactly identified

GMM estimation

Number of parameters =   4
Number of moments    =   4
Initial weight matrix: Identity                   Number of obs   =     10,000

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .7878772   .0434813    18.12   0.000     .7026554     .873099
          x2 |   1.017705   .0427286    23.82   0.000     .9339587    1.101452
-------------+----------------------------------------------------------------
         /a1 |   2.084122   .0332366    62.71   0.000     2.018979    2.149264
         /a2 |   2.824316   .0405148    69.71   0.000     2.744908    2.903723
------------------------------------------------------------------------------
Instruments for equation e1: x1 x2
Instruments for equation e2: _cons
Instruments for equation e3: _cons

. matrix b0 = e(b)

In the code above, the error function e1: and its instruments x1 and x2 define the moment conditions that define the coefficients on x1 and x2. The error function e2: defines the moment condition for the cutoff a1. The error function e3: defines the moment condition for the cutoff a2.

The observations on the generalized error functions are missing when all the parameters are assigned the same value, because \(F(a_2-xb)-F(a_1-xb)= 0\) in this case. I specified starting values using option from() because gmm uses zero as a starting value for each parameter, which makes the generalized error functions zero in this case.

I stored the point estimates in the matrix b0 to use these as starting values for the ordered probit parameters in example 7.

More details about the syntax of gmm are provided in Understanding the generalized method of moments (GMM): A simple example, Using gmm to solve two-step estimation problems, and Estimating parameters by maximum likelihood and method of moments using mlexp and gmm.

Below is the gmm syntax for estimating the three weighted means, when taking the oprobit parameters as given.

Example 6: Weighted means by gmm

. gmm (e4: ((t==0)/pr0)*(y - {POM0}))      
>     (e5: ((t==1)/pr1)*(y - {POM1}))      
>     (e6: ((t==2)/pr2)*(y - {POM2}))      
>    ,                                     
>    onestep                               
>    winitial(identity)                    
>    instruments(e4: )                     
>    instruments(e5: )                     
>    instruments(e6: )

Step 1
Iteration 0:   GMM criterion Q(b) =  7.1678846
Iteration 1:   GMM criterion Q(b) =  5.602e-27
Iteration 2:   GMM criterion Q(b) =  2.221e-32

note: model is exactly identified

GMM estimation

Number of parameters =   3
Number of moments    =   3
Initial weight matrix: Identity                   Number of obs   =     10,000

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       /POM0 |   1.105473   .0191732    57.66   0.000     1.067894    1.143052
       /POM1 |   1.524924   .0647433    23.55   0.000     1.398029    1.651819
       /POM2 |   1.920994    .126397    15.20   0.000      1.67326    2.168727
------------------------------------------------------------------------------
Instruments for equation e4: _cons
Instruments for equation e5: _cons
Instruments for equation e6: _cons

Below, I used the ordered probit estimates stored in example 5 as starting values for the ordered probit parameters and 0.1 for the POM parameters. Combining the oprobit conditions and the weighted mean conditions yields

Example 7: Ordered probit IPW using gmm

. matrix b0 = (b0, .1, .1, .1 )

. gmm (e1:                                                                 
>  (t==0)*(-normalden(-{xb:x1 x2}+{a1})/normal({a1}-{xb:}))                
> +(t==1)*(-(normalden({a2}-{xb:})-normalden({a1}-{xb:}))/                 
>         (normal({a2}-{xb:})-normal({a1}-{xb:})))                         
> +(t==2)*(normalden({a2}-{xb:})/(1-normal({a2}-{xb:})))                   
>  )                                                                       
>  (e2:                                                                    
>  (t==0)*(normalden({a1}-{xb:})/normal({a1}-{xb:}))                       
> +(t==1)*(-normalden({a1}-{xb:})/(normal({a2}-{xb:})-normal({a1}-{xb:}))) 
> +(t==2)*0                                                                
>  )                                                                       
>  (e3:                                                                    
>  (t==0)*0                                                                
> +(t==1)*(normalden({a2}-{xb:})/(normal({a2}-{xb:})-normal({a1}-{xb:})))  
> +(t==2)*(-normalden({a2}-{xb:})/(1-normal({a2}-{xb:})))                  
>  )                                                                       
>  (e4:                                                                    
>  ((t==0)/normal({a1}-{xb:}))*(y - {POM0}))                               
>  (e5:                                                                    
>  ((t==1)/(normal({a2}-{xb:})-normal({a1}-{xb:})))*(y - {POM1}))          
>  (e6:                                                                    
>  ((t==2)/(1-normal({a2}-{xb:})))*(y - {POM2}))                           
>  ,                                                                       
>  onestep winitial(identity)                                              
>  instruments(e1: x1 x2, noconstant)                                      
>  instruments(e2: )                                                       
>  instruments(e3: )                                                       
>  instruments(e4: )                                                       
>  instruments(e5: )                                                       
>  instruments(e6: )                                                       
>  from(b0)

Step 1
Iteration 0:   GMM criterion Q(b) =  6.2961378
Iteration 1:   GMM criterion Q(b) =  1.668e-20
Iteration 2:   GMM criterion Q(b) =  1.736e-31

note: model is exactly identified

GMM estimation

Number of parameters =   7
Number of moments    =   7
Initial weight matrix: Identity                   Number of obs   =     10,000

------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |   .7878772   .0434813    18.12   0.000     .7026554     .873099
          x2 |   1.017705   .0427286    23.82   0.000     .9339587    1.101452
-------------+----------------------------------------------------------------
         /a1 |   2.084122   .0332366    62.71   0.000     2.018979    2.149264
         /a2 |   2.824316   .0405148    69.71   0.000     2.744908    2.903723
       /POM0 |   1.105473   .0181701    60.84   0.000      1.06986    1.141086
       /POM1 |   1.524924   .0615369    24.78   0.000     1.404314    1.645534
       /POM2 |   1.920994    .123474    15.56   0.000     1.678989    2.162998
------------------------------------------------------------------------------
Instruments for equation e1: x1 x2
Instruments for equation e2: _cons
Instruments for equation e3: _cons
Instruments for equation e4: _cons
Instruments for equation e5: _cons
Instruments for equation e6: _cons

The point estimates and the standard errors reported by gmm are consistent.

Done and undone

I showed how to estimate the POMs when the weights come from an ordered probit model. Moment conditions define the ordered probit estimator and the subsequent weighted average used to estimate the POMs. I used gmm to obtain consistent standard errors by stacking the ordered-probit moment conditions and the weighted mean moment conditions.

  • Dear Mr Drukker, can you please provide the (simulated) data so we, as readers of your blog, can exercise your examples too (or the syntax with which the data can be simulated)?

  • David M Drukker

    Thank you for your comment.
    I have updated this post to make the data downloadable.

  • Pingback: The Stata Blog » Solving missing data problems using inverse-probability-weighted estimators()

  • Carla Dohmwirth

    Dear Dr. Drukker, thanks for the post. Could you explain how to do this for the ATT? Thanks in advance.