Home > Programming > Programming an estimation command in Stata: Allowing for robust or cluster–robust standard errors in a poisson command using Mata

Programming an estimation command in Stata: Allowing for robust or cluster–robust standard errors in a poisson command using Mata

mypoisson3.ado adds options for a robust or a cluster–robust estimator of the variance–covariance of the estimator (VCE) to mypoisson2.ado, which I discussed in Programming an estimation command in Stata: Handling factor variables in a poisson command using Mata. mypoisson3.ado parses the vce() option using the techniques I discussed in Programming an estimation command in Stata: Adding robust and cluster–robust VCEs to our Mata based OLS command. Below, I show how to use optimize() to compute the robust or cluster–robust VCE.

I only discuss what is new in the code for mypoisson3.ado, assuming that you are familiar with mypoisson2.ado.

This is the twenty-second post in the series Programming an estimation command in Stata. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.

A poisson command with options for a robust or a cluster–robust VCE

mypoisson3 computes Poisson-regression results in Mata. The syntax of the mypoisson3 command is

mypoisson3 depvar indepvars [if] [in] [, vce(robust | cluster clustervar) noconstant]

where indepvars can contain factor variables or time-series variables.

In the remainder of this post, I discuss the code for mypoisson3.ado. I recommend that you click on the filename to download the code. To avoid scrolling, view the code in the Do-file Editor, or your favorite text editor, to see the line numbers.

Code block 1: mypoisson3.ado

*! version 3.0.0  21Feb2016
program define mypoisson3, eclass sortpreserve
    version 14

    syntax varlist(numeric ts fv min=2) [if] [in] [, noCONStant vce(string) ]
    marksample touse

    _vce_parse `touse' , optlist(Robust) argoptlist(CLuster) : , vce(`vce')
    local vce        "`r(vce)'"
    local clustervar "`r(cluster)'"
    if "`vce'" == "robust" | "`vce'" == "cluster" {
        local vcetype "Robust"
    }
    if "`clustervar'" != "" {
        capture confirm numeric variable `clustervar'
        if _rc {
            display in red "invalid vce() option"
            display in red "cluster variable {bf:`clustervar'} is " ///
                "string variable instead of a numeric variable"
            exit(198)
        }
        sort `clustervar'
    }

    gettoken depvar indepvars : varlist
    _fv_check_depvar `depvar'

    tempname b mo V N rank

    getcinfo `indepvars' , `constant'
    local  cnames "`r(cnames)'"
    matrix `mo' = r(mo)

    mata: mywork("`depvar'", "`cnames'", "`touse'", "`constant'", ///
       "`b'", "`V'", "`N'", "`rank'", "`mo'", "`vce'", "`clustervar'")

    if "`constant'" == "" {
        local cnames "`cnames' _cons"
    }
    matrix colnames `b' = `cnames'
    matrix colnames `V' = `cnames'
    matrix rownames `V' = `cnames'

    ereturn post `b' `V', esample(`touse') buildfvinfo
    ereturn scalar N       = `N'
    ereturn scalar rank    = `rank'
    ereturn local  vce      "`vce'"
    ereturn local  vcetype  "`vcetype'"
    ereturn local  clustvar "`clustervar'"
    ereturn local  cmd     "mypoisson3"

    ereturn display

end

program getcinfo, rclass
    syntax varlist(ts fv), [ noCONStant ]

    _rmcoll `varlist' , `constant' expand
    local cnames `r(varlist)'
    local p : word count `cnames'
    if "`constant'" == "" {
        local p = `p' + 1
        local cons _cons
    }

    tempname b mo

    matrix `b' = J(1, `p', 0)
    matrix colnames `b' = `cnames' `cons'
    _ms_omit_info `b'
    matrix `mo' = r(omit)

    return local  cnames "`cnames'"
    return matrix mo = `mo'
end

mata:

void mywork( string scalar depvar,  string scalar indepvars,
             string scalar touse,   string scalar constant,
             string scalar bname,   string scalar Vname,
             string scalar nname,   string scalar rname,
             string scalar mo,
             string scalar vcetype, string scalar clustervar)
{

    real vector y, b
    real matrix X, V, Ct
    real scalar n, p, rank

    y = st_data(., depvar, touse)
    n = rows(y)
    X = st_data(., indepvars, touse)
    if (constant == "") {
        X = X,J(n, 1, 1)
    }
    p = cols(X)

    Ct = makeCt(mo)

    S  = optimize_init()
    optimize_init_argument(S, 1, y)
    optimize_init_argument(S, 2, X)
    optimize_init_evaluator(S, &plleval3())
    optimize_init_evaluatortype(S, "gf0")
    optimize_init_params(S, J(1, p, .01))
    optimize_init_constraints(S, Ct)

    b    = optimize(S)

    if (vcetype == "robust") {
        V    = optimize_result_V_robust(S)
    }
    else if (vcetype == "cluster") {
        cvar = st_data(., clustervar, touse)
        optimize_init_cluster(S, cvar)
        V    = optimize_result_V_robust(S)
    }
    else {                 // vcetype must IID
        V    = optimize_result_V_oim(S)
    }
    rank = p - diag0cnt(invsym(V))

    st_matrix(bname, b)
    st_matrix(Vname, V)
    st_numscalar(nname, n)
    st_numscalar(rname, rank)
}

real matrix makeCt(string scalar mo)
{
    real vector mo_v
    real scalar ko, j, p

    mo_v = st_matrix(mo)
    p    = cols(mo_v)
    ko   = sum(mo_v)
    if (ko>0) {
        Ct   = J(0, p, .)
        for(j=1; j<=p; j++) {
            if (mo_v[j]==1) {
                Ct  = Ct \ e(j, p)
            }
        }
        Ct = Ct, J(ko, 1, 0)
    }
    else {
        Ct = J(0,p+1,.)
    }

    return(Ct)

}

void plleval3(real scalar todo, real vector b,     ///
              real vector y,    real matrix X,     ///
              val, grad, hess)
{
    real vector  xb

    xb = X*b'
   val = (-exp(xb) + y:*xb - lnfactorial(y))
}

end

Only a few lines of mypoisson3.ado differ from their counterparts in mypoisson2.ado, and I put these changes into four groups.

  1. Line 5 allows vce() on the syntax command, and lines 8–23 parse this option.

    I discussed the techniques used in these changes in Programming an estimation com-
    mand in Stata: Adding robust and cluster–robust VCEs to our Mata based OLS command
    , when I used them in myregress12.ado. These lines

    • put the specified VCE in the local macro vce;
    • put a label for the specified VCE in the local macro vcetype;
    • put the name of a specified cluster variable in the local macro clustervar, and
    • handle any errors when the user misspecifies the vce() option.
  2. Line 35 passes the contents of the local macros vce and clustervar to the Mata work function mywork().
  3. Lines 47–49 store the local macros vce, vcetype, and clustvar in e() results.
  4. Line 84 parses the new arguments vcetype and clustervar. The string scalar vcetype contains the type of VCE to be estimated, and the string scalar clustervar contains the name of the Stata variable containing the clusters, if specified.
  5. Lines 112–122 use the contents of vcetype to return an OIM, a robust, or a cluster–robust estimator of the VCE.

    The contents of vcetype determine which optimize() function is called to compute the estimated VCE. If vcetype contains robust, line 113 uses optimize_result_V_robust() to compute a robust estimator of the VCE. If vcetype contains cluster, lines 116 and 117 put a copy of the Stata cluster variable in the optimize object, and then line 118 uses optimize_result_V_robust() to compute a cluster–robust estimator of the VCE. Finally, if vcetype is empty, line 121 uses optimize_result_V_oim() to compute the default correct-specification estimator of the VCE.

The output in examples 1 and 2 confirms that mypoisson3 produces the same results as poisson when the option vce(cluster id) is specified.

Example 1: mypoisson3 results

. clear all

. use accident3

. mypoisson3 accidents cvalue i.kids traffic, vce(cluster id)
Iteration 0:   f(p) = -847.19028
Iteration 1:   f(p) =  -573.7331
Iteration 2:   f(p) = -545.76673
Iteration 3:   f(p) = -545.11357
Iteration 4:   f(p) = -545.10898
Iteration 5:   f(p) = -545.10898
                                     (Std. Err. adjusted for clustering on id)
------------------------------------------------------------------------------
             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cvalue |  -.6582924   .1128794    -5.83   0.000    -.8795319   -.4370529
             |
        kids |
          1  |  -1.662351   .4309205    -3.86   0.000     -2.50694   -.8177623
          2  |  -1.574691   .4164515    -3.78   0.000    -2.390921   -.7584611
          3  |  -3.233933   .4685643    -6.90   0.000    -4.152302   -2.315564
             |
     traffic |   .1383976   .0876168     1.58   0.114    -.0333282    .3101235
       _cons |   .7157579   .5970943     1.20   0.231    -.4545254    1.886041
------------------------------------------------------------------------------

Example 2: poisson results

. poisson accidents cvalue i.kids traffic, vce(cluster id)

Iteration 0:   log pseudolikelihood = -546.35782
Iteration 1:   log pseudolikelihood = -545.11016
Iteration 2:   log pseudolikelihood = -545.10898
Iteration 3:   log pseudolikelihood = -545.10898

Poisson regression                              Number of obs     =        505
                                                Wald chi2(5)      =     118.06
                                                Prob > chi2       =     0.0000
Log pseudolikelihood = -545.10898               Pseudo R2         =     0.2491

                                   (Std. Err. adjusted for 285 clusters in id)
------------------------------------------------------------------------------
             |               Robust
   accidents |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cvalue |  -.6582924   .1128793    -5.83   0.000    -.8795317    -.437053
             |
        kids |
          1  |  -1.662351   .4309205    -3.86   0.000     -2.50694   -.8177622
          2  |  -1.574691   .4164515    -3.78   0.000    -2.390921    -.758461
          3  |  -3.233932   .4685642    -6.90   0.000    -4.152301   -2.315563
             |
     traffic |   .1383977   .0876167     1.58   0.114    -.0333279    .3101232
       _cons |   .7157576    .597093     1.20   0.231    -.4545232    1.886038
------------------------------------------------------------------------------

Done and undone

I discussed mypoisson3, which has options for a robust or a cluster–robust estimator of the variance–covariance of the estimator. In my next post, I discuss how to have the evaluator function compute the derivatives to speed up the optimization.