## Programming an estimation command in Stata: Allowing for robust or cluster–robust standard errors in a poisson command using Mata

**mypoisson3.ado** adds options for a robust or a cluster–robust estimator of the variance–covariance of the estimator (VCE) to **mypoisson2.ado**, which I discussed in Programming an estimation command in Stata: Handling factor variables in a poisson command using Mata. **mypoisson3.ado** parses the **vce()** option using the techniques I discussed in Programming an estimation command in Stata: Adding robust and cluster–robust VCEs to our Mata based OLS command. Below, I show how to use **optimize()** to compute the robust or cluster–robust VCE.

I only discuss what is new in the code for **mypoisson3.ado**, assuming that you are familiar with **mypoisson2.ado**.

This is the twenty-second post in the series **Programming an estimation command in Stata**. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.

**A poisson command with options for a robust or a cluster–robust VCE**

**mypoisson3** computes Poisson-regression results in Mata. The syntax of the **mypoisson3** command is

**mypoisson3** *depvar* *indepvars* [if] [in] [**,** **vce( robust** |

__cl__uster*clustervar*

**)**

**]**

__nocons__tantwhere *indepvars* can contain factor variables or time-series variables.

In the remainder of this post, I discuss the code for **mypoisson3.ado**. I recommend that you click on the filename to download the code. To avoid scrolling, view the code in the Do-file Editor, or your favorite text editor, to see the line numbers.

**Code block 1: mypoisson3.ado**

*! version 3.0.0 21Feb2016 program define mypoisson3, eclass sortpreserve version 14 syntax varlist(numeric ts fv min=2) [if] [in] [, noCONStant vce(string) ] marksample touse _vce_parse `touse' , optlist(Robust) argoptlist(CLuster) : , vce(`vce') local vce "`r(vce)'" local clustervar "`r(cluster)'" if "`vce'" == "robust" | "`vce'" == "cluster" { local vcetype "Robust" } if "`clustervar'" != "" { capture confirm numeric variable `clustervar' if _rc { display in red "invalid vce() option" display in red "cluster variable {bf:`clustervar'} is " /// "string variable instead of a numeric variable" exit(198) } sort `clustervar' } gettoken depvar indepvars : varlist _fv_check_depvar `depvar' tempname b mo V N rank getcinfo `indepvars' , `constant' local cnames "`r(cnames)'" matrix `mo' = r(mo) mata: mywork("`depvar'", "`cnames'", "`touse'", "`constant'", /// "`b'", "`V'", "`N'", "`rank'", "`mo'", "`vce'", "`clustervar'") if "`constant'" == "" { local cnames "`cnames' _cons" } matrix colnames `b' = `cnames' matrix colnames `V' = `cnames' matrix rownames `V' = `cnames' ereturn post `b' `V', esample(`touse') buildfvinfo ereturn scalar N = `N' ereturn scalar rank = `rank' ereturn local vce "`vce'" ereturn local vcetype "`vcetype'" ereturn local clustvar "`clustervar'" ereturn local cmd "mypoisson3" ereturn display end program getcinfo, rclass syntax varlist(ts fv), [ noCONStant ] _rmcoll `varlist' , `constant' expand local cnames `r(varlist)' local p : word count `cnames' if "`constant'" == "" { local p = `p' + 1 local cons _cons } tempname b mo matrix `b' = J(1, `p', 0) matrix colnames `b' = `cnames' `cons' _ms_omit_info `b' matrix `mo' = r(omit) return local cnames "`cnames'" return matrix mo = `mo' end mata: void mywork( string scalar depvar, string scalar indepvars, string scalar touse, string scalar constant, string scalar bname, string scalar Vname, string scalar nname, string scalar rname, string scalar mo, string scalar vcetype, string scalar clustervar) { real vector y, b real matrix X, V, Ct real scalar n, p, rank y = st_data(., depvar, touse) n = rows(y) X = st_data(., indepvars, touse) if (constant == "") { X = X,J(n, 1, 1) } p = cols(X) Ct = makeCt(mo) S = optimize_init() optimize_init_argument(S, 1, y) optimize_init_argument(S, 2, X) optimize_init_evaluator(S, &plleval3()) optimize_init_evaluatortype(S, "gf0") optimize_init_params(S, J(1, p, .01)) optimize_init_constraints(S, Ct) b = optimize(S) if (vcetype == "robust") { V = optimize_result_V_robust(S) } else if (vcetype == "cluster") { cvar = st_data(., clustervar, touse) optimize_init_cluster(S, cvar) V = optimize_result_V_robust(S) } else { // vcetype must IID V = optimize_result_V_oim(S) } rank = p - diag0cnt(invsym(V)) st_matrix(bname, b) st_matrix(Vname, V) st_numscalar(nname, n) st_numscalar(rname, rank) } real matrix makeCt(string scalar mo) { real vector mo_v real scalar ko, j, p mo_v = st_matrix(mo) p = cols(mo_v) ko = sum(mo_v) if (ko>0) { Ct = J(0, p, .) for(j=1; j<=p; j++) { if (mo_v[j]==1) { Ct = Ct \ e(j, p) } } Ct = Ct, J(ko, 1, 0) } else { Ct = J(0,p+1,.) } return(Ct) } void plleval3(real scalar todo, real vector b, /// real vector y, real matrix X, /// val, grad, hess) { real vector xb xb = X*b' val = (-exp(xb) + y:*xb - lnfactorial(y)) } end

Only a few lines of **mypoisson3.ado** differ from their counterparts in **mypoisson2.ado**, and I put these changes into four groups.

- Line 5 allows
**vce()**on the**syntax**command, and lines 8–23 parse this option.I discussed the techniques used in these changes in Programming an estimation com-

mand in Stata: Adding robust and cluster–robust VCEs to our Mata based OLS command, when I used them in**myregress12.ado**. These lines- put the specified VCE in the local macro
**vce**; - put a label for the specified VCE in the local macro
**vcetype**; - put the name of a specified cluster variable in the local macro
**clustervar**, and - handle any errors when the user misspecifies the
**vce()**option.

- put the specified VCE in the local macro
- Line 35 passes the contents of the local macros
**vce**and**clustervar**to the Mata work function**mywork()**. - Lines 47–49 store the local macros
**vce**,**vcetype**, and**clustvar**in**e()**results. - Line 84 parses the new arguments
**vcetype**and**clustervar**. The string scalar**vcetype**contains the type of VCE to be estimated, and the string scalar**clustervar**contains the name of the Stata variable containing the clusters, if specified. - Lines 112–122 use the contents of
**vcetype**to return an OIM, a robust, or a cluster–robust estimator of the VCE.The contents of

**vcetype**determine which**optimize()**function is called to compute the estimated VCE. If**vcetype**contains**robust**, line 113 uses**optimize_result_V_robust()**to compute a robust estimator of the VCE. If**vcetype**contains**cluster**, lines 116 and 117 put a copy of the Stata cluster variable in the optimize object, and then line 118 uses**optimize_result_V_robust()**to compute a cluster–robust estimator of the VCE. Finally, if**vcetype**is empty, line 121 uses**optimize_result_V_oim()**to compute the default correct-specification estimator of the VCE.

The output in examples 1 and 2 confirms that **mypoisson3** produces the same results as **poisson** when the option **vce(cluster id)** is specified.

**Example 1: mypoisson3 results**

. clear all . use accident3 . mypoisson3 accidents cvalue i.kids traffic, vce(cluster id) Iteration 0: f(p) = -847.19028 Iteration 1: f(p) = -573.7331 Iteration 2: f(p) = -545.76673 Iteration 3: f(p) = -545.11357 Iteration 4: f(p) = -545.10898 Iteration 5: f(p) = -545.10898 (Std. Err. adjusted for clustering on id) ------------------------------------------------------------------------------ | Robust | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- cvalue | -.6582924 .1128794 -5.83 0.000 -.8795319 -.4370529 | kids | 1 | -1.662351 .4309205 -3.86 0.000 -2.50694 -.8177623 2 | -1.574691 .4164515 -3.78 0.000 -2.390921 -.7584611 3 | -3.233933 .4685643 -6.90 0.000 -4.152302 -2.315564 | traffic | .1383976 .0876168 1.58 0.114 -.0333282 .3101235 _cons | .7157579 .5970943 1.20 0.231 -.4545254 1.886041 ------------------------------------------------------------------------------

**Example 2: poisson results**

. poisson accidents cvalue i.kids traffic, vce(cluster id) Iteration 0: log pseudolikelihood = -546.35782 Iteration 1: log pseudolikelihood = -545.11016 Iteration 2: log pseudolikelihood = -545.10898 Iteration 3: log pseudolikelihood = -545.10898 Poisson regression Number of obs = 505 Wald chi2(5) = 118.06 Prob > chi2 = 0.0000 Log pseudolikelihood = -545.10898 Pseudo R2 = 0.2491 (Std. Err. adjusted for 285 clusters in id) ------------------------------------------------------------------------------ | Robust accidents | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- cvalue | -.6582924 .1128793 -5.83 0.000 -.8795317 -.437053 | kids | 1 | -1.662351 .4309205 -3.86 0.000 -2.50694 -.8177622 2 | -1.574691 .4164515 -3.78 0.000 -2.390921 -.758461 3 | -3.233932 .4685642 -6.90 0.000 -4.152301 -2.315563 | traffic | .1383977 .0876167 1.58 0.114 -.0333279 .3101232 _cons | .7157576 .597093 1.20 0.231 -.4545232 1.886038 ------------------------------------------------------------------------------

**Done and undone**

I discussed **mypoisson3**, which has options for a robust or a cluster–robust estimator of the variance–covariance of the estimator. In my next post, I discuss how to have the evaluator function compute the derivatives to speed up the optimization.