Programming an estimation command in Stata: Writing an estat postestimation command
estat commands display statistics after estimation. Many of these statistics are diagnostics or tests used to evaluate model specification. Some statistics are available after all estimation commands; others are command specific.
I illustrate how estat commands work and then show how to write a command-specific estat command for the mypoisson command that I have been developing.
This is the 28th post in the series Programming an estimation command in Stata. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.
estat by example
I want to know whether the poisson model in example 1 is a reasonable specification of the true process.
Example 1: A Poisson model
. use accident3 . poisson accidents cvalue tickets traffic Iteration 0: log likelihood = -119.40895 Iteration 1: log likelihood = -118.11766 Iteration 2: log likelihood = -118.11672 Iteration 3: log likelihood = -118.11672 Poisson regression Number of obs = 505 LR chi2(3) = 1215.60 Prob > chi2 = 0.0000 Log likelihood = -118.11672 Pseudo R2 = 0.8373 ------------------------------------------------------------------------------ accidents | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- cvalue | .1936988 .1054813 1.84 0.066 -.0130408 .4004385 tickets | 1.528282 .0741663 20.61 0.000 1.382919 1.673646 traffic | .0822459 .0259017 3.18 0.001 .0314795 .1330123 _cons | -8.838649 .5160757 -17.13 0.000 -9.850139 -7.82716 ------------------------------------------------------------------------------
After any estimation command, estat summarize summarizes the variables used in the previously estimated model.
Example 2: estat summarize
. estat summarize Estimation sample poisson Number of obs = 505 ------------------------------------------------------------------- Variable | Mean Std. Dev. Min Max -------------+----------------------------------------------------- accidents | .4712871 2.172992 0 20 cvalue | 2.467327 1.101686 1 4 tickets | 1.708911 2.022792 0 7 traffic | 6.776931 2.390108 .3951611 9.998234 -------------------------------------------------------------------
estat gof reports a goodness-of-fit test after some estimation commands. Example 3 shows that estat gof does not reject the specification of the previously estimated model.
Example 3: estat gof
. estat gof Deviance goodness-of-fit = 70.08114 Prob > chi2(501) = 1.0000 Pearson goodness-of-fit = 66.3997 Prob > chi2(501) = 1.0000
linktest implements another specification test. linktest estimates the parameters of a model, including the linear prediction and its square. Under the null hypothesis of correct specification, the coefficient on the square of the linear prediction should be zero. Example 4 shows that linktest rejects the previously fit poisson specification.
Example 4: linktest
. linktest Iteration 0: log likelihood = -1050.1837 Iteration 1: log likelihood = -1004.8894 (backed up) Iteration 2: log likelihood = -530.12448 Iteration 3: log likelihood = -128.25646 Iteration 4: log likelihood = -115.0814 Iteration 5: log likelihood = -111.70136 Iteration 6: log likelihood = -111.57302 Iteration 7: log likelihood = -111.57255 Iteration 8: log likelihood = -111.57255 Poisson regression Number of obs = 505 LR chi2(2) = 1228.69 Prob > chi2 = 0.0000 Log likelihood = -111.57255 Pseudo R2 = 0.8463 ------------------------------------------------------------------------------ accidents | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- _hat | 1.269188 .1134938 11.18 0.000 1.046744 1.491632 _hatsq | -.1135629 .036681 -3.10 0.002 -.1854563 -.0416695 _cons | .0873643 .1141074 0.77 0.444 -.1362821 .3110108 ------------------------------------------------------------------------------
Example 5 clarifies what linktest does.
Example 5: What linktest does
. predict double xb, xb . generate double xb2 = xb^2 . poisson accidents xb xb2 Iteration 0: log likelihood = -1050.1837 Iteration 1: log likelihood = -1004.8894 (backed up) Iteration 2: log likelihood = -530.12475 Iteration 3: log likelihood = -128.25648 Iteration 4: log likelihood = -115.08141 Iteration 5: log likelihood = -111.70136 Iteration 6: log likelihood = -111.57302 Iteration 7: log likelihood = -111.57255 Iteration 8: log likelihood = -111.57255 Poisson regression Number of obs = 505 LR chi2(2) = 1228.69 Prob > chi2 = 0.0000 Log likelihood = -111.57255 Pseudo R2 = 0.8463 ------------------------------------------------------------------------------ accidents | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- xb | 1.269188 .1134938 11.18 0.000 1.046744 1.491632 xb2 | -.1135629 .036681 -3.10 0.002 -.1854563 -.0416695 _cons | .0873643 .1141074 0.77 0.444 -.1362821 .3110108 ------------------------------------------------------------------------------
In the next section, I show how estat works after estimation commands, and I write the command-specific estat command estat mylinktest, which performs a link test after mypoisson6, an updated version of mypoisson5.
Link test results using mypossion5
mypoisson5.ado works with predict by calling mypoisson5_p.ado, as I discussed in Programming an estimation command in Stata: Making predict work. In example 6, I use mypoisson5 and its predict command to compute the predictions and run the Poisson regression needed for a link test.
Example 6: Link test pieces from mypoisson5
. mypoisson5 accidents cvalue tickets traffic Iteration 0: f(p) = -838.34841 Iteration 1: f(p) = -419.976 Iteration 2: f(p) = -145.89693 Iteration 3: f(p) = -121.07379 Iteration 4: f(p) = -118.12037 Iteration 5: f(p) = -118.11672 Iteration 6: f(p) = -118.11672 ------------------------------------------------------------------------------ | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- cvalue | .1936988 .1054813 1.84 0.066 -.0130408 .4004385 tickets | 1.528282 .0741663 20.61 0.000 1.382919 1.673646 traffic | .0822459 .0259017 3.18 0.001 .0314795 .1330123 _cons | -8.838649 .5160757 -17.13 0.000 -9.850139 -7.82716 ------------------------------------------------------------------------------ . predict double myxb , xb . generate double myxb2 = myxb^2 . mypoisson5 accidents myxb myxb2 Iteration 0: f(p) = -1003.1259 Iteration 1: f(p) = -641.52691 Iteration 2: f(p) = -185.06148 Iteration 3: f(p) = -137.97727 Iteration 4: f(p) = -121.82821 Iteration 5: f(p) = -114.13233 Iteration 6: f(p) = -111.62662 Iteration 7: f(p) = -111.57257 Iteration 8: f(p) = -111.57255 Iteration 9: f(p) = -111.57255 ------------------------------------------------------------------------------ | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- myxb | 1.269188 .1134938 11.18 0.000 1.046744 1.491632 myxb2 | -.1135629 .036681 -3.10 0.002 -.1854563 -.0416695 _cons | .0873643 .1141074 0.77 0.444 -.1362821 .3110108 ------------------------------------------------------------------------------
mypossion6 and estat_mylinktest
estat works much like predict: you store the name of your command-specific estat command in the e() results of your estimation command. Aside from the name change, only lines 44 and 51 in mypoisson6.ado differ from their counterparts in mypoisson5.ado.
*! version 6.0.0 18Oct2016 program define mypoisson6, eclass sortpreserve version 14.2 syntax varlist(numeric ts fv min=2) [if] [in] [, noCONStant vce(string) ] marksample touse _vce_parse `touse' , optlist(Robust) argoptlist(CLuster) : , vce(`vce') local vce "`r(vce)'" local clustervar "`r(cluster)'" if "`vce'" == "robust" | "`vce'" == "cluster" { local vcetype "Robust" } if "`clustervar'" != "" { capture confirm numeric variable `clustervar' if _rc { display in red "invalid vce() option" display in red "cluster variable {bf:`clustervar'} is " /// "string variable instead of a numeric variable" exit(198) } sort `clustervar' } gettoken depvar indepvars : varlist _fv_check_depvar `depvar' tempname b mo V N rank getcinfo `indepvars' , `constant' local cnames "`r(cnames)'" matrix `mo' = r(mo) mata: mywork("`depvar'", "`cnames'", "`touse'", "`constant'", /// "`b'", "`V'", "`N'", "`rank'", "`mo'", "`vce'", "`clustervar'") if "`constant'" == "" { local cnames "`cnames' _cons" } matrix colnames `b' = `cnames' matrix colnames `V' = `cnames' matrix rownames `V' = `cnames' ereturn post `b' `V', esample(`touse') buildfvinfo depname(`depvar') ereturn scalar N = `N' ereturn scalar rank = `rank' ereturn local vce "`vce'" ereturn local vcetype "`vcetype'" ereturn local clustvar "`clustervar'" ereturn local predict "mypoisson6_p" ereturn local estat_cmd "mypoisson6_estat" ereturn local cmd "mypoisson6" ereturn display end program getcinfo, rclass syntax varlist(ts fv), [ noCONStant ] _rmcoll `varlist' , `constant' expand local cnames `r(varlist)' local p : word count `cnames' if "`constant'" == "" { local p = `p' + 1 local cons _cons } tempname b mo matrix `b' = J(1, `p', 0) matrix colnames `b' = `cnames' `cons' _ms_omit_info `b' matrix `mo' = r(omit) return local cnames "`cnames'" return matrix mo = `mo' end mata: void mywork( string scalar depvar, string scalar indepvars, string scalar touse, string scalar constant, string scalar bname, string scalar Vname, string scalar nname, string scalar rname, string scalar mo, string scalar vcetype, string scalar clustervar) { real vector y, b real matrix X, V, Ct real scalar n, p, rank y = st_data(., depvar, touse) n = rows(y) X = st_data(., indepvars, touse) if (constant == "") { X = X,J(n, 1, 1) } p = cols(X) Ct = makeCt(mo) S = optimize_init() optimize_init_argument(S, 1, y) optimize_init_argument(S, 2, X) optimize_init_evaluator(S, &plleval3()) optimize_init_evaluatortype(S, "gf2") optimize_init_params(S, J(1, p, .01)) optimize_init_constraints(S, Ct) b = optimize(S) if (vcetype == "robust") { V = optimize_result_V_robust(S) } else if (vcetype == "cluster") { cvar = st_data(., clustervar, touse) optimize_init_cluster(S, cvar) V = optimize_result_V_robust(S) } else { // vcetype must IID V = optimize_result_V_oim(S) } rank = p - diag0cnt(invsym(V)) st_matrix(bname, b) st_matrix(Vname, V) st_numscalar(nname, n) st_numscalar(rname, rank) } real matrix makeCt(string scalar mo) { real vector mo_v real scalar ko, j, p mo_v = st_matrix(mo) p = cols(mo_v) ko = sum(mo_v) if (ko>0) { Ct = J(0, p, .) for(j=1; j<=p; j++) { if (mo_v[j]==1) { Ct = Ct \ e(j, p) } } Ct = Ct, J(ko, 1, 0) } else { Ct = J(0,p+1,.) } return(Ct) } void plleval3(real scalar todo, real vector b, /// real vector y, real matrix X, /// val, grad, hess) { real vector xb, mu xb = X*b' mu = exp(xb) val = (-mu + y:*xb - lnfactorial(y)) if (todo>=1) { grad = (y - mu):*X } if (todo==2) { hess = -quadcross(X, mu, X) } } end
Line 44 now specifies option depvar() on ereturn post to store the name of the dependent variable in e(depvar). Line 51 is new; it stores mypoisson6_estat in e(estat_cmd) so that estat can call the command-specific estat command mypoisson6_estat listed in code block 2.
*! version 1.0.0 18Oct2016 program mypoisson6_estat, rclass version 14.2 if "`e(cmd)'" != "mypoisson6" { error 301 } gettoken subcmd rest : 0, parse(" ,") if "`subcmd'"=="mylinktest" { tempname eresults tempvar xb xb2 local depvar = e(depvar) predict double `xb' , xb generate double `xb2' = `xb'^2 nobreak { _estimates hold `eresults' mylinktestwork `depvar' `xb' `xb2' local chi2 = e(lt_chi2) local df = e(lt_df) local p = e(lt_p) _estimates unhold `eresults' } return clear return scalar chi2 = `chi2' return scalar df = `df' return scalar p = `p' } else { estat_default `0' return add } end program mylinktestwork, eclass syntax varlist(min=3 max=3) tempvar b V quietly mypoisson6 `varlist' matrix `b' = e(b) matrix colnames `b' = _hat _hatsq _cons ereturn repost b = `b' , rename ereturn display quietly test _hatsq local chi2 = r(chi2) local df = r(df) local p = r(p) ereturn scalar lt_chi2 = `chi2' ereturn scalar lt_df = `df' ereturn scalar lt_p = `p' end
Lines 5–7 ensure that estat can only call mypoisson6_estat when the active e() results were produced by mypoisson6.
estat passes everything after the name “estat” to the command-specific estat command. For example, if I type
estat mylinktest
estat will pass “mylinktest” to mypoisson6_estat.
If I type
estat summarize, noheader
estat will pass “summarize, noheader” to mypoisson6_estat. Line 9 uses gettoken to parse what estat passes; it will put the name of the estat subcommand in the local macro subcmd and everything that follows in the local macro rest.
Line 10 specifies that lines 11–28 will be executed if the user types estat mylinktest and that lines 31–32 will otherwise be executed. Lines 11–28 implement the link test. Lines 31–32 use estat_default to produce the results for the not-command-specific estat commands, like estat summarize.
Lines 11–16 put the required predictions into the temporary variables xb and xb2. Line 17 ensures that lines 18–23 will be executed even if the user presses break while these lines are being executed. Line 18 stores the mypoisson6 e() results in memory, and line 23 puts them back in e(), overwriting the e() results used by mylinktestwork to compute and report the link test. Putting lines 18–23 in a nobreak block ensures that the original mypoisson6 e()
results are left active, even though mylinktestwork produces its own e() results.
Lines 36–53 implement the work routine mylinktestwork. Line 43 puts the point estimates computed by running mypoisson6 on the dependent variable, the linear prediction, and its square in the Stata matrix stored in the local macro b. (For simplicity, let me call this vector b.) At this point, the column names of b are the names of the temporary variables contained in the local macros `xb’ and `xb2′. These temporary names are not informative. Line 43 puts informative column names names on b. Line 44 uses ereturn repost with option rename to store this new b in e(b) and to rename the row and column names of e(V) to be the column names in b. Line 45 displays the nicely named link-test results. Lines 46–52 compute and store the link-test results that are retrieved and subsequently stored in r() by lines 20–22 and lines 25–28, respectively.
Line 15 uses predict, which is now handled by mypoisson6_p.ado.
*! version 1.0.0 18Oct2016 program define mypoisson6_p version 14.2 syntax newvarname [if] [in] , [ xb n ] marksample touse, novarlist local nopts : word count `xb' `n' if `nopts' >1 { display "{err}only one statistic may be specified" exit 498 } if `nopts' == 0 { local n n display "expected counts" } if "`xb'" != "" { _predict `typlist' `varlist' if `touse' , xb } else { tempvar xbv quietly _predict double `xbv' if `touse' , xb generate `typlist' `varlist' = exp(`xbv') if `touse' } end
Example 7 illustrates the process and shows the output produced.
Example 7: estat mylinktest
. mypoisson6 accidents cvalue tickets traffic Iteration 0: f(p) = -838.34841 Iteration 1: f(p) = -419.976 Iteration 2: f(p) = -145.89693 Iteration 3: f(p) = -121.07379 Iteration 4: f(p) = -118.12037 Iteration 5: f(p) = -118.11672 Iteration 6: f(p) = -118.11672 ------------------------------------------------------------------------------ accidents | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- cvalue | .1936988 .1054813 1.84 0.066 -.0130408 .4004385 tickets | 1.528282 .0741663 20.61 0.000 1.382919 1.673646 traffic | .0822459 .0259017 3.18 0.001 .0314795 .1330123 _cons | -8.838649 .5160757 -17.13 0.000 -9.850139 -7.82716 ------------------------------------------------------------------------------ . estat mylinktest ------------------------------------------------------------------------------ accidents | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- _hat | 1.269188 .1134938 11.18 0.000 1.046744 1.491632 _hatsq | -.1135629 .036681 -3.10 0.002 -.1854563 -.0416695 _cons | .0873643 .1141074 0.77 0.444 -.1362821 .3110108 ------------------------------------------------------------------------------
Example 8 illustrates that mypoisson6_estat enables estat summarize to work.
Example 8: estat summarize
. estat summarize Estimation sample mypoisson6 Number of obs = 505 ------------------------------------------------------------------- Variable | Mean Std. Dev. Min Max -------------+----------------------------------------------------- accidents | .4712871 2.172992 0 20 cvalue | 2.467327 1.101686 1 4 tickets | 1.708911 2.022792 0 7 traffic | 6.776931 2.390108 .3951611 9.998234 -------------------------------------------------------------------
Done and undone
I illustrated how estat commands work and how to write a command-specific estat command for the mypoisson command that I have been developing.