Home > Programming > Programming an estimation command in Stata: Writing an estat postestimation command

Programming an estimation command in Stata: Writing an estat postestimation command

estat commands display statistics after estimation. Many of these statistics are diagnostics or tests used to evaluate model specification. Some statistics are available after all estimation commands; others are command specific.

I illustrate how estat commands work and then show how to write a command-specific estat command for the mypoisson command that I have been developing.

This is the 28th post in the series Programming an estimation command in Stata. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.

estat by example

I want to know whether the poisson model in example 1 is a reasonable specification of the true process.

Example 1: A Poisson model

. use accident3

. poisson accidents cvalue tickets traffic

Iteration 0:   log likelihood = -119.40895
Iteration 1:   log likelihood = -118.11766
Iteration 2:   log likelihood = -118.11672
Iteration 3:   log likelihood = -118.11672

Poisson regression                              Number of obs     =        505
                                                LR chi2(3)        =    1215.60
                                                Prob > chi2       =     0.0000
Log likelihood = -118.11672                     Pseudo R2         =     0.8373

------------------------------------------------------------------------------
   accidents |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cvalue |   .1936988   .1054813     1.84   0.066    -.0130408    .4004385
     tickets |   1.528282   .0741663    20.61   0.000     1.382919    1.673646
     traffic |   .0822459   .0259017     3.18   0.001     .0314795    .1330123
       _cons |  -8.838649   .5160757   -17.13   0.000    -9.850139    -7.82716
------------------------------------------------------------------------------

After any estimation command, estat summarize summarizes the variables used in the previously estimated model.

Example 2: estat summarize

. estat summarize

  Estimation sample poisson                Number of obs =        505

  -------------------------------------------------------------------
      Variable |         Mean      Std. Dev.         Min          Max
  -------------+-----------------------------------------------------
     accidents |     .4712871      2.172992            0           20
        cvalue |     2.467327      1.101686            1            4
       tickets |     1.708911      2.022792            0            7
       traffic |     6.776931      2.390108     .3951611     9.998234
  -------------------------------------------------------------------

estat gof reports a goodness-of-fit test after some estimation commands. Example 3 shows that estat gof does not reject the specification of the previously estimated model.

Example 3: estat gof

. estat gof

         Deviance goodness-of-fit =  70.08114
         Prob > chi2(501)         =    1.0000

         Pearson goodness-of-fit  =   66.3997
         Prob > chi2(501)         =    1.0000

linktest implements another specification test. linktest estimates the parameters of a model, including the linear prediction and its square. Under the null hypothesis of correct specification, the coefficient on the square of the linear prediction should be zero. Example 4 shows that linktest rejects the previously fit poisson specification.

Example 4: linktest

. linktest

Iteration 0:   log likelihood = -1050.1837
Iteration 1:   log likelihood = -1004.8894  (backed up)
Iteration 2:   log likelihood = -530.12448
Iteration 3:   log likelihood = -128.25646
Iteration 4:   log likelihood =  -115.0814
Iteration 5:   log likelihood = -111.70136
Iteration 6:   log likelihood = -111.57302
Iteration 7:   log likelihood = -111.57255
Iteration 8:   log likelihood = -111.57255

Poisson regression                              Number of obs     =        505
                                                LR chi2(2)        =    1228.69
                                                Prob > chi2       =     0.0000
Log likelihood = -111.57255                     Pseudo R2         =     0.8463

------------------------------------------------------------------------------
   accidents |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        _hat |   1.269188   .1134938    11.18   0.000     1.046744    1.491632
      _hatsq |  -.1135629    .036681    -3.10   0.002    -.1854563   -.0416695
       _cons |   .0873643   .1141074     0.77   0.444    -.1362821    .3110108
------------------------------------------------------------------------------

Example 5 clarifies what linktest does.

Example 5: What linktest does

. predict double xb, xb

. generate double xb2 = xb^2

. poisson accidents xb xb2

Iteration 0:   log likelihood = -1050.1837
Iteration 1:   log likelihood = -1004.8894  (backed up)
Iteration 2:   log likelihood = -530.12475
Iteration 3:   log likelihood = -128.25648
Iteration 4:   log likelihood = -115.08141
Iteration 5:   log likelihood = -111.70136
Iteration 6:   log likelihood = -111.57302
Iteration 7:   log likelihood = -111.57255
Iteration 8:   log likelihood = -111.57255

Poisson regression                              Number of obs     =        505
                                                LR chi2(2)        =    1228.69
                                                Prob > chi2       =     0.0000
Log likelihood = -111.57255                     Pseudo R2         =     0.8463

------------------------------------------------------------------------------
   accidents |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          xb |   1.269188   .1134938    11.18   0.000     1.046744    1.491632
         xb2 |  -.1135629    .036681    -3.10   0.002    -.1854563   -.0416695
       _cons |   .0873643   .1141074     0.77   0.444    -.1362821    .3110108
------------------------------------------------------------------------------

In the next section, I show how estat works after estimation commands, and I write the command-specific estat command estat mylinktest, which performs a link test after mypoisson6, an updated version of mypoisson5.

Link test results using mypossion5

mypoisson5.ado works with predict by calling mypoisson5_p.ado, as I discussed in Programming an estimation command in Stata: Making predict work. In example 6, I use mypoisson5 and its predict command to compute the predictions and run the Poisson regression needed for a link test.

Example 6: Link test pieces from mypoisson5

. mypoisson5 accidents cvalue tickets traffic
Iteration 0:   f(p) = -838.34841
Iteration 1:   f(p) =   -419.976
Iteration 2:   f(p) = -145.89693
Iteration 3:   f(p) = -121.07379
Iteration 4:   f(p) = -118.12037
Iteration 5:   f(p) = -118.11672
Iteration 6:   f(p) = -118.11672
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cvalue |   .1936988   .1054813     1.84   0.066    -.0130408    .4004385
     tickets |   1.528282   .0741663    20.61   0.000     1.382919    1.673646
     traffic |   .0822459   .0259017     3.18   0.001     .0314795    .1330123
       _cons |  -8.838649   .5160757   -17.13   0.000    -9.850139    -7.82716
------------------------------------------------------------------------------

. predict double myxb , xb

. generate double myxb2 = myxb^2

. mypoisson5 accidents myxb myxb2
Iteration 0:   f(p) = -1003.1259
Iteration 1:   f(p) = -641.52691
Iteration 2:   f(p) = -185.06148
Iteration 3:   f(p) = -137.97727
Iteration 4:   f(p) = -121.82821
Iteration 5:   f(p) = -114.13233
Iteration 6:   f(p) = -111.62662
Iteration 7:   f(p) = -111.57257
Iteration 8:   f(p) = -111.57255
Iteration 9:   f(p) = -111.57255
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        myxb |   1.269188   .1134938    11.18   0.000     1.046744    1.491632
       myxb2 |  -.1135629    .036681    -3.10   0.002    -.1854563   -.0416695
       _cons |   .0873643   .1141074     0.77   0.444    -.1362821    .3110108
------------------------------------------------------------------------------

mypossion6 and estat_mylinktest

estat works much like predict: you store the name of your command-specific estat command in the e() results of your estimation command. Aside from the name change, only lines 44 and 51 in mypoisson6.ado differ from their counterparts in mypoisson5.ado.

Code block 1: mypoisson6.ado

*! version 6.0.0  18Oct2016
program define mypoisson6, eclass sortpreserve
    version 14.2

    syntax varlist(numeric ts fv min=2) [if] [in] [, noCONStant vce(string) ]
    marksample touse

    _vce_parse `touse' , optlist(Robust) argoptlist(CLuster) : , vce(`vce')
    local vce        "`r(vce)'"
    local clustervar "`r(cluster)'"
    if "`vce'" == "robust" | "`vce'" == "cluster" {
        local vcetype "Robust"
    }
    if "`clustervar'" != "" {
        capture confirm numeric variable `clustervar'
        if _rc {
            display in red "invalid vce() option"
            display in red "cluster variable {bf:`clustervar'} is " ///
                "string variable instead of a numeric variable"
            exit(198)
        }
        sort `clustervar'
    }

    gettoken depvar indepvars : varlist
    _fv_check_depvar `depvar'

    tempname b mo V N rank

    getcinfo `indepvars' , `constant'
    local  cnames "`r(cnames)'"
    matrix `mo' = r(mo)

    mata: mywork("`depvar'", "`cnames'", "`touse'", "`constant'", ///
       "`b'", "`V'", "`N'", "`rank'", "`mo'", "`vce'", "`clustervar'")

    if "`constant'" == "" {
        local cnames "`cnames' _cons"
    }
    matrix colnames `b' = `cnames'
    matrix colnames `V' = `cnames'
    matrix rownames `V' = `cnames'

    ereturn post `b' `V', esample(`touse') buildfvinfo depname(`depvar')
    ereturn scalar N         = `N'
    ereturn scalar rank      = `rank'
    ereturn local  vce       "`vce'"
    ereturn local  vcetype   "`vcetype'"
    ereturn local  clustvar  "`clustervar'"
    ereturn local  predict   "mypoisson6_p"
    ereturn local  estat_cmd "mypoisson6_estat"
    ereturn local  cmd       "mypoisson6"

    ereturn display

end

program getcinfo, rclass
    syntax varlist(ts fv), [ noCONStant ]

    _rmcoll `varlist' , `constant' expand
    local cnames `r(varlist)'
    local p : word count `cnames'
    if "`constant'" == "" {
        local p = `p' + 1
        local cons _cons
    }

    tempname b mo

    matrix `b' = J(1, `p', 0)
    matrix colnames `b' = `cnames' `cons'
    _ms_omit_info `b'
    matrix `mo' = r(omit)

    return local  cnames "`cnames'"
    return matrix mo = `mo'
end

mata:

void mywork( string scalar depvar,  string scalar indepvars,
             string scalar touse,   string scalar constant,
             string scalar bname,   string scalar Vname,
             string scalar nname,   string scalar rname,
             string scalar mo,
             string scalar vcetype, string scalar clustervar)
{

    real vector y, b
    real matrix X, V, Ct
    real scalar n, p, rank

    y = st_data(., depvar, touse)
    n = rows(y)
    X = st_data(., indepvars, touse)
    if (constant == "") {
        X = X,J(n, 1, 1)
    }
    p = cols(X)

    Ct = makeCt(mo)

    S  = optimize_init()
    optimize_init_argument(S, 1, y)
    optimize_init_argument(S, 2, X)
    optimize_init_evaluator(S, &plleval3())
    optimize_init_evaluatortype(S, "gf2")
    optimize_init_params(S, J(1, p, .01))
    optimize_init_constraints(S, Ct)

    b    = optimize(S)

    if (vcetype == "robust") {
        V    = optimize_result_V_robust(S)
    }
    else if (vcetype == "cluster") {
        cvar = st_data(., clustervar, touse)
        optimize_init_cluster(S, cvar)
        V    = optimize_result_V_robust(S)
    }
    else {                 // vcetype must IID
        V    = optimize_result_V_oim(S)
    }
    rank = p - diag0cnt(invsym(V))

    st_matrix(bname, b)
    st_matrix(Vname, V)
    st_numscalar(nname, n)
    st_numscalar(rname, rank)
}

real matrix makeCt(string scalar mo)
{
    real vector mo_v
    real scalar ko, j, p

    mo_v = st_matrix(mo)
    p    = cols(mo_v)
    ko   = sum(mo_v)
    if (ko>0) {
        Ct   = J(0, p, .)
        for(j=1; j<=p; j++) {
            if (mo_v[j]==1) {
                Ct  = Ct \ e(j, p)
            }
        }
        Ct = Ct, J(ko, 1, 0)
    }
    else {
        Ct = J(0,p+1,.)
    }

    return(Ct)

}

void plleval3(real scalar todo, real vector b,     ///
              real vector y,    real matrix X,     ///
              val, grad, hess)
{
    real vector  xb, mu

    xb  = X*b'
    mu  = exp(xb)
    val = (-mu + y:*xb - lnfactorial(y))

    if (todo>=1) {
        grad = (y - mu):*X
    }
    if (todo==2) {
        hess = -quadcross(X, mu, X)
    }
}

end

Line 44 now specifies option depvar() on ereturn post to store the name of the dependent variable in e(depvar). Line 51 is new; it stores mypoisson6_estat in e(estat_cmd) so that estat can call the command-specific estat command mypoisson6_estat listed in code block 2.

Code block 2: mypoisson6_estat.ado

*! version 1.0.0  18Oct2016
program mypoisson6_estat, rclass
    version 14.2

    if "`e(cmd)'" != "mypoisson6" {
        error 301
    }

    gettoken subcmd rest : 0, parse(" ,")
    if "`subcmd'"=="mylinktest" {
        tempname eresults
        tempvar  xb xb2

        local depvar = e(depvar)
        predict double `xb' , xb
        generate double `xb2' = `xb'^2
        nobreak {
            _estimates hold `eresults'
            mylinktestwork `depvar' `xb' `xb2'
            local chi2 = e(lt_chi2)
            local df   = e(lt_df)
            local p    = e(lt_p)
            _estimates unhold `eresults'
        }
        return clear
        return scalar chi2 = `chi2'
        return scalar df   = `df'
        return scalar p    = `p'
    }
    else {
        estat_default `0'
        return add
    }
end

program mylinktestwork, eclass

    syntax varlist(min=3 max=3)
    tempvar  b V

    quietly mypoisson6 `varlist'
    matrix `b' = e(b)
    matrix colnames `b' =  _hat _hatsq _cons
    ereturn repost  b = `b' , rename
    ereturn display
    quietly test _hatsq
    local chi2 = r(chi2)
    local df   = r(df)
    local p    = r(p)
    ereturn scalar lt_chi2 = `chi2'
    ereturn scalar lt_df   = `df'
    ereturn scalar lt_p    = `p'
end

Lines 5–7 ensure that estat can only call mypoisson6_estat when the active e() results were produced by mypoisson6.

estat passes everything after the name “estat” to the command-specific estat command. For example, if I type

estat mylinktest

estat will pass “mylinktest” to mypoisson6_estat.

If I type

estat summarize, noheader

estat will pass “summarize, noheader” to mypoisson6_estat. Line 9 uses gettoken to parse what estat passes; it will put the name of the estat subcommand in the local macro subcmd and everything that follows in the local macro rest.

Line 10 specifies that lines 11–28 will be executed if the user types estat mylinktest and that lines 31–32 will otherwise be executed. Lines 11–28 implement the link test. Lines 31–32 use estat_default to produce the results for the not-command-specific estat commands, like estat summarize.

Lines 11–16 put the required predictions into the temporary variables xb and xb2. Line 17 ensures that lines 18–23 will be executed even if the user presses break while these lines are being executed. Line 18 stores the mypoisson6 e() results in memory, and line 23 puts them back in e(), overwriting the e() results used by mylinktestwork to compute and report the link test. Putting lines 18–23 in a nobreak block ensures that the original mypoisson6 e()
results are left active, even though mylinktestwork produces its own e() results.

Lines 36–53 implement the work routine mylinktestwork. Line 43 puts the point estimates computed by running mypoisson6 on the dependent variable, the linear prediction, and its square in the Stata matrix stored in the local macro b. (For simplicity, let me call this vector b.) At this point, the column names of b are the names of the temporary variables contained in the local macros `xb’ and `xb2′. These temporary names are not informative. Line 43 puts informative column names names on b. Line 44 uses ereturn repost with option rename to store this new b in e(b) and to rename the row and column names of e(V) to be the column names in b. Line 45 displays the nicely named link-test results. Lines 46–52 compute and store the link-test results that are retrieved and subsequently stored in r() by lines 20–22 and lines 25–28, respectively.

Line 15 uses predict, which is now handled by mypoisson6_p.ado.

Code block 3: mypoisson6_p.ado

*! version 1.0.0  18Oct2016
program define mypoisson6_p
    version 14.2

    syntax newvarname [if] [in] , [ xb n ]

    marksample touse, novarlist

    local nopts : word count `xb' `n'
    if `nopts' >1 {
        display "{err}only one statistic may be specified"
        exit 498
    }

    if `nopts' == 0 {
        local n n
        display "expected counts"
    }

    if "`xb'" != "" {
        _predict `typlist' `varlist' if `touse' , xb
    }
    else {
        tempvar xbv
        quietly _predict double `xbv' if `touse' , xb
        generate `typlist' `varlist' = exp(`xbv') if `touse'
    }
end

Example 7 illustrates the process and shows the output produced.

Example 7: estat mylinktest

. mypoisson6 accidents cvalue tickets traffic
Iteration 0:   f(p) = -838.34841
Iteration 1:   f(p) =   -419.976
Iteration 2:   f(p) = -145.89693
Iteration 3:   f(p) = -121.07379
Iteration 4:   f(p) = -118.12037
Iteration 5:   f(p) = -118.11672
Iteration 6:   f(p) = -118.11672
------------------------------------------------------------------------------
   accidents |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      cvalue |   .1936988   .1054813     1.84   0.066    -.0130408    .4004385
     tickets |   1.528282   .0741663    20.61   0.000     1.382919    1.673646
     traffic |   .0822459   .0259017     3.18   0.001     .0314795    .1330123
       _cons |  -8.838649   .5160757   -17.13   0.000    -9.850139    -7.82716
------------------------------------------------------------------------------

. estat mylinktest
------------------------------------------------------------------------------
   accidents |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        _hat |   1.269188   .1134938    11.18   0.000     1.046744    1.491632
      _hatsq |  -.1135629    .036681    -3.10   0.002    -.1854563   -.0416695
       _cons |   .0873643   .1141074     0.77   0.444    -.1362821    .3110108
------------------------------------------------------------------------------

Example 8 illustrates that mypoisson6_estat enables estat summarize to work.

Example 8: estat summarize

. estat summarize

  Estimation sample mypoisson6             Number of obs =        505

  -------------------------------------------------------------------
      Variable |         Mean      Std. Dev.         Min          Max
  -------------+-----------------------------------------------------
     accidents |     .4712871      2.172992            0           20
        cvalue |     2.467327      1.101686            1            4
       tickets |     1.708911      2.022792            0            7
       traffic |     6.776931      2.390108     .3951611     9.998234
  -------------------------------------------------------------------

Done and undone

I illustrated how estat commands work and how to write a command-specific estat command for the mypoisson command that I have been developing.