Home > Programming > Programming an estimation command in Stata: Using Stata matrix commands and functions to compute OLS objects

## Programming an estimation command in Stata: Using Stata matrix commands and functions to compute OLS objects

$$\newcommand{\epsilonb}{\boldsymbol{\epsilon}} \newcommand{\ebi}{\boldsymbol{\epsilon}_i} \newcommand{\Sigmab}{\boldsymbol{\Sigma}} \newcommand{\betab}{\boldsymbol{\beta}} \newcommand{\eb}{{\bf e}} \newcommand{\xb}{{\bf x}} \newcommand{\zb}{{\bf z}} \newcommand{\yb}{{\bf y}} \newcommand{\Xb}{{\bf X}} \newcommand{\Mb}{{\bf M}} \newcommand{\Eb}{{\bf E}} \newcommand{\Xtb}{\tilde{\bf X}} \newcommand{\Vb}{{\bf V}}$$I present the formulas for computing the ordinary least-squares (OLS) estimator, and I discuss some do-file implementations of them. I discuss the formulas and the computation of independence-based standard errors, robust standard errors, and cluster-robust standard errors. I introduce the Stata matrix commands and matrix functions that I use in ado-commands that I discuss in upcoming posts.

This is the fifth post in the series Programming an estimation command in Stata. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.

OLS formulas

Recall that the OLS point estimates are given by

$\widehat{\betab} = \left( \sum_{i=1}^N \xb_i’\xb_i \right)^{-1} \left( \sum_{i=1}^N \xb_i’y_i \right)$

where $${\bf x}_i$$ is the $$1\times k$$ vector of independent variables, $$y_i$$ is the dependent variable for each of the $$N$$ sample observations, and the model for $$y_i$$ is

$y_i = \xb_i\betab’ + \epsilon_i$

If the $$\epsilon_i$$ are independently and identically distributed, we estimate the variance-covariance matrix of the estimator (VCE) by

$\widehat{\Vb} = \widehat{s} \left( \sum_{i=1}^N \xb_i’\xb_i \right)^{-1}$

where $$\widehat{s} = 1/(N-k)\sum_{i=1}^N e_i^2$$ and $$e_i=y_i-{\bf x}_i\widehat{{\boldsymbol \beta}}$$.

See Cameron and Trivedi (2005), Stock and Watson (2010), or Wooldridge (2015) for introductions to OLS.

Stata matrix implementation

I use the matrix accum command to compute the sum of the products over the observations. Typing

.  matrix accum zpz = z1 z2 z3

puts $$\left( \sum_{i=1}^N {\bf z}_i'{\bf z}_i \right)$$ into the Stata matrix zpz, where $${\bf z}_i=( {\tt z1}_i, {\tt z2}_i, {\tt z3}_i, 1)$$. The $$1$$ appears because matrix accum has included the constant term by default, like almost all estimation commands.

Below, I use matrix accum to compute $$\left( \sum_{i=1}^N {\bf z}_i'{\bf z}_i \right)$$, which contains $$\left( \sum_{i=1}^N {\bf x}_i'{\bf x}_i \right)$$ and $$\left( \sum_{i=1}^N {\bf x}_i’y_i \right)$$.

Example 1: Using matrix accum

. sysuse auto
(1978 Automobile Data)

. matrix accum zpz = price mpg trunk
(obs=74)

. matrix list zpz

symmetric zpz[4,4]
price        mpg      trunk      _cons
price  3.448e+09
mpg    9132716      36008
trunk    6565725      20630      15340
_cons     456229       1576       1018         74


Now, I extract $$\left( \sum_{i=1}^N {\bf x}_i'{\bf x}_i \right)$$ from rows 2–4 and columns 2–4 of zpz and $$\left( \sum_{i=1}^N {\bf x}_i’y_i \right)$$ from rows 2–4 and column 1 of zpz.

Example 2: Extracting submatrices

. matrix xpx       = zpz[2..4, 2..4]

. matrix xpy       = zpz[2..4, 1]

. matrix list xpx

symmetric xpx[3,3]
mpg  trunk  _cons
mpg  36008
trunk  20630  15340
_cons   1576   1018     74

. matrix list xpy

xpy[3,1]
price
mpg  9132716
trunk  6565725
_cons   456229


I now compute $$\widehat{{\boldsymbol \beta}}$$ from the matrices formed in example 2.

Example 3: Computing $$\widehat{\betab}$$

. matrix xpxi      = syminv(xpx)

. matrix b         = xpxi*xpy

. matrix list b

b[3,1]
price
mpg  -220.16488
trunk    43.55851
_cons    10254.95

. matrix b         = b'

. matrix list b

b[1,3]
mpg       trunk       _cons
price  -220.16488    43.55851    10254.95


I transposed b to make it a row vector because point estimates in Stata are stored as row vectors.

Example 3 illustrates that the Stata matrix b contains the estimated coefficients and the names of the variables on which these values are estimated coefficients. To clarify, our model is
$\Eb[{\tt price}|{\tt mpg}, {\tt trunk} ] = {\tt mpg}*\beta_{\tt mpg} + {\tt trunk}*\beta_{\tt trunk} + {\tt \_cons}$

and b contains the information that $$-220.16$$ is the estimated coefficient on mpg, that $$43.56$$ is the estimated coefficient on trunk, and that $$10254.95$$ is the estimated constant. We can compute the linear combination $$\xb_i\widehat{\betab}$$ over the observations using the information in b, because b contains both the value and the name for each coefficient.

I use matrix score to compute this linear combination for each observation, and I use generate to reiterate what this linear combination is.

Example 4: Using matrix score to compute $$\xb_i\widehat{\betab}’$$

. matrix score double xbhat1 = b

. generate     double xbhat2 = mpg*(-220.16488) + trunk*(43.55851) + 10254.95

. list xbhat1 xbhat2 in 1/4

+-----------------------+
|    xbhat1      xbhat2 |
|-----------------------|
1. | 5890.4661   5890.4663 |
2. | 6991.2905   6991.2907 |
3. | 5934.0246   5934.0248 |
4. | 6548.5884   6548.5886 |
+-----------------------+


I use the predictions for $$\Eb[{\tt price}|{\tt mpg}, {\tt trunk} ]$$ in xbhat1 to compute the residuals and the estimated VCE.

Example 5: Computing the estimated VCE

. generate double res       = (price - xbhat1)

. generate double res2      = res^2

. summarize res2

Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
res2 |         74     6674851    1.30e+07   11.24372   9.43e+07

. return list

scalars:
r(N) =  74
r(sum_w) =  74
r(mean) =  6674850.504745401
r(Var) =  168983977867533.1
r(sd) =  12999383.74952956
r(min) =  11.24371634723049
r(max) =  94250157.2111593
r(sum) =  493938937.3511598

. local N                   = r(N)

. local sum                 = r(sum)

. local s2                  = sum'/(N'-3)

. matrix V                  = (s2')*xpxi


(See Programing an estimation command in Stata: Where to store your stuff for discussions of using results from r-class commands and using local macros.)

I verify that my computations for $$\widehat{\betab}$$ and the VCE match those of regress.

Example 6: Comparing against regress

. regress price mpg trunk

Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(2, 71)        =     10.14
Model |   141126459         2  70563229.4   Prob > F        =    0.0001
Residual |   493938937        71  6956886.44   R-squared       =    0.2222
Total |   635065396        73  8699525.97   Root MSE        =    2637.6

------------------------------------------------------------------------------
price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
mpg |  -220.1649   65.59262    -3.36   0.001    -350.9529    -89.3769
trunk |   43.55851   88.71884     0.49   0.625    -133.3418    220.4589
_cons |   10254.95   2349.084     4.37   0.000      5571.01    14938.89
------------------------------------------------------------------------------

. matrix list e(b)

e(b)[1,3]
mpg       trunk       _cons
y1  -220.16488    43.55851    10254.95

. matrix list b

b[1,3]
mpg       trunk       _cons
price  -220.16488    43.55851    10254.95

. matrix list e(V)

symmetric e(V)[3,3]
mpg       trunk       _cons
mpg   4302.3924
trunk   3384.4186   7871.0326
_cons  -138187.95  -180358.85   5518194.7

. matrix list V

symmetric V[3,3]
mpg       trunk       _cons
mpg   4302.3924
trunk   3384.4186   7871.0326
_cons  -138187.95  -180358.85   5518194.7


Robust standard errors

The frequently used robust estimator of the VCE is given by

$\widehat{V}_{robust}=\frac{N}{N-k} \left( \sum_{i=1}^N \xb_i’\xb_i \right)^{-1} \Mb \left( \sum_{i=1}^N \xb_i’\xb_i \right)^{-1}$

where

$\Mb=\sum_{i=1}^N \widehat{e}_i^2\xb_i’\xb_i$

See Cameron and Trivedi (2005), Stock and Watson (2010), or Wooldridge (2015) for derivations and discussions.

matrix accum with weights $$\widehat{e}_i^2$$ computes the formula for $$\Mb$$. Below, I use matrix accum to compute $$\Mb$$ and $$\widehat{V}_{robust}$$

Example 7: A robust VCE

. matrix accum M    = mpg trunk [iweight=res2]
(obs=493938937.4)

. matrix V2         = (N'/(N'-3))*xpxi*M*xpxi


I now verify that my computations match those reported by regress.

Example 8: Comparing computations of robust VCE

. regress price mpg trunk, robust

Linear regression                               Number of obs     =         74
F(2, 71)          =      11.59
Prob > F          =     0.0000
R-squared         =     0.2222
Root MSE          =     2637.6

------------------------------------------------------------------------------
|               Robust
price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
mpg |  -220.1649   72.45388    -3.04   0.003    -364.6338   -75.69595
trunk |   43.55851    71.4537     0.61   0.544    -98.91613    186.0331
_cons |   10254.95   2430.641     4.22   0.000      5408.39    15101.51
------------------------------------------------------------------------------

. matrix list e(V)

symmetric e(V)[3,3]
mpg       trunk       _cons
mpg   5249.5646
trunk   3569.5316   5105.6316
_cons  -169049.76  -147284.49   5908013.8

. matrix list V2

symmetric V2[3,3]
mpg       trunk       _cons
mpg   5249.5646
trunk   3569.5316   5105.6316
_cons  -169049.76  -147284.49   5908013.8


Cluster-robust standard errors

The cluster-robust estimator of the VCE is frequently used when the data have a panel structure, also known as a longitudinal structure. This VCE accounts for the within-group correlation of the errors, and it is given by

$\widehat{V}_{cluster}=\frac{N-1}{N-k}\frac{g}{g-1} \left( \sum_{i=1}^N \xb_i’\xb_i \right)^{-1} \Mb_c \left( \sum_{i=1}^N \xb_i’\xb_i \right)^{-1}$

where

$\Mb_c=\sum_{j=1}^g \Xb_j’ (\widehat{\eb}_j \widehat{\eb}_j’) \Xb_j$

$$\Xb_j$$ is the $$n_j\times k$$ matrix of observations on $$\xb_i$$ in group $$j$$, $$\widehat{\eb}_j$$ is the $$n_j\times 1$$ vector of residuals in group $$j$$, and $$g$$ is the number of groups. See Cameron and Trivedi (2005), Wooldridge (2010), and [R] regress for derivations and discussions.

matrix opaccum computes the formula for $$\Mb_c$$. Below, I create the group variable cvar from rep78 and use matrix opaccum to compute $$\Mb_c$$ and $$\widehat{V}_{cluster}$$

Example 9: A cluster-robust VCE

. generate cvar = cond( missing(rep78), 6, rep78)

. tab cvar

cvar |      Freq.     Percent        Cum.
------------+-----------------------------------
1 |          2        2.70        2.70
2 |          8       10.81       13.51
3 |         30       40.54       54.05
4 |         18       24.32       78.38
5 |         11       14.86       93.24
6 |          5        6.76      100.00
------------+-----------------------------------
Total |         74      100.00

. local Nc = r(r)

. sort cvar

. matrix opaccum M2     = mpg trunk , group(cvar) opvar(res)

. matrix V2          = ((N'-1)/(N'-3))*(Nc'/(Nc'-1))*xpxi*M2*xpxi


I now verify that my computations match those reported by regress.

Example 10: Comparing computations of cluster-robust VCE

. regress price mpg trunk, vce(cluster cvar)

Linear regression                               Number of obs     =         74
F(2, 5)           =       9.54
Prob > F          =     0.0196
R-squared         =     0.2222
Root MSE          =     2637.6

(Std. Err. adjusted for 6 clusters in cvar)
------------------------------------------------------------------------------
|               Robust
price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
mpg |  -220.1649   93.28127    -2.36   0.065     -459.952    19.62226
trunk |   43.55851   58.89644     0.74   0.493    -107.8396    194.9566
_cons |   10254.95   2448.547     4.19   0.009     3960.758    16549.14
------------------------------------------------------------------------------

. matrix list e(V)

symmetric e(V)[3,3]
mpg       trunk       _cons
mpg   8701.3957
trunk   4053.5381   3468.7911
_cons     -223021  -124190.97   5995384.3

. matrix list V2

symmetric V2[3,3]
mpg       trunk       _cons
mpg   8701.3957
trunk   4053.5381   3468.7911
_cons     -223021  -124190.97   5995384.3
`

Done and undone

I reviewed the formulas that underlie the OLS estimator and showed how to compute them using Stata matrix commands and functions. In the next two posts, I write an ado-command that implements these formulas.

References

Cameron, A. C., and P. K. Trivedi. 2005. Microeconometrics: Methods and applications. Cambridge: Cambridge University Press.

Stock, J. H., and M. W. Watson. 2010. Introduction to Econometrics. 3rd ed. Boston, MA: Addison Wesley New York.

Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data. 2nd ed. Cambridge, Massachusetts: MIT Press.

Wooldridge, J. M. 2015. Introductory Econometrics: A Modern Approach. 6th ed. Cincinnati, Ohio: South-Western.

Categories: Programming Tags: