Using gmm to solve two-step estimation problems

Two-step estimation problems can be solved using the gmm command.

When a two-step estimator produces consistent point estimates but inconsistent standard errors, it is known as the two-step-estimation problem. For instance, inverse-probability weighted (IPW) estimators are a weighted average in which the weights are estimated in the first step. Two-step estimators use first-step estimates to estimate the parameters of interest in a second step. The two-step-estimation problem arises because the second step ignores the estimation error in the first step.

One solution is to convert the two-step estimator into a one-step estimator. My favorite way to do this conversion is to stack the equations solved by each of the two estimators and solve them jointly. This one-step approach produces consistent point estimates and consistent standard errors. There is no two-step problem because all the computations are performed jointly. Newey (1984) derives and justifies this approach.

I’m going to illustrate this approach with the IPW example, but it can be used with any two-step problem as long as each step is continuous.

IPW estimators are frequently used to estimate the mean that would be observed if everyone in a population received a specified treatment, a quantity known as a potential-outcome mean (POM). A difference of POMs is called the average treatment effect (ATE). Aside from all that, it is the mechanics of the two-step IPW estimator that interest me here. IPW estimators are weighted averages of the outcome, and the weights are estimated in a first step. The weights used in the second step are the inverse of the estimated probability of treatment.

Let’s imagine we are analyzing an extract of the birthweight data used by Cattaneo (2010). In this dataset, bweight is the baby’s weight at birth, mbsmoke is 1 if the mother smoked while pregnant (and 0 otherwise), mmarried is 1 if the mother is married, and prenatal1 is 1 if the mother had a prenatal visit in the first trimester.

Let’s imagine we want to estimate the mean when all pregnant women smoked, which is to say, the POM for smoking. If we were doing substantive research, we would also estimate the POM when no pregnant women smoked. The difference between these estimated POMs would then estimate the ATE of smoking.

In the IPW estimator, we begin by estimating the probability weights for smoking. We fit a probit model of mbsmoke as a function of mmarried and prenatal1.

. use cattaneo2
(Excerpt from Cattaneo (2010) Journal of Econometrics 155: 138-154)

. probit mbsmoke mmarried prenatal1, vce(robust)

Iteration 0:   log pseudolikelihood = -2230.7484
Iteration 1:   log pseudolikelihood = -2102.6994
Iteration 2:   log pseudolikelihood = -2102.1437
Iteration 3:   log pseudolikelihood = -2102.1436

Probit regression                                 Number of obs   =       4642
                                                  Wald chi2(2)    =     259.42
                                                  Prob > chi2     =     0.0000
Log pseudolikelihood = -2102.1436                 Pseudo R2       =     0.0577

             |               Robust
     mbsmoke |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    mmarried |  -.6365472   .0478037   -13.32   0.000    -.7302407   -.5428537
   prenatal1 |  -.2144569   .0547583    -3.92   0.000    -.3217811   -.1071327
       _cons |  -.3226297   .0471906    -6.84   0.000    -.4151215   -.2301379

The results indicate that both mmarried and prenatal1 significantly predict whether the mother smoked while pregnant.

We want to calculate the inverse probabilities. We begin by getting the probabilities:

. predict double pr, pr

Now, we can obtain the inverse probabilities by typing

. generate double ipw = (mbsmoke==1)/pr

We can now perform the second step: calculate the mean for smokers by using the IPWs.

. mean bweight [pw=ipw]

Mean estimation                     Number of obs    =     864

             |       Mean   Std. Err.     [95% Conf. Interval]
     bweight |   3162.868   21.71397      3120.249    3205.486
. mean bweight [pw=ipw] if mbsmoke

The point estimate reported by mean is consistent; the reported standard error Read more…

Putting the Stata Manuals on your iPad

28 October 2014

You can install the Stata manuals on your iPad. Here’s how: install GoodReader and copy the manuals from your computer to your iPad. It takes a few minutes and will cost you about $7 to purchase the app.

Once installed, launch GoodReader, press the bookmark icon at the bottom of the screen, and GoodReader shows you the list of the manuals.


Well, that’s only a partial list. We’d have to scroll to see them all.

If you tap on a manual, it opens,


You can swipe to go forward,

All the links are live. If you tap on graph intro, the reader jumps to the manual entry,


Here are some formulas:


To illustrate formulas, I jumped to mi estimate in the [MI] manual. I can jump anywhere because I have all 21 manuals—all 11,000-plus pages—installed on my iPad.

You can have them installed on your iPad, too.

Here’s how.

Step 1. Install GoodReader on your iPad

You must purchase GoodReader 4 from the App Store. No other PDF reader will do. What makes GoodReader a good reader for the Stata manuals is that it can handle links across manuals. As of Read more…

Using gsem to combine estimation results

gsem is a very flexible command that allows us to fit very sophisticated models. However, it is also useful in situations that involve simple models.

For example, when we want to compare parameters among two or more models, we usually use suest, which combines the estimation results under one parameter vector and creates a simultaneous covariance matrix of the robust type. This covariance estimate is described in the Methods and formulas of [R] suest as the robust variance from a “stacked model”. Actually, gsem can estimate these kinds of “stacked models”, even if the estimation samples are not the same and eventually overlap. By using the option vce(robust), we can replicate the results from suest if the models are available for gsem. In addition, gsem allows us to combine results from some estimation commands that are not supported by suest, like models including random effects.


Example: Comparing parameters from two models


Let’s consider the childweight dataset, described in [ME] mixed. Consider the following models, where weights of boys and girls are modeled using the age and the age-squared:

. webuse childweight, clear
(Weight data on Asian children)

. regress  weight age c.age#c.age if girl == 0, noheader
      weight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
         age |   7.985022   .6343855    12.59   0.000     6.725942    9.244101
 c.age#c.age |   -1.74346   .2374504    -7.34   0.000    -2.214733   -1.272187
       _cons |   3.684363   .3217223    11.45   0.000     3.045833    4.322893

. regress  weight age c.age#c.age if girl == 1, noheader
      weight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
         age |   7.008066   .5164687    13.57   0.000     5.982746    8.033386
 c.age#c.age |  -1.450582   .1930318    -7.51   0.000    -1.833798   -1.067365
       _cons |   3.480933   .2616616    13.30   0.000     2.961469    4.000397

To test whether birthweights are the same for the two groups, we need to test whether the intercepts in the two regressions are the same. Using suest, we would proceed as follows:

. quietly regress weight age c.age#c.age if girl == 0, noheader

. estimates store boys

. quietly regress weight age c.age#c.age if girl == 1, noheader

. estimates store girls

. suest boys girls

Simultaneous results for boys, girls

                                                  Number of obs   =        198

             |               Robust
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
boys_mean    |
         age |   7.985022   .4678417    17.07   0.000     7.068069    8.901975
 c.age#c.age |   -1.74346   .2034352    -8.57   0.000    -2.142186   -1.344734
       _cons |   3.684363   .1719028    21.43   0.000      3.34744    4.021286
boys_lnvar   |
       _cons |   .4770289   .1870822     2.55   0.011     .1103546    .8437032
girls_mean   |
         age |   7.008066   .4166916    16.82   0.000     6.191365    7.824766
 c.age#c.age |  -1.450582   .1695722    -8.55   0.000    -1.782937   -1.118226
       _cons |   3.480933   .1556014    22.37   0.000      3.17596    3.785906
girls_lnvar  |
       _cons |   .0097127   .1351769     0.07   0.943    -.2552292    .2746545

Invoking an estimation command with the option coeflegend will give us a legend we can use to refer to the parameters when we use postestimation commands like test.

. suest, coeflegend

Simultaneous results for boys, girls

                                                  Number of obs   =        198

             |      Coef.  Legend
boys_mean    |
         age |   7.985022  _b[boys_mean:age]
 c.age#c.age |   -1.74346  _b[boys_mean:c.age#c.age]
       _cons |   3.684363  _b[boys_mean:_cons]
boys_lnvar   |
       _cons |   .4770289  _b[boys_lnvar:_cons]
girls_mean   |
         age |   7.008066  _b[girls_mean:age]
 c.age#c.age |  -1.450582  _b[girls_mean:c.age#c.age]
       _cons |   3.480933  _b[girls_mean:_cons]
girls_lnvar  |
       _cons |   .0097127  _b[girls_lnvar:_cons]

. test  _b[boys_mean:_cons] = _b[girls_mean:_cons]

 ( 1)  [boys_mean]_cons - [girls_mean]_cons = 0

           chi2(  1) =    0.77
         Prob > chi2 =    0.3803

We find no evidence that the intercepts are different.

Now, let’s replicate those results Read more…

Categories: Statistics Tags: , , ,

How to simulate multilevel/longitudinal data

I was recently talking with my friend Rebecca about simulating multilevel data, and she asked me if I would show her some examples. It occurred to me that many of you might also like to see some examples, so I decided to post them to the Stata Blog.




We simulate data all the time at StataCorp and for a variety of reasons.

One reason is that real datasets that include the features we would like are often difficult to find. We prefer to use real datasets in the manual examples, but sometimes that isn’t feasible and so we create simulated datasets.

We also simulate data to check the coverage probabilities of new estimators in Stata. Sometimes the formulae published in books and papers contain typographical errors. Sometimes the asymptotic properties of estimators don’t hold under certain conditions. And every once in a while, we make coding mistakes. We run simulations during development to verify that a 95% confidence interval really is a 95% confidence interval.

Simulated data can also come in handy for presentations, teaching purposes, and calculating statistical power using simulations for complex study designs.

And, simulating data is just plain fun once you get the hang of it.

Some of you will recall Vince Wiggins’s blog entry from 2011 entitled “Multilevel random effects in xtmixed and sem — the long and wide of it” in which he simulated a three-level dataset. I’m going to elaborate on how Vince simulated multilevel data, and then I’ll show you some useful variations. Specifically, I’m going to talk about:

  1. How to simulate single-level data
  2. How to simulate two- and three-level data
  3. How to simulate three-level data with covariates
  4. How to simulate longitudinal data with random slopes
  5. How to simulate longitudinal data with structured errors


How to simulate single-level data


Let’s begin by simulating a trivially simple, single-level dataset that has the form

\[y_i = 70 + e_i\]

We will assume that e is normally distributed with mean zero and variance \(\sigma^2\).

We’d want to simulate 500 observations, so let’s begin by clearing Stata’s memory and setting the number of observations to 500.

. clear 
. set obs 500

Next, let’s create a variable named e that contains pseudorandom normally distributed data with mean zero and standard deviation 5:

. generate e = rnormal(0,5)

The variable e is our error term, so we can create an outcome variable y by typing

. generate y = 70 + e

. list y e in 1/5

     |        y           e |
  1. | 78.83927     8.83927 |
  2. | 69.97774   -.0222647 |
  3. | 69.80065   -.1993514 |
  4. | 68.11398    -1.88602 |
  5. | 63.08952   -6.910483 |

We can fit a linear regression for the variable y to determine whether our parameter estimates are reasonably close to the parameters we specified when we simulated our dataset:

. regress y

      Source |       SS       df       MS              Number of obs =     500
-------------+------------------------------           F(  0,   499) =    0.00
       Model |           0     0           .           Prob > F      =       .
    Residual |  12188.8118   499  24.4264766           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared =  0.0000
       Total |  12188.8118   499  24.4264766           Root MSE      =  4.9423

           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
       _cons |   69.89768    .221027   316.24   0.000     69.46342    70.33194

The estimate of _cons is 69.9, which is very close to 70, and the Root MSE of 4.9 is equally close to the error’s standard deviation of 5. The parameter estimates will not be exactly equal to the underlying parameters we specified when we created the data because we introduced randomness with the rnormal() function.

This simple example is just to get us started before we work with multilevel data. For familiarity, let’s fit the same model with the mixed command that we will be using later:

. mixed y, stddev

Mixed-effects ML regression                     Number of obs      =       500

                                                Wald chi2(0)       =         .
Log likelihood = -1507.8857                     Prob > chi2        =         .

           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
       _cons |   69.89768   .2208059   316.56   0.000     69.46491    70.33045

  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
                sd(Residual) |    4.93737   .1561334      4.640645    5.253068

The output is organized with the parameter estimates for the fixed part in the top table and the estimated standard deviations for the random effects in the bottom table. Just as previously, the estimate of _cons is 69.9, and the estimate of the standard deviation of the residuals is 4.9.

Okay. That really was trivial, wasn’t it? Simulating two- and three-level data is almost as easy.


How to simulate two- and three-level data


I posted a blog entry last year titled “Multilevel linear models in Stata, part 1: Components of variance“. In that posting, I showed a diagram for a residual of a three-level model.

The equation for the variance-components model I fit had the form

\[y_{ijk} = mu + u_i.. + u_{ij.} + e_{ijk}\]

This model had three residuals, whereas the one-level model we just fit above had only one.

This time, let’s start with a two-level model. Let’s simulate Read more…

Using resampling methods to detect influential points

As stated in the documentation for jackknife, an often forgotten utility for this command is the detection of overly influential observations.

Some commands, like logit or stcox, come with their own set of prediction tools to detect influential points. However, these kinds of predictions can be computed for virtually any regression command. In particular, we will see that the dfbeta statistics can be easily computed for any command that accepts the jackknife prefix. dfbeta statistics allow us to visualize how influential some observations are compared with the rest, concerning a specific parameter.

We will also compute Cook’s likelihood displacement, which is an overall measure of influence, and it can also be compared with a specific threshold.


Using jackknife to compute dfbeta


The main task of jackknife is to fit the model while suppressing one observation at a time, which allows us to see how much results change when each observation is suppressed; in other words, it allows us to see how much each observation influences the results. A very intuitive measure of influence is dfbeta, which is the amount that a particular parameter changes when an observation is suppressed. There will be one dfbeta variable for each parameter. If \(\hat\beta\) is the estimate for parameter \(\beta\) obtained from the full data and \( \hat\beta_{(i)} \) is the corresponding estimate obtained when the \(i\)th observation is suppressed, then the \(i\)th element of variable dfbeta is obtained as

\[dfbeta = \hat\beta – \hat\beta_{(i)}\]

Parameters \(\hat\beta\) are saved by the estimation commands in matrix e(b) and also can be obtained using the _b notation, as we will show below. The leave-one-out values \(\hat\beta_{(i)}\) can be saved in a new file by using the option saving() with jackknife. With these two elements, we can compute the dfbeta values for each variable.

Let’s see an example with the probit command.

. sysuse auto, clear
(1978 Automobile Data)

. *preserve original dataset
. preserve

. *generate a variable with the original observation number
. gen obs =_n

. probit foreign mpg weight

Iteration 0:   log likelihood =  -45.03321
Iteration 1:   log likelihood = -27.914626
Iteration 2:   log likelihood = -26.858074
Iteration 3:   log likelihood = -26.844197
Iteration 4:   log likelihood = -26.844189
Iteration 5:   log likelihood = -26.844189

Probit regression                                 Number of obs   =         74
                                                  LR chi2(2)      =      36.38
                                                  Prob > chi2     =     0.0000
Log likelihood = -26.844189                       Pseudo R2       =     0.4039

     foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
         mpg |  -.1039503   .0515689    -2.02   0.044    -.2050235   -.0028772
      weight |  -.0023355   .0005661    -4.13   0.000     -.003445   -.0012261
       _cons |   8.275464   2.554142     3.24   0.001     3.269437    13.28149

. *keep the estimation sample so each observation will be matched
. *with the corresponding replication
. keep if e(sample)
(0 observations deleted)

. *use jackknife to generate the replications, and save the values in
. *file b_replic
. jackknife, saving(b_replic, replace):  probit foreign mpg weight
(running probit on estimation sample)

Jackknife replications (74)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50

Probit regression                               Number of obs      =        74
                                                Replications       =        74
                                                F(   2,     73)    =     10.36
                                                Prob > F           =    0.0001
Log likelihood = -26.844189                     Pseudo R2          =    0.4039

             |              Jackknife
     foreign |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
         mpg |  -.1039503   .0831194    -1.25   0.215     -.269607    .0617063
      weight |  -.0023355   .0006619    -3.53   0.001    -.0036547   -.0010164
       _cons |   8.275464   3.506085     2.36   0.021     1.287847    15.26308

. *verify that all the replications were successful
. assert e(N_misreps) ==0

. merge 1:1 _n using b_replic

    Result                           # of obs.
    not matched                             0
    matched                                74  (_merge==3)

. *see how values from replications are stored
. describe, fullnames

Contains data from .../auto.dta
  obs:            74                          1978 Automobile Data
 vars:            17                          13 Apr 2013 17:45
 size:         4,440                          (_dta has notes)
              storage   display    value
variable name   type    format     label      variable label
make            str18   %-18s                 Make and Model
price           int     %8.0gc                Price
mpg             int     %8.0g                 Mileage (mpg)
rep78           int     %8.0g                 Repair Record 1978
headroom        float   %6.1f                 Headroom (in.)
trunk           int     %8.0g                 Trunk space (cu. ft.)
weight          int     %8.0gc                Weight (lbs.)
length          int     %8.0g                 Length (in.)
turn            int     %8.0g                 Turn Circle (ft.)
displacement    int     %8.0g                 Displacement (cu. in.)
gear_ratio      float   %6.2f                 Gear Ratio
foreign         byte    %8.0g      origin     Car type
obs             float   %9.0g
foreign_b_mpg   float   %9.0g                 [foreign]_b[mpg]
                float   %9.0g                 [foreign]_b[weight]
foreign_b_cons  float   %9.0g                 [foreign]_b[_cons]
_merge          byte    %23.0g     _merge
Sorted by:
     Note:  dataset has changed since last saved

. *compute the dfbeta for each covariate
. foreach var in mpg weight {
  2.  gen dfbeta_`var' = (_b[`var'] -foreign_b_`var')
  3. }

. gen dfbeta_cons = (_b[_cons] - foreign_b_cons)

. label var obs "observation number"
. label var dfbeta_mpg "dfbeta for mpg"
. label var dfbeta_weight "dfbeta for weight"
. label var dfbeta_cons "dfbeta for the constant"

. *plot dfbeta values for variable mpg
. scatter dfbeta_mpg obs, mlabel(obs) title("dfbeta values for variable mpg")

. *restore original dataset
. restore


Based on the impact on the Read more…

How to create animated graphics using Stata


Today I want to show you how to create animated graphics using Stata. It’s easier than you might expect and you can use animated graphics to illustrate concepts that would be challenging to illustrate with static graphs. In addition to Stata, you will need a video editing program but don’t be concerned if you don’t have one. At the 2012 UK Stata User Group Meeting Robert Grant demonstrated how to create animated graphics from within Stata using a free software program called FFmpeg. I will show you how I create my animated graphs using Camtasia and how Robert creates his using FFmpeg.

I recently recorded a video for the Stata Youtube channel called “Power and sample size calculations in Stata: A conceptual introduction“. I wanted to illustrate two concepts: (1) that statistcal power increases as sample size increases, and (2) as effect size increases. Both of these concepts can be illustrated with a static graph along with the explanation “imagine that …”. Creating animated graphs allowed me to skip the explanation and just show what I meant.



Creating the graphs

Videos are illusions. All videos — from Charles-Émile Reynaud’s 1877 praxinoscope to modern blu-ray movies — are created by displaying a series of ordered still images for a fraction of a second each. Our brains perceive this series of still images as motion.

To create the illusion of motion with graphs, Read more…

Categories: Graphics Tags: , ,

Retaining an Excel cell’s format when using putexcel

In a previous blog entry, I talked about the new Stata 13 command putexcel and how we could use putexcel with a Stata command’s stored results to create tables in an Excel file.

After the entry was posted, a few users pointed out two features they wanted added to putexcel:

  1. Retain a cell’s format after writing numeric data to it.
  2. Allow putexcel to format a cell.

In Stata 13.1, we added the new option keepcellformat to putexcel. This option retains a cell’s format after writing numeric data to it. keepcellformat is useful for people who want to automate the updating of a report or paper.

To review, the basic syntax of putexcel is as follows:

putexcel excel_cell=(expression) … using filename[, options]

If you are working with matrices, the syntax is

putexcel excel_cell=matrix(expression) … using filename[, options]

In the previous blog post, we exported a simple table created by the correlate command by using the commands below.

. sysuse auto
(1978 Automobile Data)

. correlate foreign mpg

             |  foreign      mpg
     foreign |   1.0000
         mpg |   0.3934   1.0000

. putexcel A1=matrix(r(C), names) using corr

These commands created the file corr.xlsx, which contained the table below in the first worksheet.


As you can see, this table is not formatted. So, I formatted the table by hand in Excel so that the correlations Read more…

Categories: Programming Tags: , , , , ,

Fitting ordered probit models with endogenous covariates with Stata’s gsem command

The new command gsem allows us to fit a wide variety of models; among the many possibilities, we can account for endogeneity on different models. As an example, I will fit an ordinal model with endogenous covariates.


Parameterizations for an ordinal probit model

The ordinal probit model is used to model ordinal dependent variables. In the usual parameterization, we assume that there is an underlying linear regression, which relates an unobserved continuous variable \(y^*\) to the covariates \(x\).

\[y^*_{i} = x_{i}\gamma + u_i\]

The observed dependent variable \(y\) relates to \(y^*\) through a series of cut-points \(-\infty =\kappa_0<\kappa_1<\dots< \kappa_m=+\infty\) , as follows:

\[y_{i} = j {\mbox{ if }} \kappa_{j-1} < y^*_{i} \leq \kappa_j\]

Provided that the variance of \(u_i\) can’t be identified from the observed data, it is assumed to be equal to one. However, we can consider a re-scaled parameterization for the same model; a straightforward way of seeing this, is by noting that, for any positive number \(M\):

\[\kappa_{j-1} < y^*_{i} \leq \kappa_j \iff
M\kappa_{j-1} < M y^*_{i} \leq M\kappa_j

that is,

\[\kappa_{j-1} < x_i\gamma + u_i \leq \kappa_j \iff
M\kappa_{j-1}< x_i(M\gamma) + Mu_i \leq M\kappa_j

In other words, if the model is identified, it can be represented by multiplying the unobserved variable \(y\) by a positive number, and this will mean that the standard error of the residual component, the coefficients, and the cut-points will be multiplied by this number.

Let me show you an example; I will first fit a standard ordinal probit model, both with oprobit and with gsem. Then, I will use gsem to fit an ordinal probit model where the residual term for the underlying linear regression has a standard deviation equal to 2. I will do this by introducing a latent variable \(L\), with variance 1, and coefficient \(\sqrt 3\). This will be added to the underlying latent residual, with variance 1; then, the ‘new’ residual term will have variance equal to \(1+((\sqrt 3)^2\times Var(L))= 4\), so the standard deviation will be 2. We will see that as a result, the coefficients, as well as the cut-points, will be multiplied by 2.

. sysuse auto, clear
(1978 Automobile Data)

. oprobit rep mpg disp , nolog

Ordered probit regression                         Number of obs   =         69
                                                  LR chi2(2)      =      14.68
                                                  Prob > chi2     =     0.0006
Log likelihood = -86.352646                       Pseudo R2       =     0.0783

       rep78 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
         mpg |   .0497185   .0355452     1.40   0.162    -.0199487    .1193858
displacement |  -.0029884   .0021498    -1.39   0.165     -.007202    .0012252
       /cut1 |  -1.570496   1.146391                      -3.81738    .6763888
       /cut2 |  -.7295982   1.122361                     -2.929386     1.47019
       /cut3 |   .6580529   1.107838                     -1.513269    2.829375
       /cut4 |    1.60884   1.117905                     -.5822132    3.799892

. gsem (rep <- mpg disp, oprobit), nolog

Generalized structural equation model             Number of obs   =         69
Log likelihood = -86.352646

               |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
rep78 <-       |
           mpg |   .0497185   .0355452     1.40   0.162    -.0199487    .1193858
  displacement |  -.0029884   .0021498    -1.39   0.165     -.007202    .0012252
rep78          |
         /cut1 |  -1.570496   1.146391    -1.37   0.171     -3.81738    .6763888
         /cut2 |  -.7295982   1.122361    -0.65   0.516    -2.929386     1.47019
         /cut3 |   .6580529   1.107838     0.59   0.553    -1.513269    2.829375
         /cut4 |    1.60884   1.117905     1.44   0.150    -.5822132    3.799892

. local a = sqrt(3)

. gsem (rep <- mpg disp L@`a'), oprobit var(L@1) nolog

Generalized structural equation model             Number of obs   =         69
Log likelihood = -86.353008

 ( 1)  [rep78]L = 1.732051
 ( 2)  [var(L)]_cons = 1
               |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
rep78 <-       |
           mpg |    .099532     .07113     1.40   0.162    -.0398802    .2389442
  displacement |  -.0059739   .0043002    -1.39   0.165    -.0144022    .0024544
             L |   1.732051  (constrained)
rep78          |
         /cut1 |  -3.138491   2.293613    -1.37   0.171     -7.63389    1.356907
         /cut2 |  -1.456712   2.245565    -0.65   0.517    -5.857938    2.944513
         /cut3 |   1.318568    2.21653     0.59   0.552     -3.02575    5.662887
         /cut4 |   3.220004   2.236599     1.44   0.150     -1.16365    7.603657
         var(L)|          1  (constrained)


Ordinal probit model with endogenous covariates

This model is defined analogously to the model fitted by -ivprobit- for probit models with endogenous covariates; we assume Read more…

Export tables to Excel

There is a new command in Stata 13, putexcel, that allows you to easily export matrices, expressions, and stored results to an Excel file. Combining putexcel with a Stata command’s stored results allows you to create the table displayed in your Stata Results window in an Excel file.

A stored result is simply a scalar, macro, or matrix stored in memory after you run a Stata command. The two main types of stored results are e-class (for estimation commands) and r-class (for general commands). You can list a command’s stored results after it has been run by typing ereturn list (for estimation commands) and return list (for general commands). Let’s try a simple example by loading the auto dataset and running correlate on the variables foreign and mpg

. sysuse auto
(1978 Automobile Data)

. correlate foreign mpg

             |  foreign      mpg
     foreign |   1.0000
         mpg |   0.3934   1.0000

Because correlate is not an estimation command, use the return list command to see its stored results.

. return list

                  r(N) =  74
                r(rho) =  .3933974152205484

                  r(C) :  2 x 2

Now we can use putexcel to export these results to Excel. The basic syntax of putexcel is

putexcel excel_cell=(expression) … using filename [, options]

If you are working with matrices, the syntax is

putexcel excel_cell=matrix(expression) … using filename [, options]

It is easy to build the above syntax in the putexcel dialog. There is a helpful video on Youtube about the dialog here. Let’s list the matrix r(C) to see what it contains.

. matrix list r(C)

symmetric r(C)[2,2]
           foreign        mpg
foreign          1
    mpg  .39339742          1

To re-create the table in Excel, we need to export the matrix r(C) with the matrix row and column names. The command to type in your Stata Command window is

putexcel A1=matrix(r(C), names) using corr

Note that to export the matrix row and column names, Read more…

Categories: Programming Tags: , , , ,

Measures of effect size in Stata 13

Today I want to talk about effect sizes such as Cohen’s d, Hedges’s g, Glass’s Δ, η2, and ω2. Effects sizes concern rescaling parameter estimates to make them easier to interpret, especially in terms of practical significance.

Many researchers in psychology and education advocate reporting of effect sizes, professional organizations such as the American Psychological Association (APA) and the American Educational Research Association (AERA) strongly recommend their reporting, and professional journals such as the Journal of Experimental Psychology: Applied and Educational and Psychological Measurement require that they be reported.

Anyway, today I want to show you

  1. What effect sizes are.
  2. How to calculate effect sizes and their confidence intervals in Stata.
  3. How to calculate bootstrap confidence intervals for those effect sizes.
  4. How to use Stata’s effect-size calculator.

1. What are effect sizes?

The importance of research results is often assessed by statistical significance, usually that the p-value is less than 0.05. P-values and statistical significance, however, don’t tell us anything about practical significance.

What if I told you that I had developed a new weight-loss pill and that the difference between the average weight loss for people who took the pill and the those who took a placebo was statistically significant? Would you buy my new pill? If you were overweight, you might reply, “Of course! I’ll take two bottles and a large order of french fries to go!”. Now let me add that the average difference in weight loss was only one pound over the year. Still interested? My results may be statistically significant but they are not practically significant.

Or what if I told you that the difference in weight loss was not statistically significant — the p-value was “only” 0.06 — but the average difference over the year was 20 pounds? You might very well be interested in that pill.

The size of the effect tells us about the practical significance. P-values do not assess practical significance.

All of which is to say, one should report parameter estimates along with statistical significance.

In my examples above, you knew that 1 pound over the year is small and 20 pounds is large because you are familiar with human weights.

In another context, 1 pound might be large, and in yet another, 20 pounds small.

Formal measures of effects sizes are thus usually presented in unit-free but easy-to-interpret form, such as standardized differences and proportions of variability explained.

The “d” family

Effect sizes that measure the scaled difference between means belong to the “d” family. The generic formula is

{delta} = {{mu}_1 - {mu}_2} / {sigma}

The estimators differ in terms of how sigma is calculated.

Cohen’s d, for instance, uses the pooled sample standard deviation.

Hedges’s g incorporates an adjustment which removes the bias of Cohen’s d.

Glass’s Δ was originally developed in the context of Read more…