Archive for October 2015

Fixed effects or random effects: The Mundlak approach

Today I will discuss Mundlak’s (1978) alternative to the Hausman test. Unlike the latter, the Mundlak approach may be used when the errors are heteroskedastic or have intragroup correlation. Read more…

Programming an estimation command in Stata: Where to store your stuff

If you tell me “I program in Stata”, it makes me happy, but I do not know what you mean. Do you write scripts to make your research reproducible, or do you write Stata commands that anyone can use and reuse? In the series #StataProgramming, I will show you how to write your own commands, but I start at the beginning. Discussing the difference between scripts and commands here introduces some essential programming concepts and constructions that I use to write scripts and commands.

This is the second post in the series Programming an estimation command in Stata. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series. Read more…

Probit model with sample selection by mlexp


In a previous post, David Drukker demonstrated how to use mlexp to estimate the degree of freedom parameter in a chi-squared distribution by maximum likelihood (ML). In this post, I am going to use mlexp to estimate the parameters of a probit model with sample selection. I will illustrate how to specify a more complex likelihood in mlexp and provide intuition for the probit model with sample selection. Our results match the heckprobit command; see [R] heckprobit for more details. Read more…


Programming estimators in Stata: Why you should

Distributing a Stata command that implements a statistical method will get that method used by lots of people. They will thank you. And, they will cite you!

This post is the first in the series #StataProgramming about programming an estimation command in Stata that uses Mata to do the numerical work. In the process of showing you how to program an estimation command in Stata, I will discuss do-file programming, ado-file programming, and Mata programming. When the series ends, you will be able to write Stata commands.

Stata users like its predictable syntax and its estimation-postestimation structure that facilitates hypothesis testing, specification tests, and parameter interpretation. To help you write Stata commands that people want to use, I illustrate how Stata syntax is predictable and give an overview of the estimation-postestimation structure that you will want to emulate in your programs. Read more…

Estimating parameters by maximum likelihood and method of moments using mlexp and gmm

This post was written jointly with Joerg Luedicke, Senior Social Scientist and Statistician, StataCorp.


We provide an introduction to parameter estimation by maximum likelihood and method of moments using mlexp and gmm, respectively (see [R] mlexp and [R] gmm). We include some background about these estimation techniques; see Pawitan (2001), Casella and Berger (2002), Cameron and Trivedi (2005), and Wooldridge (2010) for more details.

Maximum likelihood (ML) estimation finds the parameter values that make the observed data most probable. The parameters maximize the log of the likelihood function that specifies the probability of observing a particular set of data given a model.
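
To make this concrete, here is a small worked sketch. The notation (\(y_i\), \(\theta\), \(f\)) and the choice of a chi-squared example are assumptions added here for illustration. For independent observations \(y_1,\ldots,y_N\) with density \(f(y;\theta)\), the ML estimator is

\[
\hat{\theta}_{ML} = \arg\max_{\theta} \sum_{i=1}^{N} \ln f(y_i;\theta)
\]

For example, if each \(y_i\) is drawn from a \(\chi^2(d)\) distribution, the contribution of observation \(i\) to the log likelihood is

\[
\ln f(y_i; d) = \left(\frac{d}{2}-1\right)\ln y_i - \frac{y_i}{2} - \frac{d}{2}\ln 2 - \ln\Gamma\!\left(\frac{d}{2}\right)
\]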

Method of moments (MM) estimators specify population moment conditions and find the parameters that solve the equivalent sample moment conditions. MM estimators usually place fewer restrictions on the model than ML estimators, which implies that MM estimators are less efficient but more robust than ML estimators. Read more…
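
A minimal worked sketch of the idea, using a chi-squared example and notation assumed here for illustration: the mean of a \(\chi^2(d)\) random variable is \(d\), so one population moment condition is

\[
E[\,y - d\,] = 0
\]

Replacing the expectation with a sample average over \(N\) observations gives the sample moment condition and its solution:

\[
\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{d}_{MM}\right) = 0
\quad\Longrightarrow\quad
\hat{d}_{MM} = \bar{y}
\]

This estimator uses only the mean of the distribution, not its full density, which is why it imposes fewer restrictions than ML.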

Efficiency comparisons by Monte Carlo simulation


In this post, I show how to use Monte Carlo simulations to compare the efficiency of different estimators. I also illustrate what we mean by efficiency when discussing statistical estimators.

I wrote this post to continue a dialog with my friend who doubted the usefulness of the sample average as an estimator for the mean when the data-generating process (DGP) is a \(\chi^2\) distribution with \(1\) degree of freedom, denoted by a \(\chi^2(1)\) distribution. The sample average is a fine estimator, even though it is not the most efficient estimator for the mean. (Some researchers prefer to estimate the median instead of the mean for DGPs that generate outliers. I will address the trade-offs between these parameters in a future post. For now, I want to stick to estimating the mean.)
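
A rough worked comparison, under assumptions added here for illustration: suppose the competing estimator is the ML estimator of the degrees-of-freedom parameter \(d\) (the mean of a \(\chi^2(d)\) distribution equals \(d\)). With \(N\) observations and \(d=1\),

\[
\mathrm{Var}(\bar{y}) = \frac{2d}{N} = \frac{2}{N},
\qquad
\mathrm{Avar}(\hat{d}_{ML}) = \frac{4}{N\,\psi'(d/2)} = \frac{8}{\pi^2 N} \approx \frac{0.81}{N}
\]

where \(\psi'\) is the trigamma function and \(\psi'(1/2)=\pi^2/2\). Under these assumptions, the sample average has a larger large-sample variance than the ML estimator, which is the sense in which it is less efficient while remaining a consistent estimator of the mean.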

In this post, I also show how a Monte Carlo simulation can illustrate the meaning of an abstract statistical concept. (If you are new to Monte Carlo simulations in Stata, you might want to see Monte Carlo simulations using Stata.) Read more…

Maximum likelihood estimation by mlexp: A chi-squared example


In this post, I show how to use mlexp to estimate the degree of freedom parameter of a chi-squared distribution by maximum likelihood (ML). One example is unconditional, and another example models the parameter as a function of covariates. I also show how to generate data from chi-squared distributions and I illustrate how to use simulation methods to understand an estimation technique. Read more…
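
The unconditional case can be sketched outside Stata as follows. This is an illustrative Python sketch, not the mlexp code from the post: the function names (`simulate_chi2`, `chi2_loglik`, `ml_chi2_df`), the golden-section search, and the search bounds are all assumptions made for this example. It generates chi-squared data and maximizes the log likelihood over the degrees-of-freedom parameter.

```python
import math
import random


def simulate_chi2(d, n, rng):
    """Draw n observations from chi-squared(d), using the fact that
    chi-squared(d) is a gamma distribution with shape d/2 and scale 2."""
    return [rng.gammavariate(d / 2.0, 2.0) for _ in range(n)]


def chi2_loglik(d, n, sum_log_y, sum_y):
    """Log likelihood of n i.i.d. chi-squared(d) observations, written in
    terms of the sufficient statistics sum(log y) and sum(y)."""
    half = d / 2.0
    return ((half - 1.0) * sum_log_y - 0.5 * sum_y
            - n * (half * math.log(2.0) + math.lgamma(half)))


def ml_chi2_df(ys, lo=1e-3, hi=100.0, tol=1e-8):
    """Maximize the log likelihood over d on [lo, hi] by golden-section
    search. The log likelihood is concave in d, so it has a single peak."""
    n = len(ys)
    sum_log_y = sum(math.log(y) for y in ys)
    sum_y = sum(ys)

    def f(d):
        return chi2_loglik(d, n, sum_log_y, sum_y)

    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - invphi * (b - a)
    e = a + invphi * (b - a)
    fc, fe = f(c), f(e)
    while b - a > tol:
        if fc > fe:
            # The maximum lies in [a, e]; shrink from the right.
            b, e, fe = e, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:
            # The maximum lies in [c, b]; shrink from the left.
            a, c, fc = c, e, fe
            e = a + invphi * (b - a)
            fe = f(e)
    return (a + b) / 2.0


if __name__ == "__main__":
    rng = random.Random(12345)
    ys = simulate_chi2(3.0, 5000, rng)
    print("ML estimate of d:", round(ml_chi2_df(ys), 3))
```

Golden-section search is used here only because it needs no derivatives and no third-party libraries; any one-dimensional optimizer would serve the same purpose.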

Monte Carlo simulations using Stata


A Monte Carlo simulation (MCS) of an estimator approximates that estimator's sampling distribution by simulation for a particular data-generating process (DGP) and sample size. I use an MCS to learn how well estimation techniques perform for specific DGPs. In this post, I show how to perform an MCS study of an estimator in Stata and how to interpret the results.

Large-sample theory tells us that the sample average is a good estimator for the mean when the true DGP is a random sample from a \(\chi^2\) distribution with 1 degree of freedom, denoted by \(\chi^2(1)\). But a friend of mine claims this estimator will not work well for this DGP because the \(\chi^2(1)\) distribution will produce outliers. In this post, I use an MCS to see if the large-sample theory works well for this DGP in a sample of 500 observations. Read more…
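
The experiment described above can be sketched as follows. This is an illustrative Python sketch rather than the Stata code from the post: the function names (`run_mcs`, `summarize`), the number of replications (2,000), the seed, and the use of a nominal 5% Wald test of the true mean are assumptions made for this example. Only the DGP (\(\chi^2(1)\)) and the sample size (500) come from the text.

```python
import math
import random


def run_mcs(n_obs=500, n_reps=2000, seed=12345):
    """Draw n_reps samples of size n_obs from chi-squared(1). For each one,
    record the sample average and whether a nominal 5% Wald test of
    H0: mean = 1 rejects."""
    rng = random.Random(seed)
    means = []
    rejections = 0
    for _ in range(n_reps):
        # A squared standard normal draw is chi-squared with 1 degree of freedom.
        sample = [rng.gauss(0.0, 1.0) ** 2 for _ in range(n_obs)]
        mean = sum(sample) / n_obs
        var = sum((y - mean) ** 2 for y in sample) / (n_obs - 1)
        std_err = math.sqrt(var / n_obs)
        if abs(mean - 1.0) / std_err > 1.96:
            rejections += 1
        means.append(mean)
    return means, rejections / n_reps


def summarize(means):
    """Return the Monte Carlo mean and standard deviation of the estimates."""
    m = sum(means) / len(means)
    sd = math.sqrt(sum((x - m) ** 2 for x in means) / (len(means) - 1))
    return m, sd


if __name__ == "__main__":
    means, reject_rate = run_mcs()
    mc_mean, mc_sd = summarize(means)
    print("mean of estimates:     ", round(mc_mean, 4))
    print("sd of estimates:       ", round(mc_sd, 4))
    print("large-sample sd:       ", round(math.sqrt(2.0 / 500), 4))
    print("5% test rejection rate:", round(reject_rate, 4))
```

If the large-sample theory works well for this DGP and sample size, the mean of the estimates should be close to 1, their standard deviation close to \(\sqrt{2/500}\), and the rejection rate close to 0.05.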