Posts Tagged ‘simulation’

Calculating power using Monte Carlo simulations, part 2: Running your simulation using power

In my last post, I showed you how to calculate power for a t test using Monte Carlo simulations. In this post, I will show you how to integrate your simulations into Stata’s power command so that you can easily create custom tables and graphs for a range of parameter values. Read more…

Calculating power using Monte Carlo simulations, part 1: The basics

Power and sample-size calculations are an important part of planning a scientific study. You can use Stata’s power commands to calculate power and sample-size requirements for dozens of commonly used statistical tests. But there are no simple formulas for more complex models such as multilevel/longitudinal models and structural equation models (SEMs). Monte Carlo simulations are one way to calculate power and sample-size requirements for complex models, and Stata provides all the tools you need to do this. You can even integrate your simulations into Stata’s power commands so that you can easily create custom tables and graphs for a range of parameter values. Read more…
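The idea behind simulation-based power can be sketched in a few lines. The example below is a minimal Python illustration, not the Stata code from the post: it draws many samples under an assumed alternative, tests each one, and reports the share of rejections. The function name `simulate_power`, the effect size, and the sample size are hypothetical, and the critical value uses a normal approximation to the t distribution, which is reasonable only for fairly large samples.

```python
import random
import statistics
from statistics import NormalDist


def simulate_power(n, true_mean, null_mean=0.0, sd=1.0, alpha=0.05,
                   reps=2000, seed=12345):
    """Estimate the power of a two-sided one-sample test by simulation."""
    rng = random.Random(seed)
    # Assumption: normal approximation to the t critical value (large n).
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        # Draw one dataset under the assumed true mean.
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        se = statistics.stdev(sample) / n ** 0.5
        t_stat = (statistics.fmean(sample) - null_mean) / se
        if abs(t_stat) > crit:
            rejections += 1
    # Power is the proportion of simulated datasets that reject the null.
    return rejections / reps


power = simulate_power(n=100, true_mean=0.3)
print(f"Estimated power: {power:.3f}")
```

Calling the same function with `true_mean` equal to `null_mean` gives the simulated rejection rate under the null, which should sit near `alpha`.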

Introduction to Bayesian statistics, part 2: MCMC and the Metropolis–Hastings algorithm

In this blog post, I’d like to give you a relatively nontechnical introduction to Markov chain Monte Carlo, often shortened to “MCMC”. MCMC is frequently used for fitting Bayesian statistical models. There are different variations of MCMC, and I’m going to focus on the Metropolis–Hastings (M–H) algorithm. In the interest of brevity, I’m going to omit some details, and I strongly encourage you to read the [BAYES] manual before using MCMC in practice.

Let’s continue with the coin toss example from my previous post Introduction to Bayesian statistics, part 1: The basic concepts. We are interested in the posterior distribution of the parameter \(\theta\), which is the probability that a coin toss results in “heads”. Our prior distribution is a flat, uninformative beta distribution with parameters 1 and 1. And we will use a binomial likelihood function to quantify the data from our experiment, which resulted in 4 heads out of 10 tosses. Read more…
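For readers who want to see the mechanics, here is a minimal random-walk Metropolis–Hastings sketch in Python for this coin toss example (4 heads in 10 tosses, beta(1,1) prior). It is an illustration under my own assumptions, not the code from the post: the function names, the normal proposal with standard deviation 0.2, the starting value, and the burn-in length are all hypothetical choices.

```python
import math
import random
import statistics


def log_posterior(theta, heads=4, tosses=10, a=1.0, b=1.0):
    """Log of the unnormalized posterior: binomial likelihood times beta(a, b) prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return ((heads + a - 1) * math.log(theta)
            + (tosses - heads + b - 1) * math.log(1 - theta))


def metropolis_hastings(n_draws=20000, burn_in=2000, step=0.2, seed=1):
    rng = random.Random(seed)
    theta = 0.5  # hypothetical starting value
    draws = []
    for i in range(n_draws + burn_in):
        # Symmetric random-walk proposal centered at the current value.
        proposal = theta + rng.gauss(0.0, step)
        log_ratio = log_posterior(proposal) - log_posterior(theta)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
        if i >= burn_in:
            draws.append(theta)
    return draws


draws = metropolis_hastings()
print(f"Posterior mean estimate: {statistics.fmean(draws):.3f}")
```

With a beta(1,1) prior and 4 heads in 10 tosses, the exact posterior is beta(5,7), whose mean is 5/12, so the simulated mean can be checked against that value.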

Flexible discrete choice modeling using a multinomial probit model, part 2


In the first part of this post, I discussed the multinomial probit model from a random utility model perspective. In this part, we will have a closer look at how to interpret our estimation results.

How do we interpret our estimation results?

We created a fictitious dataset of individuals who were presented a set of three health insurance plans (Sickmaster, Allgood, and Cowboy Health). We pretended to have a random sample of 20- to 60-year-old persons who were asked Read more…

Flexible discrete choice modeling using a multinomial probit model, part 1

\(\newcommand{\xb}{{\bf x}}
\newcommand{\zb}{{\bf z}}
\newcommand{\gammab}{\boldsymbol{\gamma}}\)

We have no choice but to choose

We make choices every day, and often these choices are made among a finite number of potential alternatives. For example, do we take the car or ride a bike to get to work? Will we have dinner at home or eat out, and if we eat out, where do we go? Scientists, marketing analysts, or political consultants, to name a few, wish to find out why people choose what they choose.

In this post, Read more…

A simulation-based explanation of consistency and asymptotic normality


In the frequentist approach to statistics, estimators are random variables because they are functions of random data. The finite-sample distributions of most of the estimators used in applied work are not known, because the estimators are complicated nonlinear functions of random data. These estimators have large-sample convergence properties that we use to approximate their behavior in finite samples.

Two key convergence properties are consistency and asymptotic normality. A consistent estimator gets arbitrarily close in probability to the true value as the sample size increases. The distribution of an asymptotically normal estimator gets arbitrarily close to a normal distribution as the sample size increases. We use a recentered and rescaled version of this normal distribution to approximate the finite-sample distribution of our estimators.
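Both properties can be seen in a small simulation. The Python sketch below is my own illustration rather than the Stata code from the post: it uses the sample mean of \(\chi^2(1)\) draws (true mean 1, variance 2) as the estimator, and the sample sizes, number of replications, and function name are hypothetical. The spread of the estimates should shrink as the sample size grows, and the share of recentered and rescaled estimates falling within \(\pm 1.96\) should approach 0.95.

```python
import math
import random
import statistics


def simulate_sample_means(n, reps, rng):
    """Return `reps` sample means, each computed from n chi-squared(1) draws."""
    return [statistics.fmean(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
            for _ in range(reps)]


rng = random.Random(2024)
true_mean, true_var = 1.0, 2.0  # mean and variance of a chi-squared(1)
spread = {}
coverage = {}
for n in (10, 100, 1000):
    means = simulate_sample_means(n, 500, rng)
    # Consistency: the estimates concentrate around the true mean as n grows.
    spread[n] = statistics.pstdev(means)
    # Asymptotic normality: sqrt(n) * (estimate - truth) / sd is roughly N(0, 1).
    z = [math.sqrt(n) * (m - true_mean) / math.sqrt(true_var) for m in means]
    coverage[n] = sum(abs(v) < 1.96 for v in z) / len(z)
    print(f"n={n:5d}  sd of estimates={spread[n]:.3f}  "
          f"share within 1.96={coverage[n]:.3f}")
```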

I illustrate the meaning of consistency and asymptotic normality by Monte Carlo simulation (MCS). I use some of the Stata mechanics I discussed in Monte Carlo simulations using Stata.

Consistent estimator

A consistent estimator gets arbitrarily close in Read more…

Vector autoregression—simulation, estimation, and inference in Stata

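\(\newcommand{\epsb}{\boldsymbol{\epsilon}}
\newcommand{\mub}{\boldsymbol{\mu}}
\newcommand{\Sigmab}{\boldsymbol{\Sigma}}
\newcommand{\Phat}{\hat{{\bf P}}}\)

Vector autoregression (VAR) is a useful tool for analyzing the dynamics of multiple time series. VAR expresses a vector of observed variables as a function of its own lags.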


Let’s begin by simulating a bivariate VAR(2) process using the following specification,

\[
\begin{bmatrix} y_{1,t}\\ y_{2,t} \end{bmatrix}
= \mub + {\bf A}_1 \begin{bmatrix} y_{1,t-1}\\ y_{2,t-1} \end{bmatrix}
+ {\bf A}_2 \begin{bmatrix} y_{1,t-2}\\ y_{2,t-2} \end{bmatrix}
+ \epsb_t
\]

where \(y_{1,t}\) and \(y_{2,t}\) are the observed series at time \(t\), \(\mub\) is a \(2 \times 1\) vector of intercepts, \({\bf A}_1\) and \({\bf A}_2\) are \(2\times 2\) parameter matrices, and \(\epsb_t\) is a \(2\times 1\) vector of innovations that is uncorrelated over time. I assume a \(N({\bf 0},\Sigmab)\) distribution for the innovations \(\epsb_t\), where \(\Sigmab\) is a \(2\times 2\) covariance matrix.
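To make the specification concrete, here is a minimal Python sketch of simulating such a bivariate VAR(2) process. It is not the Stata code from the post, and every numeric value in it is a hypothetical choice of mine: the intercepts, the matrices `A1` and `A2` (picked so the process is stable, since their maximum absolute row sums add up to 0.8, below 1), the covariance matrix, and the zero starting values. The correlated normal innovations are built from a hand-written Cholesky factor.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
MU = (0.1, 0.4)
A1 = ((0.5, 0.1), (0.0, 0.4))
A2 = ((0.2, 0.0), (0.1, 0.1))
# Cholesky factor of Sigma = [[1.0, 0.5], [0.5, 1.25]].
CHOL = ((1.0, 0.0), (0.5, 1.0))


def matvec(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def simulate_var2(n_obs, seed=1):
    rng = random.Random(seed)
    lag1 = lag2 = (0.0, 0.0)  # assumption: start the recursion at zero
    series = []
    for _ in range(n_obs):
        # Innovation with covariance Sigma: Cholesky factor times N(0, I) draws.
        shock = matvec(CHOL, (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)))
        ar1 = matvec(A1, lag1)
        ar2 = matvec(A2, lag2)
        y = tuple(MU[i] + ar1[i] + ar2[i] + shock[i] for i in range(2))
        series.append(y)
        lag1, lag2 = y, lag1  # shift the lags forward one period
    return series


data = simulate_var2(1100)
print(data[:3])
```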

I set my sample size to 1,100 and Read more…

regress, probit, or logit?

In a previous post I illustrated that the probit model and the logit model produce statistically equivalent estimates of marginal effects. In this post, I compare the marginal effect estimates from a linear probability model (linear regression) with marginal effect estimates from probit and logit models.

My simulations show that when the true model is a probit or a logit, using a linear probability model can produce inconsistent estimates of the marginal effects of interest to researchers. The conclusions hinge on the probit or logit model being the true model.

Simulation results

For all simulations below, I use a sample size of 10,000 and 5,000 replications. The true data-generating processes (DGPs) are constructed using Read more…

probit or logit: ladies and gentlemen, pick your weapon

We often use probit and logit models to analyze binary outcomes. A case can be made that the logit model is easier to interpret than the probit model, but Stata’s margins command makes any estimator easy to interpret. Ultimately, estimates from both models produce similar results, and using one or the other is a matter of habit or preference.

I show that the estimates from a probit and logit model are similar for the computation of a set of effects that are of interest to researchers. I focus on the effects of changes in the covariates on the probability of a positive outcome for continuous and discrete covariates. I evaluate these effects on average and at the mean value of the covariates. In other words, I study the average marginal effects (AME), the average treatment effects (ATE), the marginal effects at the mean values of the covariates (MEM), and the treatment effects at the mean values of the covariates (TEM).
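The difference between an average marginal effect and a marginal effect at the mean is easy to see in code. The Python sketch below is an illustration under my own assumptions, not the simulation code discussed in the post: it treats the probit coefficients `b0` and `b1` as known rather than estimated, uses a single standard normal covariate, and the function name is hypothetical. For a probit model with index \(b_0 + b_1 x\), the marginal effect of \(x\) is \(\phi(b_0 + b_1 x)\,b_1\); the AME averages this over the sample, while the MEM evaluates it once at the sample mean of \(x\).

```python
import random
import statistics
from statistics import NormalDist

STD_NORMAL = NormalDist()


def probit_marginal_effects(x_values, b0, b1):
    """Return (AME, MEM) for a continuous covariate in a probit model."""
    # AME: average the marginal effect over every observation.
    ame = statistics.fmean(STD_NORMAL.pdf(b0 + b1 * x) * b1 for x in x_values)
    # MEM: evaluate the marginal effect once, at the mean of the covariate.
    mem = STD_NORMAL.pdf(b0 + b1 * statistics.fmean(x_values)) * b1
    return ame, mem


rng = random.Random(7)
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
ame, mem = probit_marginal_effects(x, b0=0.0, b1=1.0)
print(f"AME = {ame:.3f}, MEM = {mem:.3f}")
```

With these hypothetical values, the two quantities differ noticeably (about 0.28 for the AME and about 0.40 for the MEM in expectation), which is why it matters which one is reported.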

First, I present the results. Second, I discuss the code used for the simulations.


In Table 1, I present the results of a simulation with 4,000 replications when the true data generating process (DGP) satisfies the assumptions of a probit model. I show the Read more…

Understanding the generalized method of moments (GMM): A simple example

\(\newcommand{\Eb}{{\bf E}}\)

This post was written jointly with Enrique Pinzon, Senior Econometrician, StataCorp.

The generalized method of moments (GMM) is a method for constructing estimators, analogous to maximum likelihood (ML). GMM uses assumptions about specific moments of the random variables instead of assumptions about the entire distribution, which makes GMM more robust than ML, at the cost of some efficiency. The assumptions are called moment conditions.

GMM generalizes the method of moments (MM) by allowing the number of moment conditions to be greater than the number of parameters. Using these extra moment conditions makes GMM more efficient than MM. When there are more moment conditions than parameters, the estimator is said to be overidentified. GMM can efficiently combine the moment conditions when the estimator is overidentified.

We illustrate these points by estimating the mean of a \(\chi^2(1)\) by MM, ML, a simple GMM estimator, and an efficient GMM estimator. This example builds on Efficiency comparisons by Monte Carlo simulation and is similar in spirit to the example in Wooldridge (2001). Read more…
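As a rough illustration of the difference between MM and an overidentified GMM estimator, here is a Python sketch that estimates the mean \(d\) of a \(\chi^2(1)\) two ways. It is not the code from the post, and the details are my assumptions: the moment conditions \(E[Y - d] = 0\) and \(E[(Y - d)^2 - 2d] = 0\) (a \(\chi^2(d)\) has mean \(d\) and variance \(2d\)), an identity weight matrix rather than an efficient one, a simple grid search as the optimizer, and hypothetical function names.

```python
import random
import statistics


def mm_estimate(y):
    """Method of moments: one moment condition, E[Y - d] = 0."""
    return statistics.fmean(y)


def gmm_identity_estimate(y, lo=0.01, hi=5.0, step=0.001):
    """GMM with two moment conditions and an identity weight matrix."""
    ybar = statistics.fmean(y)
    v = statistics.fmean((yi - ybar) ** 2 for yi in y)

    def objective(d):
        g1 = ybar - d                        # sample analog of E[Y - d]
        g2 = v + (ybar - d) ** 2 - 2.0 * d   # sample analog of E[(Y - d)^2 - 2d]
        return g1 * g1 + g2 * g2             # g'Wg with W = identity

    # Assumption: a coarse grid search is accurate enough for a sketch.
    grid = [lo + i * step for i in range(int(round((hi - lo) / step)) + 1)]
    return min(grid, key=objective)


rng = random.Random(42)
y = [rng.gauss(0.0, 1.0) ** 2 for _ in range(5000)]  # chi-squared(1) draws
d_mm = mm_estimate(y)
d_gmm = gmm_identity_estimate(y)
print(f"MM estimate: {d_mm:.3f}, GMM (identity weights) estimate: {d_gmm:.3f}")
```

An identity weight matrix does not combine the two conditions efficiently; an efficient GMM estimator would instead weight them by the inverse of their estimated covariance matrix.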