In this blog post, I’d like to give you a relatively nontechnical introduction to Markov chain Monte Carlo, often shortened to “MCMC”. MCMC is frequently used for fitting Bayesian statistical models. There are different variations of MCMC, and I’m going to focus on the Metropolis–Hastings (M–H) algorithm. In the interest of brevity, I’m going to omit some details, and I strongly encourage you to read the [BAYES] manual before using MCMC in practice.

Let’s continue with the coin toss example from my previous post Introduction to Bayesian statistics, part 1: The basic concepts. We are interested in the posterior distribution of the parameter \(\theta\), which is the probability that a coin toss results in “heads”. Our prior distribution is a flat, uninformative beta distribution with parameters 1 and 1. And we will use a binomial likelihood function to quantify the data from our experiment, which resulted in 4 heads out of 10 tosses. Read more…
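To make the setup concrete, here is a minimal Python sketch of a random-walk sampler for this coin toss posterior. It is an illustration only, not the **bayesmh** implementation: the function names, step size, burn-in length, and seed are all assumptions chosen for the example. Because the proposal is symmetric, the acceptance rule reduces to the simple Metropolis ratio, a special case of M–H. With a Beta(1, 1) prior and 4 heads in 10 tosses, the exact posterior is Beta(5, 7), so the sampler's mean can be checked against 5/12.

```python
import math
import random

HEADS, TOSSES = 4, 10  # data from the coin toss experiment


def log_posterior(theta):
    """Unnormalized log posterior: flat Beta(1, 1) prior times binomial likelihood."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    # The flat prior contributes only a constant, so it is omitted.
    return HEADS * math.log(theta) + (TOSSES - HEADS) * math.log(1.0 - theta)


def metropolis_hastings(n_draws=20000, burn_in=2000, step=0.2, seed=12345):
    """Random-walk sampler; all tuning values here are illustrative assumptions."""
    rng = random.Random(seed)
    theta = 0.5  # arbitrary starting value inside (0, 1)
    draws = []
    for i in range(n_draws + burn_in):
        proposal = theta + rng.gauss(0.0, step)  # symmetric proposal
        log_ratio = log_posterior(proposal) - log_posterior(theta)
        # Accept the proposal with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
        if i >= burn_in:
            draws.append(theta)  # keep the current value, accepted or not
    return draws


draws = metropolis_hastings()
posterior_mean = sum(draws) / len(draws)
exact_mean = 5 / 12  # mean of the exact Beta(5, 7) posterior
print(f"MCMC mean: {posterior_mean:.3f}, exact mean: {exact_mean:.3f}")
```

The step size controls the trade-off between accepting too few proposals and moving too slowly, which is one of the details worth reading about in the manual before relying on a sampler in practice.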

In this blog post, I’d like to give you a relatively nontechnical introduction to Bayesian statistics. The Bayesian approach to statistics has become increasingly popular, and you can fit Bayesian models using the **bayesmh** command in Stata. This blog entry will provide a brief introduction to the concepts and jargon of Bayesian statistics and the **bayesmh** syntax. In my next post, I will introduce the basics of Markov chain Monte Carlo (MCMC) using the Metropolis–Hastings algorithm. Read more…

Quantile regression models a quantile of the outcome as a function of covariates. Applied researchers use quantile regressions because they allow the effect of a covariate to differ across conditional quantiles. For example, another year of education may have a large effect on a low conditional quantile of income but a much smaller effect on a high conditional quantile of income. Also, another pack-year of cigarettes may have a larger effect on a low conditional quantile of bronchial effectiveness than on a high conditional quantile of bronchial effectiveness.

I use simulated data to illustrate what the conditional quantile functions estimated by quantile regression are and what the estimable covariate effects are. Read more…
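As a rough illustration of the idea, the Python sketch below simulates an outcome whose spread depends on a binary covariate, so the covariate's effect differs across conditional quantiles. The data-generating process, sample size, and function names are assumptions made for this example, not the simulation used in the post. With a single binary covariate, a quantile regression on a constant and that covariate is saturated, so group-wise sample quantiles are used here in place of a regression routine.

```python
import random
from statistics import NormalDist


def sample_quantile(values, tau):
    """Simple order-statistic estimate of the tau-th quantile."""
    ordered = sorted(values)
    index = min(int(tau * len(ordered)), len(ordered) - 1)
    return ordered[index]


def simulate(n=20000, seed=2016):
    """Hypothetical model: y = x + (1 + x) * e, with e ~ N(0, 1) and x in {0, 1}."""
    rng = random.Random(seed)
    groups = {0: [], 1: []}
    for x in (0, 1):
        for _ in range(n):
            groups[x].append(x + (1 + x) * rng.gauss(0.0, 1.0))
    return groups


groups = simulate()
effects = {}
for tau in (0.1, 0.5, 0.9):
    # Effect of x on the tau-th conditional quantile of y.
    estimated = sample_quantile(groups[1], tau) - sample_quantile(groups[0], tau)
    # Under the assumed model, Q_tau(y|x) = x + (1 + x) * z_tau,
    # so the true effect of moving x from 0 to 1 is 1 + z_tau.
    true_effect = 1 + NormalDist().inv_cdf(tau)
    effects[tau] = (estimated, true_effect)
    print(f"tau={tau}: estimated {estimated:.2f}, true {true_effect:.2f}")
```

In this assumed model the effect is negative at the 0.1 quantile and large and positive at the 0.9 quantile, even though the effect on the conditional median is exactly 1.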

**teffects ipw** uses multinomial logit to estimate the weights needed to estimate the potential-outcome means (POMs) from a multivalued treatment. I show how to estimate the POMs when the weights come from an ordered probit model. Moment conditions define the ordered probit estimator and the subsequent weighted average used to estimate the POMs. I use **gmm** to obtain consistent standard errors by stacking the ordered-probit moment conditions and the weighted mean moment conditions. Read more…

I illustrate that exact matching on discrete covariates and regression adjustment (RA) with fully interacted discrete covariates perform the same nonparametric estimation. Read more…
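The equivalence can be checked numerically. The Python sketch below is a simplified illustration under assumptions of my own: one discrete covariate with three values, a made-up data-generating process, and a matching estimator that averages over all exact matches. With fully interacted discrete covariates, the RA predictions are just cell means of the outcome by treatment and covariate value, and the exact matches for an observation are the observations in the opposite treatment group that share its covariate value.

```python
import random
from collections import defaultdict


def simulate(n=2000, seed=42):
    """Hypothetical data: x in {0, 1, 2}, treatment effect equal to 1 + x."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.randrange(3)
        t = 1 if rng.random() < 0.3 + 0.2 * x else 0
        y = 1.0 + 0.5 * x + t * (1.0 + x) + rng.gauss(0.0, 1.0)
        data.append((y, t, x))
    return data


def cell_means(data):
    """Mean outcome within each (treatment, covariate) cell."""
    totals, counts = defaultdict(float), defaultdict(int)
    for y, t, x in data:
        totals[(t, x)] += y
        counts[(t, x)] += 1
    return {cell: totals[cell] / counts[cell] for cell in totals}


def ate_regression_adjustment(data):
    """RA: average the difference in predicted outcomes over all observations."""
    means = cell_means(data)
    return sum(means[(1, x)] - means[(0, x)] for _, _, x in data) / len(data)


def ate_exact_matching(data):
    """Matching: keep the observed outcome, impute the other from exact matches."""
    means = cell_means(data)
    total = 0.0
    for y, t, x in data:
        y1 = y if t == 1 else means[(1, x)]  # average of treated exact matches
        y0 = y if t == 0 else means[(0, x)]  # average of control exact matches
        total += y1 - y0
    return total / len(data)


data = simulate()
ate_ra = ate_regression_adjustment(data)
ate_match = ate_exact_matching(data)
print(f"RA: {ate_ra:.6f}, exact matching: {ate_match:.6f}")
```

The two estimates agree up to floating-point rounding, because within each covariate cell both estimators weight the same difference in cell means by the same number of observations.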

\(\newcommand{\Eb}{{\bf E}} \newcommand{\xb}{{\bf x}} \newcommand{\betab}{\boldsymbol{\beta}}\)Differences in conditional probabilities and ratios of odds are two common measures of the effect of a covariate in binary-outcome models. I show how these measures differ in terms of conditional-on-covariate effects versus population-parameter effects. Read more…
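A small numerical sketch in Python may help fix ideas. It assumes a logistic model with made-up coefficients and a covariate that takes three equally likely values; none of these numbers come from the post. In such a model the conditional odds ratio for the binary covariate is the same at every covariate value, while the difference in conditional probabilities is not, and the ratio of odds computed from population-averaged probabilities differs from the conditional odds ratio.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    return p / (1.0 - p)


# Hypothetical logistic model: P(y=1 | d, x) = expit(B0 + B1*d + B2*x)
B0, B1, B2 = 0.0, 1.0, 2.0
X_VALUES = (-1.0, 0.0, 1.0)  # assumed equally likely in the population


def prob(d, x):
    return expit(B0 + B1 * d + B2 * x)


# Conditional-on-covariate effects of moving d from 0 to 1 at each x.
conditional = {}
for x in X_VALUES:
    p1, p0 = prob(1, x), prob(0, x)
    conditional[x] = (p1 - p0, odds(p1) / odds(p0))
    print(f"x={x:+.0f}: difference {p1 - p0:.3f}, odds ratio {odds(p1) / odds(p0):.3f}")

# Population-level effects: average the probabilities over x first.
avg_p1 = sum(prob(1, x) for x in X_VALUES) / len(X_VALUES)
avg_p0 = sum(prob(0, x) for x in X_VALUES) / len(X_VALUES)
average_difference = avg_p1 - avg_p0
marginal_odds_ratio = odds(avg_p1) / odds(avg_p0)
print(f"population: difference {average_difference:.3f}, "
      f"ratio of odds {marginal_odds_ratio:.3f}")
```

The population-averaged difference in probabilities is simply the average of the conditional differences, whereas the population ratio of odds is not an average of the conditional odds ratios.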

\(\newcommand{\Eb}{{\bf E}}\)The change in a regression function that results from an everything-else-held-equal change in a covariate defines an effect of a covariate. I am interested in estimating and interpreting effects that are conditional on the covariates and averages of effects that vary over the individuals. I illustrate that these two types of effects answer different questions. Doctors, parents, and consultants frequently ask individuals for their covariate values to make individual-specific recommendations. Policy analysts use a population-averaged effect that accounts for the variation of the effects over the individuals. Read more…

Distributing a Stata command that implements a statistical method will get that method used by lots of people. They will thank you. And, they will cite you!

This post is the first in the series #StataProgramming about programming an estimation command in Stata that uses Mata to do the numerical work. In the process of showing you how to program an estimation command in Stata, I will discuss do-file programming, ado-file programming, and Mata programming. When the series ends, you will be able to write Stata commands.

Stata users like its predictable syntax and its estimation-postestimation structure that facilitates hypothesis testing, specification tests, and parameter interpretation. To help you write Stata commands that people want to use, I illustrate how Stata syntax is predictable and give an overview of the estimation-postestimation structure that you will want to emulate in your programs. Read more…

\(\newcommand{\epsilonb}{\boldsymbol{\epsilon}} \newcommand{\ebi}{\boldsymbol{\epsilon}_i} \newcommand{\Sigmab}{\boldsymbol{\Sigma}} \newcommand{\Omegab}{\boldsymbol{\Omega}} \newcommand{\Lambdab}{\boldsymbol{\Lambda}} \newcommand{\betab}{\boldsymbol{\beta}} \newcommand{\gammab}{\boldsymbol{\gamma}} \newcommand{\Gammab}{\boldsymbol{\Gamma}} \newcommand{\deltab}{\boldsymbol{\delta}} \newcommand{\xib}{\boldsymbol{\xi}} \newcommand{\iotab}{\boldsymbol{\iota}} \newcommand{\xb}{{\bf x}} \newcommand{\xbit}{{\bf x}_{it}} \newcommand{\xbi}{{\bf x}_{i}} \newcommand{\zb}{{\bf z}} \newcommand{\zbi}{{\bf z}_i} \newcommand{\wb}{{\bf w}} \newcommand{\yb}{{\bf y}} \newcommand{\ub}{{\bf u}} \newcommand{\Gb}{{\bf G}} \newcommand{\Hb}{{\bf H}} \newcommand{\thetab}{\boldsymbol{\theta}} \newcommand{\XBI}{{\bf x}_{i1},\ldots,{\bf x}_{iT}} \newcommand{\Sb}{{\bf S}} \newcommand{\Xb}{{\bf X}} \newcommand{\Xtb}{\tilde{\bf X}} \newcommand{\Wb}{{\bf W}} \newcommand{\Ab}{{\bf A}} \newcommand{\Bb}{{\bf B}} \newcommand{\Zb}{{\bf Z}} \newcommand{\Eb}{{\bf E}}\) This post was written jointly with Joerg Luedicke, Senior Social Scientist and Statistician, StataCorp.

**Overview**

We provide an introduction to parameter estimation by maximum likelihood and method of moments using **mlexp** and **gmm**, respectively (see **[R] mlexp** and **[R] gmm**). We include some background about these estimation techniques; see Pawitan (2001), Casella and Berger (2002), Cameron and Trivedi (2005), and Wooldridge (2010) for more details.

Maximum likelihood (ML) estimation finds the parameter values that make the observed data most probable. The parameters maximize the log of the likelihood function that specifies the probability of observing a particular set of data given a model.

Method of moments (MM) estimators specify population moment conditions and find the parameters that solve the equivalent sample moment conditions. MM estimators usually place fewer restrictions on the model than ML estimators, which implies that MM estimators are less efficient but more robust than ML estimators. Read more…
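The two definitions can be sketched in a few lines of Python. This is an illustration under assumptions of my own, not the **mlexp** or **gmm** syntax: the data are simulated from an exponential distribution with a made-up rate, the ML estimate is found by a golden-section search on the log likelihood, and the MM estimate solves the sample analogue of the population moment condition E[y] - 1/rate = 0 by bisection.

```python
import math
import random


def simulate(n=5000, rate=2.0, seed=7):
    """Hypothetical data: n draws from an exponential distribution."""
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(n)]


def ml_estimate(y, lo=1e-6, hi=100.0, tol=1e-8):
    """Maximize the exponential log likelihood by golden-section search."""
    n, total = len(y), sum(y)

    def log_likelihood(rate):
        return n * math.log(rate) - rate * total

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if log_likelihood(c) > log_likelihood(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2.0


def mm_estimate(y, lo=1e-6, hi=100.0, tol=1e-10):
    """Solve the sample moment condition mean(y) - 1/rate = 0 by bisection."""
    ybar = sum(y) / len(y)
    a, b = lo, hi
    while b - a > tol:
        mid = (a + b) / 2.0
        if ybar - 1.0 / mid > 0.0:  # the moment function is increasing in rate
            b = mid
        else:
            a = mid
    return (a + b) / 2.0


y = simulate()
ml_rate = ml_estimate(y)
mm_rate = mm_estimate(y)
closed_form = len(y) / sum(y)  # both estimators reduce to 1 / mean(y) here
print(f"ML: {ml_rate:.4f}, MM: {mm_rate:.4f}, 1/mean: {closed_form:.4f}")
```

In this simple case the two estimators coincide because the first moment identifies the only parameter. They generally differ once the moment conditions use less of the distributional information than the likelihood does.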

**gsem** is a very flexible command that allows us to fit very sophisticated models. However, it is also useful in situations that involve simple models.

For example, when we want to compare parameters across two or more models, we usually use **suest**, which combines the estimation results under one parameter vector and creates a simultaneous covariance matrix of the robust type. This covariance estimate is described in the *Methods and formulas* of **[R] suest** as the robust variance from a “stacked model”. In fact, **gsem** can fit these kinds of “stacked models”, even if the estimation samples are not the same and may overlap. By using the option **vce(robust)**, we can replicate the results from **suest**, provided that **gsem** can fit the models. In addition, **gsem** allows us to combine results from some estimation commands that are not supported by **suest**, such as models that include random effects. Read more…