We discuss estimating population-averaged parameters when some of the data are missing. In particular, we show how to use **gmm** to estimate population-averaged parameters for a probit model when the process that causes some of the data to be missing is a function of observable covariates and a random process that is independent of the outcome. This type of missing data is known as missing at random, selection on observables, and exogenous sample selection.

This is a follow-up to an earlier post where we estimated the parameters of a probit model under endogenous sample selection (http://blog.stata.com/2015/11/05/using-mlexp-to-estimate-endogenous-treatment-effects-in-a-probit-model/). In endogenous sample selection, the random process that affects which observations are missing is correlated with an unobservable random process that affects the outcome. Read more…

In Stata 14.2, we added the ability to use **margins** to estimate covariate effects after **gmm**. In this post, I illustrate how to use **margins** and **marginsplot** after **gmm** to estimate covariate effects for a probit model.

Margins are statistics calculated from predictions of a previously fit model at fixed values of some covariates and averaging or otherwise integrating over the remaining covariates. They can be used to estimate population average parameters like the marginal mean, average treatment effect, or the average effect of a covariate on the conditional mean. I will demonstrate how using margins is useful after estimating a model with the generalized method of moments. Read more…

In a previous post I illustrated that the probit model and the logit model produce statistically equivalent estimates of marginal effects. In this post, I compare the marginal effect estimates from a linear probability model (linear regression) with marginal effect estimates from probit and logit models.

My simulations show that when the true model is a probit or a logit, using a linear probability model can produce inconsistent estimates of the marginal effects of interest to researchers. The conclusions hinge on the probit or logit model being the true model.

**Simulation results**

For all simulations below, I use a sample size of 10,000 and 5,000 replications. The true data-generating processes (DGPs) are constructed using Read more…

We often use probit and logit models to analyze binary outcomes. A case can be made that the logit model is easier to interpret than the probit model, but Stata’s **margins** command makes any estimator easy to interpret. Ultimately, estimates from both models produce similar results, and using one or the other is a matter of habit or preference.

I show that the estimates from a probit and logit model are similar for the computation of a set of effects that are of interest to researchers. I focus on the effects of changes in the covariates on the probability of a positive outcome for continuous and discrete covariates. I evaluate these effects on average and at the mean value of the covariates. In other words, I study the average marginal effects (AME), the average treatment effects (ATE), the marginal effects at the mean values of the covariates (MEM), and the treatment effects at the mean values of the covariates (TEM).

First, I present the results. Second, I discuss the code used for the simulations.

**Results**

In Table 1, I present the results of a simulation with 4,000 replications when the true data generating process (DGP) satisfies the assumptions of a probit model. I show the Read more…

I use features new to Stata 14.1 to estimate an average treatment effect (ATE) for a probit model with an endogenous treatment. In 14.1, we added new prediction statistics after **mlexp** that **margins** can use to estimate an ATE.

I am building on a previous post in which I demonstrated how to use **mlexp** to estimate the parameters of a probit model with sample selection. Our results match those obtained with **biprobit**; see [R] biprobit for more details. In a future post, I use these techniques to estimate treatment-effect parameters not yet available from another Stata command. Read more…

**Overview**

In a previous post, David Drukker demonstrated how to use **mlexp** to estimate the degree of freedom parameter in a chi-squared distribution by maximum likelihood (ML). In this post, I am going to use **mlexp** to estimate the parameters of a probit model with sample selection. I will illustrate how to specify a more complex likelihood in **mlexp** and provide intuition for the probit model with sample selection. Our results match the **heckprobit** command; see **[R] heckprobit** for more details. Read more…