Archive

Posts Tagged ‘biostatistics’

Fitting distributions using bayesmh

This post was written jointly with Yulia Marchenko, Executive Director of Statistics, StataCorp.

As of update 03 Mar 2016, bayesmh provides a more convenient way of fitting distributions to the outcome variable. By design, bayesmh is a regression command, which models the mean of the outcome distribution as a function of predictors. There are cases when we do not have any predictors and want to model the outcome distribution directly. For example, we may want to fit a Poisson distribution or a binomial distribution to our outcome. This can now be done by specifying one of the four new distributions supported by bayesmh in the likelihood() option: dexponential(), dbernoulli(), dbinomial(), or dpoisson(). Previously, the suboption noglmtransform of bayesmh‘s option likelihood() was used to fit the exponential, binomial, and Poisson distributions to the outcome variable. This suboption continues to work but is now undocumented.

For examples, see Beta-binomial model, Bayesian analysis of change-point problem, and Item response theory under Remarks and examples in [BAYES] bayesmh.

We have also updated our earlier “Bayesian binary item response theory models using bayesmh” blog entry to use the new dbernoulli() specification when fitting 3PL, 4PL, and 5PL IRT models.

Using mlexp to estimate endogenous treatment effects in a heteroskedastic probit model

I use features new to Stata 14.1 to estimate an average treatment effect (ATE) for a heteroskedastic probit model with an endogenous treatment. In 14.1, we added new prediction statistics after mlexp that margins can use to estimate an ATE.

I am building on a previous post in which I demonstrated how to use mlexp to estimate the parameters of a probit model with an endogenous treatment and used margins to estimate the ATE for the model Using mlexp to estimate endogenous treatment effects in a probit model. Currently, no official commands estimate the heteroskedastic probit model with an endogenous treatment, so in this post I show how mlexp can be used to extend the models estimated by Stata. Read more…

Estimating parameters by maximum likelihood and method of moments using mlexp and gmm

\(\newcommand{\epsilonb}{\boldsymbol{\epsilon}}
\newcommand{\ebi}{\boldsymbol{\epsilon}_i}
\newcommand{\Sigmab}{\boldsymbol{\Sigma}}
\newcommand{\Omegab}{\boldsymbol{\Omega}}
\newcommand{\Lambdab}{\boldsymbol{\Lambda}}
\newcommand{\betab}{\boldsymbol{\beta}}
\newcommand{\gammab}{\boldsymbol{\gamma}}
\newcommand{\Gammab}{\boldsymbol{\Gamma}}
\newcommand{\deltab}{\boldsymbol{\delta}}
\newcommand{\xib}{\boldsymbol{\xi}}
\newcommand{\iotab}{\boldsymbol{\iota}}
\newcommand{\xb}{{\bf x}}
\newcommand{\xbit}{{\bf x}_{it}}
\newcommand{\xbi}{{\bf x}_{i}}
\newcommand{\zb}{{\bf z}}
\newcommand{\zbi}{{\bf z}_i}
\newcommand{\wb}{{\bf w}}
\newcommand{\yb}{{\bf y}}
\newcommand{\ub}{{\bf u}}
\newcommand{\Gb}{{\bf G}}
\newcommand{\Hb}{{\bf H}}
\newcommand{\thetab}{\boldsymbol{\theta}}
\newcommand{\XBI}{{\bf x}_{i1},\ldots,{\bf x}_{iT}}
\newcommand{\Sb}{{\bf S}} \newcommand{\Xb}{{\bf X}}
\newcommand{\Xtb}{\tilde{\bf X}}
\newcommand{\Wb}{{\bf W}}
\newcommand{\Ab}{{\bf A}}
\newcommand{\Bb}{{\bf B}}
\newcommand{\Zb}{{\bf Z}}
\newcommand{\Eb}{{\bf E}}\) This post was written jointly with Joerg Luedicke, Senior Social Scientist and Statistician, StataCorp.

Overview

We provide an introduction to parameter estimation by maximum likelihood and method of moments using mlexp and gmm, respectively (see [R] mlexp and [R] gmm). We include some background about these estimation techniques; see Pawitan (2001, Casella and Berger (2002), Cameron and Trivedi (2005), and Wooldridge (2010) for more details.

Maximum likelihood (ML) estimation finds the parameter values that make the observed data most probable. The parameters maximize the log of the likelihood function that specifies the probability of observing a particular set of data given a model.

Method of moments (MM) estimators specify population moment conditions and find the parameters that solve the equivalent sample moment conditions. MM estimators usually place fewer restrictions on the model than ML estimators, which implies that MM estimators are less efficient but more robust than ML estimators. Read more…

Efficiency comparisons by Monte Carlo simulation

Overview

In this post, I show how to use Monte Carlo simulations to compare the efficiency of different estimators. I also illustrate what we mean by efficiency when discussing statistical estimators.

I wrote this post to continue a dialog with my friend who doubted the usefulness of the sample average as an estimator for the mean when the data-generating process (DGP) is a \(\chi^2\) distribution with \(1\) degree of freedom, denoted by a \(\chi^2(1)\) distribution. The sample average is a fine estimator, even though it is not the most efficient estimator for the mean. (Some researchers prefer to estimate the median instead of the mean for DGPs that generate outliers. I will address the trade-offs between these parameters in a future post. For now, I want to stick to estimating the mean.)

In this post, I also want to illustrate that Monte Carlo simulations can help explain abstract statistical concepts. I show how to use a Monte Carlo simulation to illustrate the meaning of an abstract statistical concept. (If you are new to Monte Carlo simulations in Stata, you might want to see Monte Carlo simulations using Stata.) Read more…

Maximum likelihood estimation by mlexp: A chi-squared example

Overview

In this post, I show how to use mlexp to estimate the degree of freedom parameter of a chi-squared distribution by maximum likelihood (ML). One example is unconditional, and another example models the parameter as a function of covariates. I also show how to generate data from chi-squared distributions and I illustrate how to use simulation methods to understand an estimation technique. Read more…

Monte Carlo simulations using Stata

Overview

A Monte Carlo simulation (MCS) of an estimator approximates the sampling distribution of an estimator by simulation methods for a particular data-generating process (DGP) and sample size. I use an MCS to learn how well estimation techniques perform for specific DGPs. In this post, I show how to perform an MCS study of an estimator in Stata and how to interpret the results.

Large-sample theory tells us that the sample average is a good estimator for the mean when the true DGP is a random sample from a \(\chi^2\) distribution with 1 degree of freedom, denoted by \(\chi^2(1)\). But a friend of mine claims this estimator will not work well for this DGP because the \(\chi^2(1)\) distribution will produce outliers. In this post, I use an MCS to see if the large-sample theory works well for this DGP in a sample of 500 observations. Read more…