teffects ipw uses multinomial logit to estimate the weights needed to estimate the potential-outcome means (POMs) from a multivalued treatment. I show how to estimate the POMs when the weights come from an ordered probit model. Moment conditions define the ordered probit estimator and the subsequent weighted average used to estimate the POMs. I use gmm to obtain consistent standard errors by stacking the ordered-probit moment conditions and the weighted mean moment conditions. Read more…
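As a rough sketch of the idea (not the gmm-based estimator developed in the post, and with hypothetical variable names y, t, and x), the weights could be formed by hand from an ordered probit fit; the standard errors reported by mean below ignore the fact that the weights were estimated, which is exactly what stacking the moment conditions in gmm fixes.

    * Sketch only: outcome y, three-level treatment t (0, 1, 2), covariate x
    oprobit t x
    predict double p0 p1 p2, pr
    generate double ipw = cond(t==0, 1/p0, cond(t==1, 1/p1, 1/p2))
    mean y [pweight=ipw], over(t)    // POM estimates; SEs ignore the estimated weights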
For a nonlinear model with heteroskedasticity, a maximum likelihood estimator gives misleading inference and inconsistent marginal effect estimates unless I model the variance. Using a robust estimate of the variance–covariance matrix will not help me obtain correct inference.
This differs from the intuition we gain from linear regression. The estimates of the marginal effects in linear regression are consistent under heteroskedasticity, and using robust standard errors yields correct inference.
If robust standard errors do not solve the problems associated with heteroskedasticity for a nonlinear model estimated using maximum likelihood, what does it mean to use robust standard errors in this context? I answer this question using simulations and illustrate the effect of heteroskedasticity in nonlinear models estimated using maximum likelihood. Read more…
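A minimal simulation along these lines (my own sketch, with made-up coefficients, not the code from the post) makes the point for a probit model whose latent-error variance depends on a covariate; hetprobit models that variance directly.

    * Sketch: heteroskedastic latent error in a binary-outcome DGP
    clear
    set seed 12345
    set obs 10000
    generate double x = rnormal()
    generate double e = rnormal(0, exp(0.5*x))   // error sd depends on x
    generate byte y = (0.5 + x + e > 0)
    probit y x, vce(robust)    // inconsistent for the DGP parameters, robust SEs or not
    hetprobit y x, het(x)      // modeling the variance recovers them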
I illustrate that exact matching on discrete covariates and regression adjustment (RA) with fully interacted discrete covariates perform the same nonparametric estimation. Read more…
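The regression-adjustment half of that equivalence can be sketched as follows (hypothetical variables: outcome y, binary treatment t, discrete covariates a and b, with every treatment-by-covariate cell populated); the post shows that exact matching on a and b (for example, via the ematch() option of teffects nnmatch) produces the same estimate because both reduce to comparisons of cell means.

    * Sketch: RA with fully interacted discrete covariates is a saturated cell-mean model
    teffects ra (y i.a##i.b) (t)
    * the same point estimate from a saturated regression plus margins
    regress y i.t##i.a##i.b
    margins r.t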
We estimate the average treatment effect (ATE) for an exponential mean model with an endogenous treatment. We have a two-step estimation problem where the first step corresponds to the treatment model and the second to the outcome model. As shown in Using gmm to solve two-step estimation problems, this can be solved with the generalized method of moments using gmm.
This continues the series of posts where we illustrate how to obtain correct standard errors and marginal effects for models with multiple steps. In the previous posts, we used gsem and mlexp to estimate the parameters of models with separable likelihoods. In the current model, because the treatment is endogenous, the likelihood for the model is no longer separable. We demonstrate how we can use gmm to estimate the parameters in these situations. Read more…
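The mechanics of stacking two sets of moment conditions in gmm look roughly like the sketch below. The equations, variable names (y, t, x, z), and instruments are placeholders chosen only to show the joint-estimation syntax; the actual moment conditions for the endogenous-treatment exponential mean model are derived in the post, and there the outcome moments depend on the treatment-equation parameters, which is why joint estimation is needed.

    * Schematic only: two residual equations estimated jointly
    gmm (treat:   t - normal({a_z}*z + {a0}))              ///
        (outcome: y - exp({b_t}*t + {b_x}*x + {b0}))       ///
        , instruments(treat: z) instruments(outcome: x z)  ///
          winitial(unadjusted, independent) onestep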
Differences in conditional probabilities and ratios of odds are two common measures of the effect of a covariate in binary-outcome models. I show how these measures differ in terms of conditional-on-covariate effects versus population-parameter effects. Read more…
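A quick sketch of the two measures after a logit fit (hypothetical variables: binary outcome y, binary covariate of interest x1, control x2):

    logit y i.x1 x2, or    // odds ratio for x1, a single population parameter
    margins, dydx(x1)      // difference in probabilities, averaged over the sample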
The change in a regression function that results from an everything-else-held-equal change in a covariate defines an effect of a covariate. I am interested in estimating and interpreting effects that are conditional on the covariates and averages of effects that vary over the individuals. I illustrate that these two types of effects answer different questions. Doctors, parents, and consultants frequently ask individuals for their covariate values to make individual-specific recommendations. Policy analysts use a population-averaged effect that accounts for the variation of the effects over the individuals. Read more…
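To make the distinction concrete, here is a sketch with a probit model and hypothetical variables y, x, and z:

    probit y x z
    margins, dydx(x) at(x=2 z=1)   // effect conditional on specific covariate values
    margins, dydx(x)               // effect averaged over the individuals in the sample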
I want to estimate, graph, and interpret the effects of nonlinear models with interactions of continuous and discrete variables. The results I am after are not trivial, but obtaining what I want using margins, marginsplot, and factor-variable notation is straightforward. Read more…
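A sketch of that workflow for a continuous-by-discrete interaction (hypothetical variables: binary outcome y, continuous x, three-level group g):

    probit y c.x##i.g
    margins g, at(x=(0(2)10))           // predicted probabilities by group across x
    marginsplot
    margins g, dydx(x) at(x=(0(2)10))   // effect of x by group across x
    marginsplot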
As of Stata 16, see [BAYES] bayesstats grubin and Bayesian analysis: Gelman-Rubin convergence diagnostic.
The original blog posted May 26, 2016, omitted option initrandom from the bayesmh command. The code and the text of the blog entry were updated on August 9, 2018, to reflect this.
Overview
MCMC algorithms used for simulating posterior distributions are indispensable tools in Bayesian analysis. A major consideration in MCMC simulations is that of convergence. Has the simulated Markov chain fully explored the target posterior distribution so far, or do we need longer simulations? A common approach to assessing MCMC convergence is based on running multiple chains and analyzing the differences between them.
For a given Bayesian model, bayesmh is capable of producing multiple Markov chains with randomly dispersed initial values by using the initrandom option, available as of the update on 19 May 2016. In this post, I demonstrate the Gelman–Rubin diagnostic as a more formal test for convergence using multiple chains. For graphical diagnostics, see Graphical diagnostics using multiple chains in [BAYES] bayesmh for more details. To compute the Gelman–Rubin diagnostic, I use an unofficial command, grubin, which can be installed by typing the following in Stata: Read more…
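The install line itself appears in the full post. To show the setup the diagnostic needs, here is a rough sketch (my own, with an illustrative linear model and file names) of producing two chains with dispersed initial values so that their draws can be compared:

    * Sketch: two chains for the same model, random starting values, saved for comparison
    set seed 14
    bayesmh y x, likelihood(normal({var})) prior({y:}, normal(0, 100)) ///
        prior({var}, igamma(0.01, 0.01)) initrandom saving(chain1, replace)
    set seed 41
    bayesmh y x, likelihood(normal({var})) prior({y:}, normal(0, 100)) ///
        prior({var}, igamma(0.01, 0.01)) initrandom saving(chain2, replace)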
Initial thoughts
Estimating causal relationships from data is one of the fundamental endeavors of researchers. Ideally, we could conduct a controlled experiment to estimate causal relations. However, conducting a controlled experiment may be infeasible. For example, education researchers cannot randomize educational attainment, so they must learn from observational data.
In the absence of experimental data, we use observational data to build models that capture the relevant features of the causal relationship we are interested in. Models are successful if the features we did not include can be ignored without affecting our ability to ascertain the causal relationship we are interested in. Sometimes, however, ignoring some features of reality yields models whose estimated relationships cannot be interpreted causally. In a regression framework, depending on our discipline or research question, we give this phenomenon different names: endogeneity, omitted confounders, omitted-variable bias, simultaneity bias, selection bias, etc.
Below I show how we can understand many of these problems in a unified regression framework and use simulated data to illustrate how they affect estimation and inference. Read more…
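As a taste of those simulations, here is a tiny sketch of one such problem, omitted-variable bias, with made-up coefficients:

    * Sketch: an unobserved confounder c affects both x and y
    clear
    set seed 12345
    set obs 10000
    generate double c = rnormal()
    generate double x = 0.5*c + rnormal()
    generate double y = 1 + x + c + rnormal()
    regress y x      // biased: part of c's effect is attributed to x
    regress y x c    // recovers the coefficient of 1 on x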
Overview
In the frequentist approach to statistics, estimators are random variables because they are functions of random data. The finite-sample distributions of most of the estimators used in applied work are not known, because the estimators are complicated nonlinear functions of random data. These estimators have large-sample convergence properties that we use to approximate their behavior in finite samples.
Two key convergence properties are consistency and asymptotic normality. A consistent estimator gets arbitrarily close in probability to the true value. The distribution of an asymptotically normal estimator gets arbitrarily close to a normal distribution as the sample size increases. We use a recentered and rescaled version of this normal distribution to approximate the finite-sample distribution of our estimators.
I illustrate the meaning of consistency and asymptotic normality by Monte Carlo simulation (MCS). I use some of the Stata mechanics I discussed in Monte Carlo simulations using Stata.
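To give the flavor of those simulations, here is a sketch in the style of that earlier post, with the sample mean of chi-squared(1) data standing in for a generic estimator:

    * Sketch: draw many samples, estimate the mean in each, and study the estimates
    clear all
    program define mcmean, rclass
        drop _all
        set obs 1000
        generate double y = rchi2(1)    // true mean is 1
        summarize y
        return scalar mean = r(mean)
    end
    simulate mean=r(mean), reps(2000) seed(12345): mcmean
    summarize mean               // centered near the true value of 1
    histogram mean, normal       // roughly normal, as asymptotic theory predicts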
Consistent estimator
A consistent estimator gets arbitrarily close in Read more…