Archive for August 2016

Two faces of misspecification in maximum likelihood: Heteroskedasticity and robust standard errors

For a nonlinear model with heteroskedasticity, a maximum likelihood estimator gives misleading inference and inconsistent marginal effect estimates unless I model the variance. Using a robust estimate of the variance–covariance matrix will not help me obtain correct inference.

This differs from the intuition we gain from linear regression. The estimates of the marginal effects in linear regression are consistent under heteroskedasticity, and using robust standard errors yields correct inference.
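The linear-regression intuition can be checked with a small simulation. The sketch below is only an illustration under assumed values (a true slope of 2, an error standard deviation equal to |x|, 5,000 observations); none of these numbers come from the post. It fits a simple regression by OLS and compares the conventional standard error of the slope with a heteroskedasticity-robust (HC0) one.

```python
import math
import random

def ols_with_se(x, y):
    """Simple regression of y on x with an intercept.

    Returns the slope, its conventional standard error, and its
    heteroskedasticity-robust (HC0) standard error.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Conventional formula: assumes one common error variance.
    sigma2 = sum(e * e for e in resid) / (n - 2)
    se_conv = math.sqrt(sigma2 / sxx)
    # HC0 formula: lets the error variance differ across observations.
    meat = sum(((xi - xbar) * e) ** 2 for xi, e in zip(x, resid))
    se_rob = math.sqrt(meat) / sxx
    return slope, se_conv, se_rob

# Assumed data-generating process (hypothetical values for illustration).
rng = random.Random(12345)
n_obs = 5000
x = [rng.gauss(0, 1) for _ in range(n_obs)]
y = [1.0 + 2.0 * xi + abs(xi) * rng.gauss(0, 1) for xi in x]

slope, se_conventional, se_robust = ols_with_se(x, y)
print(f"slope estimate:        {slope:.3f}")
print(f"conventional std. err.: {se_conventional:.4f}")
print(f"robust std. err.:       {se_robust:.4f}")
```

In this design the slope estimate stays close to the assumed true value, while the conventional standard error understates the sampling variability that the robust formula picks up.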

If robust standard errors do not solve the problems associated with heteroskedasticity for a nonlinear model estimated using maximum likelihood, what does it mean to use robust standard errors in this context? I answer this question using simulations and illustrate the effect of heteroskedasticity in nonlinear models estimated using maximum likelihood.

Group comparisons in structural equation models: Testing measurement invariance

When fitting almost any model, we may be interested in investigating whether parameters differ across groups such as time periods, age groups, gender, or school attended. In other words, we may wish to perform tests of moderation when the moderator variable is categorical. For regression models, this can be as simple as including group indicators in the model and interacting them with other predictors.

We naturally have hypotheses regarding differences in parameters across groups when fitting structural equation models as well. When these models involve latent variables and the corresponding observed measurements, we can test whether those measurements are invariant across groups. Evaluation of measurement invariance typically involves a series of tests for equality of measurement coefficients (factor loadings), equality of intercepts, and equality of error variances across groups.

In this post, I demonstrate how to use the sem command’s group() and ginvariant() options as well as the postestimation command estat ginvariant to easily perform tests of measurement invariance.

Exact matching on discrete covariates is the same as regression adjustment

I illustrate that exact matching on discrete covariates and regression adjustment (RA) with fully interacted discrete covariates perform the same nonparametric estimation.
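A minimal sketch of why the two estimators coincide, under stated assumptions: the data are simulated with two hypothetical discrete covariates `a` and `b`, every covariate cell contains both treated and control units, and exact matching averages over all exact matches. A fully interacted regression on discrete covariates reproduces the cell means, so the RA predictions are computed here directly as cell means rather than through a regression routine.

```python
import random
from collections import defaultdict

def mean(values):
    return sum(values) / len(values)

# Simulated data (all values below are illustrative assumptions).
rng = random.Random(42)
data = []
for _ in range(600):
    a = rng.randint(0, 1)            # binary covariate
    b = rng.randint(0, 2)            # three-level covariate
    p_treat = 0.3 + 0.2 * a + 0.1 * b
    d = 1 if rng.random() < p_treat else 0
    outcome = 1.0 + 0.5 * a - 0.3 * b + d * (1.0 + 0.4 * a * b) + rng.gauss(0, 1)
    data.append((a, b, d, outcome))

# Group outcomes by covariate cell and treatment arm.
cells = defaultdict(lambda: {0: [], 1: []})
for a, b, d, outcome in data:
    cells[(a, b)][d].append(outcome)
for cell, arms in cells.items():
    if not arms[0] or not arms[1]:
        raise ValueError(f"cell {cell} lacks treated or control units")

# Saturated (fully interacted) regression fits equal the cell means.
cell_mean = {cell: (mean(arms[0]), mean(arms[1])) for cell, arms in cells.items()}

# Regression adjustment: predict both potential outcomes for every unit.
ate_ra = mean([cell_mean[(a, b)][1] - cell_mean[(a, b)][0] for a, b, _, _ in data])

# Exact matching: keep the observed outcome and impute the missing
# potential outcome with the average of all exact matches in the other arm.
effects = []
for a, b, d, outcome in data:
    m0, m1 = cell_mean[(a, b)]
    effects.append(outcome - m0 if d == 1 else m1 - outcome)
ate_matching = mean(effects)

print(f"ATE, regression adjustment: {ate_ra:.6f}")
print(f"ATE, exact matching:        {ate_matching:.6f}")
```

Within each cell, the observed outcomes average to the cell mean of their own arm, which is why the two ATE estimates agree up to floating-point rounding.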

Vector autoregressions in Stata

Introduction

In a univariate autoregression, a stationary time-series variable \(y_t\) can often be modeled as depending on its own lagged values:

\begin{align}
y_t = \alpha_0 + \alpha_1 y_{t-1} + \alpha_2 y_{t-2} + \dots
+ \alpha_k y_{t-k} + \varepsilon_t
\end{align}

When one analyzes multiple time series, the natural extension to the autoregressive model is the vector autoregression, or VAR, in which a vector of variables is modeled as depending on their own lags and on the lags of every other variable in the vector. A two-variable VAR with one lag looks like

\begin{align}
y_t &= \alpha_{0} + \alpha_{1} y_{t-1} + \alpha_{2} x_{t-1}
+ \varepsilon_{1t} \\
x_t &= \beta_0 + \beta_{1} y_{t-1} + \beta_{2} x_{t-1}
+ \varepsilon_{2t}
\end{align}
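To make the two-variable system concrete, the sketch below simulates it and recovers the coefficients by running OLS on each equation separately, which is a standard way to estimate an unrestricted VAR. The coefficient values, the independent standard normal errors, the sample size, and the helper names `solve_linear` and `ols` are all assumptions made for illustration.

```python
import random

def solve_linear(a, b):
    """Solve a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef

def ols(rows, target):
    """OLS coefficients from the normal equations (X'X) coef = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    return solve_linear(xtx, xty)

# Assumed true parameters: (constant, coefficient on y_{t-1}, coefficient on x_{t-1}).
alpha = [0.5, 0.5, 0.2]    # y equation
beta = [1.0, -0.1, 0.4]    # x equation

rng = random.Random(2016)
n_periods, burn_in = 5000, 200
y, x = [0.0], [0.0]
for _ in range(n_periods + burn_in):
    y_prev, x_prev = y[-1], x[-1]
    y.append(alpha[0] + alpha[1] * y_prev + alpha[2] * x_prev + rng.gauss(0, 1))
    x.append(beta[0] + beta[1] * y_prev + beta[2] * x_prev + rng.gauss(0, 1))
y, x = y[burn_in:], x[burn_in:]   # drop the burn-in periods

# Both equations share the same regressors: a constant and one lag of each variable.
lagged = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
alpha_hat = ols(lagged, y[1:])
beta_hat = ols(lagged, x[1:])

print("y equation:", [round(c, 3) for c in alpha_hat])
print("x equation:", [round(c, 3) for c in beta_hat])
```

The assumed coefficients keep the system stationary, so the estimates should settle near the true values as the sample grows.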

Applied macroeconomists use models of this form both to describe macroeconomic data and to perform causal inference and provide policy advice.

In this post, I will estimate a three-variable VAR using the U.S. unemployment rate, the inflation rate, and the nominal interest rate. This VAR is similar to those used in macroeconomics for monetary policy analysis. I focus on basic issues in estimation and postestimation. Data and do-files are provided at the end. Additional background and theoretical details can be found in Ashish Rajbhandari’s earlier post, which explored VAR estimation using simulated data.

Multiple-equation models: Estimation and marginal effects using gmm

We estimate the average treatment effect (ATE) for an exponential mean model with an endogenous treatment. We have a two-step estimation problem where the first step corresponds to the treatment model and the second to the outcome model. As shown in the post "Using gmm to solve two-step estimation problems", this can be solved with the generalized method of moments using gmm.

This continues the series of posts where we illustrate how to obtain correct standard errors and marginal effects for models with multiple steps. In the previous posts, we used gsem and mlexp to estimate the parameters of models with separable likelihoods. In the current model, because the treatment is endogenous, the likelihood for the model is no longer separable. We demonstrate how we can use gmm to estimate the parameters in these situations.