Archive for June 2016

Flexible discrete choice modeling using a multinomial probit model, part 1

We have no choice but to choose

We make choices every day, and often we choose among a finite number of alternatives. For example, do we take the car or ride a bike to get to work? Will we have dinner at home or eat out, and if we eat out, where do we go? Scientists, marketing analysts, and political consultants, to name a few, want to find out why people choose what they choose.

In this post, Read more…

Unit-root tests in Stata

Determining the stationarity of a time series is a key step before embarking on any analysis. The statistical properties of most estimators in time series rely on the data being (weakly) stationary. Loosely speaking, a weakly stationary process is characterized by a time-invariant mean, variance, and autocovariance.
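In symbols, weak stationarity requires that the first two moments of the series \(y_t\) not depend on calendar time:
\[
E(y_t) = \mu, \qquad \mathrm{Var}(y_t) = \sigma^2, \qquad \mathrm{Cov}(y_t, y_{t-k}) = \gamma_k \quad \text{for all } t \text{ and each lag } k .
\]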

Most observed series, however, contain a trend component that renders them nonstationary. The trend can be either deterministic or stochastic, and the transformation that yields a stationary series depends on which type of trend is present. A stochastic trend, commonly known as a unit root, is eliminated by differencing the series. However, differencing a series that in fact contains a deterministic trend induces a unit root in the moving-average component. Similarly, subtracting a deterministic trend from a series that in fact contains a stochastic trend does not render the series stationary. Hence, it is important to identify whether nonstationarity is due to a deterministic or a stochastic trend before applying the appropriate transformation.
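As a quick illustration of the distinction, here is a minimal sketch on simulated data (it is not taken from the full post, and the variable names are made up). The augmented Dickey-Fuller test reported by dfuller reacts differently to a simulated random walk and a simulated trend-stationary series:

* sketch only: simulated examples of the two kinds of trend
clear
set seed 2016
set obs 200
generate t = _n
tsset t
generate e  = rnormal()
generate rw = sum(e)                 // stochastic trend: a pure random walk
generate dt = 0.05*t + rnormal()     // deterministic trend plus noise
dfuller rw, lags(1)                  // typically fails to reject a unit root
dfuller D.rw, lags(1)                // the first difference is stationary
dfuller dt, trend lags(1)            // typically rejects a unit root around a trend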

In this post, Read more…

Multiple equation models: Estimation and marginal effects using mlexp

We continue our series of posts on how to obtain correct standard errors and marginal effects for models with multiple steps. In this post, we estimate the marginal effects and standard errors for a hurdle model with two hurdles and a lognormal outcome using mlexp, which lets us estimate the parameters of multiple-equation models by maximum likelihood. In the previous post (Multiple equation models: Estimation and marginal effects using gsem), we used gsem to estimate marginal effects and standard errors for a hurdle model with two hurdles and an exponential mean outcome.

We exploit the fact that the hurdle-model likelihood is separable and the joint log likelihood is the sum of the individual hurdle and outcome log likelihoods. We estimate the parameters of each hurdle and the outcome separately to get initial values. Then, we use mlexp to estimate the parameters of the model and margins to obtain marginal effects. Read more…
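For a feel for the mechanics before clicking through, here is a minimal sketch, not the two-hurdle model from the post: a single hurdle with a probit participation equation and a lognormal outcome for the positive observations, written as an mlexp substitutable expression. The variable names (y, x1, x2, z1, z2) are hypothetical.

* simplified sketch: one probit hurdle plus a lognormal outcome for y > 0
mlexp (cond(y == 0,                                                    ///
           ln(1 - normal({zg: z1 z2 _cons})),                          ///
           ln(normal({zg:}))                                           ///
             + ln(normalden(ln(y), {xb: x1 x2 _cons}, exp({lnsigma}))) ///
             - ln(y)))

The full post builds the two-hurdle analogue of this expression, supplies initial values from the separate fits, and then uses margins on the jointly estimated model to obtain marginal effects and their standard errors.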

Multiple equation models: Estimation and marginal effects using gsem

Starting point: A hurdle model with multiple hurdles

In a sequence of posts, we are going to illustrate how to obtain correct standard errors and marginal effects for models with multiple steps.

Our inspiration for this post is an old Statalist inquiry about how to obtain marginal effects for a hurdle model with more than one hurdle (http://www.statalist.org/forums/forum/general-stata-discussion/general/1337504-estimating-marginal-effect-for-triple-hurdle-model). Hurdle models have the appealing property that their likelihood is separable. Each hurdle has its own likelihood and regressors. You can estimate each of these hurdles separately to obtain point estimates. However, you cannot obtain correct standard errors or marginal effects for the model as a whole this way.
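To see what separability means here, write the two hurdles and the outcome with their own parameter vectors (the notation is generic, not taken from the post). The joint log likelihood is simply the sum of the three pieces,
\[
\ln L(\boldsymbol{\gamma}_1, \boldsymbol{\gamma}_2, \boldsymbol{\beta}) = \ln L_1(\boldsymbol{\gamma}_1) + \ln L_2(\boldsymbol{\gamma}_2) + \ln L_y(\boldsymbol{\beta}),
\]
so each piece can be maximized on its own to get point estimates. The catch is in the standard errors and marginal effects of quantities that combine the equations, which is what the joint estimation in this post delivers.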

In this post, Read more…

Tests of forecast accuracy and forecast encompassing

Applied time-series researchers often want to compare the accuracy of a pair of competing forecasts. A popular statistic for forecast comparison is the mean squared forecast error (MSFE), a smaller value of which implies a better forecast. However, a formal test, such as that of Diebold and Mariano (1995), is needed to determine whether the superiority of one forecast is statistically significant or simply due to sampling variability.
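To fix notation with squared-error loss (the loss behind the MSFE), let \(e_{1t}\) and \(e_{2t}\) denote the errors of the two forecasts over \(P\) out-of-sample periods. The Diebold-Mariano test is built on the loss differential:
\[
d_t = e_{1t}^2 - e_{2t}^2, \qquad \bar d = \frac{1}{P}\sum_{t=1}^{P} d_t, \qquad \mathrm{DM} = \frac{\bar d}{\sqrt{\widehat{\mathrm{Var}}(\bar d)}},
\]
where \(\widehat{\mathrm{Var}}(\bar d)\) is a long-run (HAC) variance estimate that accounts for serial correlation in \(d_t\). Under the null hypothesis of equal forecast accuracy, the statistic is asymptotically standard normal.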

A related test is the forecast encompassing test. This test is used to determine whether one of the forecasts encompasses all the relevant information from the other. The resulting test statistic may lead a researcher to either combine the two forecasts or drop the forecast that contains no additional information.
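One standard way to cast the encompassing idea (a sketch, not necessarily the exact formulation used in the post): let \(\lambda\) be the weight on the second forecast in the combination \((1-\lambda)\hat y_{1t} + \lambda \hat y_{2t}\). The combined forecast error is \(e_{1t} - \lambda(e_{1t} - e_{2t})\), which suggests the regression
\[
e_{1t} = \lambda\,(e_{1t} - e_{2t}) + \varepsilon_t .
\]
Failing to reject \(H_0\colon \lambda = 0\) says that the first forecast encompasses the second, so the second adds no useful information; rejecting it favors combining the two forecasts.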

In this post, Read more…