The Stata Blog » Stata 15 announced, available now

Stata 15 announced, available now

6 June 2017 William Gould, President Emeritus

We announced Stata 15 today. It’s a big deal because this is Stata’s biggest release ever.

I posted to Statalist this morning and listed sixteen of the most important new features. Here on the blog I will say more about them, and you can learn even more by visiting our website and seeing the Stata 15 features page.

I go into depth below on the sixteen highlighted features. They are:

Extended regression models
Latent class analysis (LCA)
Bayesian prefix command
Linearized dynamic stochastic general equilibrium (DSGE) models
Dynamic Markdown documents for the web
Nonlinear mixed-effects models
Spatial autoregressive models (SAR)
Interval-censored parametric survival-time models
Finite mixture models (FMMs)
Mixed logit models
Nonparametric regression
Power analysis for cluster randomized designs and regression models
Word and PDF documents
Graph color transparency/opacity
ICD-10-CM/PCS support
Federal Reserve Economic Data (FRED) support
And more

The sixteen features listed above are certainly important ones, but there are others worthy of mention. More come readily to mind:

Bayesian multilevel models
Threshold regression
Panel-data tobit with random coefficients
Multilevel regression for interval-measured outcomes
Multilevel tobit regression for censored outcomes
Panel data cointegration tests
Tests for multiple breaks in time series
Multiple-group generalized SEM
Heteroskedastic linear regression
Poisson models with Heckman-style sample selection
Panel-data nonlinear models with random coefficients
Bayesian panel-data models
Panel-data interval regression with random coefficients
SVG export
Bayesian survival models
Zero-inflated ordered probit
Add your own power and sample-size methods
Bayesian sample-selection models
Stata in Swedish
Improvements to the Do-file Editor
Stream random-number generator
Improvements for Java plugins
More parallelization in Stata/MP

1. Extended regression models

We call them ERMs—extended regression models. Four new commands fit

linear regressions,
interval regressions including tobit,
probit, and
ordered probit models

with any combination of

endogenous covariates,
nonrandom treatment assignment, and
endogenous (Heckman-style) sample selection.

These new commands are just short of amazing because you can put endogenous covariates in any of the equations, and that includes the treatment-assignment and probit-selection equations. And endogenous covariates are not limited to being continuous. They can be binary or ordinal. And they can be interacted with other covariates, whether exogenous or endogenous. They can even be interacted with themselves to form squared or cubic terms!

These new ERM commands (eregress, eintreg, eprobit, and eoprobit) are destined to become popular because they address so many of the problems researchers have. First, you might have an endogenous variable because many models omit variables that are correlated with the variables in the model. Next, data are often censored, and the censoring is not random. ERM sample-selection options allow you to model the sample-selection process and so adjust for it. Or if you are fitting a treatment-effects model with nonrandom assignment, you can use ERM treatment-assignment options. Or you can combine the treatment-assignment and selection options, which will be of special interest to those fitting endogenous treatment-assignment models in which some subjects are lost to follow-up.

The syntax is simple:

. eregress   y x1 x2

. eregress   y x1 x2,  endogenous(       x2 = x3 x4, nomain)

. eregress   y x1 x2,  endogenous(       x2 = x3 x4, nomain) 
                           select( selected = x2 x5)          

. eregress   y x1 x2,  endogenous(       x2 = x3 x4, nomain) 
                          entreat(  treated = x2 x5)          

. eregress   y x1 x2,  endogenous(      x2 = x3 x5, nomain)
                          entreat( treated = x2 x3 x4)          
                           select(selected = x2 x6)
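
To see what the endogenous() option buys you, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the seed, the sample size, the data-generating process, and the unobserved variable u are hypothetical, and the variable names simply mirror the ones used above. The covariate x2 shares the unobserved component u with the error of y, so an ordinary regression should be biased, while eregress with x3 and x4 as instruments should recover something close to the true coefficient of 2.

```stata
* Hypothetical simulated data: x2 is endogenous because it shares the
* unobserved component u with the error term of y.
clear
set seed 12345
set obs 5000
generate x1 = rnormal()
generate x3 = rnormal()
generate x4 = rnormal()
generate u  = rnormal()                          // unobserved, omitted from the model
generate x2 = 0.5*x3 + 0.5*x4 + u + rnormal()
generate y  = 1 + x1 + 2*x2 - u + rnormal()      // true coefficient on x2 is 2

* Ordinary regression ignores the correlation between x2 and the error
regress y x1 x2

* ERM: x3 and x4 appear only in the equation for x2; x2 is added to the
* main equation automatically
eregress y x1, endogenous(x2 = x3 x4)
```

In this setup the regress coefficient on x2 should fall noticeably below 2, and the eregress estimate should sit much closer to it. The exact numbers depend on the seed.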

eregress fits linear regressions. You can just as easily fit a probit model as a linear regression model. If the outcome variable y is binary, type

. eprobit    y x1 x2,  endogenous(      x2 = x3 x5, nomain)
                          entreat( treated = x2 x3 x4)
                           select(selected = x2 x6)
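
The same idea carries over to a binary outcome. The sketch below is again built on hypothetical simulated data (the seed, the coefficients, and the unobserved variable u are assumptions, not anything from a real dataset). The error in the latent outcome equation is constructed to have unit variance, which matches the probit normalization, so the true coefficient on x2 is 1.

```stata
* Hypothetical simulated data for a binary outcome with an endogenous covariate
clear
set seed 6062017
set obs 5000
generate x1  = rnormal()
generate x3  = rnormal()
generate x4  = rnormal()
generate u   = rnormal()                       // unobserved, omitted from the model
generate x2  = 0.5*x3 + 0.5*x4 + u + rnormal()
generate e_y = -0.6*u + 0.8*rnormal()          // variance 0.36 + 0.64 = 1
generate y   = (0.5 + x1 + x2 + e_y > 0)       // binary outcome; true coefficient on x2 is 1

* Plain probit treats x2 as exogenous
probit y x1 x2

* eprobit models x2 jointly with the outcome, using x3 and x4
eprobit y x1, endogenous(x2 = x3 x4)
```

Comparing the two sets of coefficients shows how far the plain probit drifts when x2 is correlated with the outcome error.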

If the outcome variable y is continuous but the variable x2 is binary, type

. eregress   y x1 x2,  endogenous(     x2 = x3 x5, probit nomain)
                         entreat( treated = x2 x3 x4) 
                          select(selected = x2 x6)

If both y and x2 are binary, type

. eprobit    y x1 x2,  endogenous(     x2 = x3 x5, probit nomain)
                         entreat( treated = x2 x3 x4) 
                          select(selected = x2 x6)

In case you are wondering about the strange nomain option, it is a detail. When you specify endogenous(name=…), variable name is added to the main equation automatically. You can type

. eregress y x1,     endogenous(x2=x3 x4)

. eregress y x1 x2,  endogenous(x2=x3 x4, nomain)

and, either way, the same model is fit. I specified nomain in the opening examples just so I would not have to explain that the option included x2 in the main equation.
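
If you want to convince yourself that the two forms are equivalent, a quick check along these lines should do it. The data are simulated and purely hypothetical (seed, sample size, and coefficients are assumptions), and the stored-estimate names auto and explicit are arbitrary labels chosen for this sketch.

```stata
* Hypothetical simulated data, used only to compare the two ways of typing the model
clear
set seed 2017
set obs 2000
generate x1 = rnormal()
generate x3 = rnormal()
generate x4 = rnormal()
generate u  = rnormal()
generate x2 = x3 + x4 + u + rnormal()
generate y  = 1 + x1 + x2 + u + rnormal()

* Form 1: x2 enters the main equation automatically
eregress y x1, endogenous(x2 = x3 x4)
estimates store auto

* Form 2: x2 is listed explicitly, and nomain stops it from being added again
eregress y x1 x2, endogenous(x2 = x3 x4, nomain)
estimates store explicit

* Side-by-side coefficients and standard errors, matched by name
estimates table auto explicit, b(%9.4f) se(%9.4f)
```

The two columns of the table should agree up to optimizer tolerance.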

See the examples at the Stata 15 ERMs page.

Command                          Purpose in a cluster randomized design (CRD)
power onemean, cluster           One-sample mean test
power oneproportion, cluster     One-sample proportion test
power twomeans, cluster          Two-sample means test
power twoproportions, cluster    Two-sample proportions test
power logrank, cluster           Log-rank test
