Archive for 2017

Nonparametric regression: Like parametric regression, but not

Initial thoughts

Nonparametric regression is similar to linear regression, Poisson regression, and logit or probit regression; it predicts a mean of an outcome for a set of covariates. If you work with the parametric models mentioned above or other models that predict means, you already understand nonparametric regression and can work with it.

The main difference between parametric and nonparametric models lies in what they assume about the functional form of the mean conditional on the covariates. Parametric models assume the mean is a known function of \(\mathbf{x}\beta\). Nonparametric regression makes no assumptions about the functional form.
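To make the contrast concrete, each parametric model named above can be written as a known function of the linear index, whereas the nonparametric model leaves that function unspecified. The symbols \(y\), \(g\), and \(\Phi\) (the standard normal distribution function) are notation introduced here for illustration:

\[
\begin{aligned}
\text{linear:} \quad & E(y \mid \mathbf{x}) = \mathbf{x}\beta \\
\text{Poisson:} \quad & E(y \mid \mathbf{x}) = \exp(\mathbf{x}\beta) \\
\text{logit:} \quad & E(y \mid \mathbf{x}) = \frac{\exp(\mathbf{x}\beta)}{1 + \exp(\mathbf{x}\beta)} \\
\text{probit:} \quad & E(y \mid \mathbf{x}) = \Phi(\mathbf{x}\beta) \\
\text{nonparametric:} \quad & E(y \mid \mathbf{x}) = g(\mathbf{x}), \quad g \text{ unknown}
\end{aligned}
\]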

In practice, this means that nonparametric regression yields consistent estimates of the mean function that are robust to functional form misspecification. But we do not need to stop there. With npregress, introduced in Stata 15, we may obtain estimates of how the mean changes when we change discrete or continuous covariates, and we can use margins to answer other questions about the mean function.
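A minimal sketch of that workflow, assuming Stata 15 or later. The auto dataset and the variables mpg, weight, and foreign are stand-ins chosen for illustration; they are not the example used in the post:

```stata
* Illustrative only: dataset and variables are assumptions, not the post's example
sysuse auto, clear

* Kernel regression of mpg on a continuous and a discrete covariate;
* no functional form is specified for the mean
npregress kernel mpg weight i.foreign

* Estimated mean of mpg at chosen values of weight
margins, at(weight=(2000 3000 4000))

* Estimated mean of mpg for each level of foreign
margins foreign
```

Standard errors for these estimates rely on the bootstrap (see the vce(bootstrap) option of npregress kernel), which can take a while to run.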

Below I illustrate how to use npregress and how to interpret its results. As you will see, the results are interpreted in the same way you would interpret the results of a parametric model using margins. Read more…

Stata 15 announced, available now

We announced Stata 15 today. It’s a big deal because this is Stata’s biggest release ever.

I posted to Statalist this morning and listed sixteen of the most important new features. Here on the blog I will say more about them, and you can learn even more by visiting our website and seeing the Stata 15 features page.

I go into depth below on the sixteen highlighted features.

Read more…

Creating Excel tables with putexcel, part 3: Writing custom reports for arbitrary variables

In my last post, I demonstrated how to use putexcel to recreate common Stata output in Microsoft Excel. Today I want to show you how to create custom reports for arbitrary variables. I am going to create tables that combine cell counts with row percentages, and means with standard deviations, but you could modify the examples below to include column percentages, percentiles, standard errors, confidence intervals, or any other statistic. I am also going to pass the variable names into my programs using local macros, which will allow me to create the same report for arbitrary variables simply by assigning new variable names to the macros. You could extend this idea by creating a do-file for each report and passing the variable names into the do-files. This is another important step toward our goal of automating the creation of reports in Excel.
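The local-macro idea might be sketched as follows. This is only a rough illustration of the mean-and-standard-deviation part: the auto dataset, the file name report.xlsx, and the macro names ContVar and MeanSD are hypothetical and are not taken from the post.

```stata
* Illustrative only: dataset, file name, and macro names are assumptions
sysuse auto, clear

* The variable to report on is held in a local macro
local ContVar mpg

* Open the workbook that will receive the report
putexcel set report.xlsx, replace

* Compute the statistics and combine them into one cell entry
summarize `ContVar'
local MeanSD = string(r(mean), "%9.1f") + " (" + string(r(sd), "%9.1f") + ")"

* Write the header row and the results row
putexcel A1 = "Variable"
putexcel B1 = "Mean (SD)"
putexcel A2 = "`ContVar'"
putexcel B2 = "`MeanSD'"
```

Changing the single line that defines ContVar reruns the same report for a different variable.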

Today’s blog post is Read more…

Categories: Programming

Estimation under omitted confounders, endogeneity, omitted variable bias, and related problems

Initial thoughts

Estimating causal relationships from data is one of the fundamental endeavors of researchers, but causality is elusive. In the presence of omitted confounders, endogeneity, omitted variables, or a misspecified model, estimates of predicted values and effects of interest are inconsistent; causality is obscured.

A controlled experiment to estimate causal relations is an alternative. Yet conducting a controlled experiment may be infeasible. Policy makers cannot randomize taxation, for example. In the absence of experimental data, an option is to use instrumental variables or a control function approach.
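For a linear model, the two options could look like the sketch below. It uses hsng2, one of Stata's example datasets, as a stand-in; the choice of endogenous variable and instruments is an assumption made for illustration and is not the example from the post:

```stata
* Illustrative only: hsng2 is a Stata example dataset, not the post's data
webuse hsng2, clear

* Instrumental variables (2SLS): hsngval is treated as endogenous,
* with faminc and the region indicators as excluded instruments
ivregress 2sls rent pcturban (hsngval = faminc i.region)

* Control function: estimate the first stage, then add its
* residual as an extra regressor in the outcome equation
regress hsngval pcturban faminc i.region
predict double v_hat, residuals
regress rent hsngval pcturban v_hat
```

In the linear case both approaches give the same point estimate for hsngval, but the standard errors from the final regress do not account for the residual being estimated.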

Stata has many built-in estimators to implement these potential solutions and tools to construct estimators for situations that are not covered by built-in estimators. Below I illustrate both possibilities for a linear model and, in a later post, will talk about nonlinear models. Read more…

Creating Excel tables with putexcel, part 2: Macro, picture, matrix, and formula expressions

In my last post, I showed how to use putexcel to write simple expressions to Microsoft Excel and format the resulting text and cells. Today, I want to show you how to write more complex expressions such as macros, graphs, and matrices. I will even show you how to write formulas to Excel to create calculated cells. These are important steps toward our goal of automating the creation of reports in Excel.

Before we begin the examples, Read more…

Categories: Programming

Creating Excel tables with putexcel, part 1: Introduction and formatting

For a long time, I have wanted to type a Stata command like this,

. ExcelTable race, cont(age height weight) cat(sex diabetes)
The Excel table table.xlsx was created successfully

and get an Excel table that looks like this:


So I wrote a program called ExcelTable for my own use Read more…

Categories: Programming