Archive for 2010

Including covariates in crossed-effects models

The manual entry for xtmixed documents all the official features in the command, and several applications. However, it would be impossible to address all the models that can be fitted with this command in a manual entry. I want to show you how to include covariates in a crossed-effects model.

Let me start by reviewing the crossed-effects notation for xtmixed. I will use the homework dataset from Kreft and de Leeuw (1998) (a subsample from the National Education Longitudinal Study of 1988). You can download the dataset from the webpage for Rabe-Hesketh & Skrondal (2008) and run all the examples in this entry. Read more…

Categories: Statistics

How to successfully ask a question on Statalist

As everyone knows, I am a big proponent of Statalist, and not just for selfish reasons, although those reasons play a role. Nearly every member of the technical staff at StataCorp, me included, is a member of Statalist. Even when we don’t participate in a particular thread, we do pay attention. The discussions on Statalist play an important role in Stata’s development.

Statalist is a discussion group, not just a question-and-answer forum. Nonetheless, new members often use it to obtain answers to questions, and that works because those questions sometimes become grist for subsequent discussions. In those cases, the questioners not only get answers, they get much more. Read more…

Categories: Resources

New Wooldridge edition just made available

Insiders have been waiting for the second edition of Econometric Analysis of Cross Section and Panel Data by Jeffrey M. Wooldridge. I have a copy and really recommend it; later I will write a review as to why.

The book is available at the Stata bookstore and the MIT Press bookstore. It is $84 at our bookstore and $94 at MIT. The book is not yet available from Amazon.

Automating web downloads and file unzipping

Andrew J. Dyck wrote a nice post on his blog on how to Download and unzip data files from Stata. He writes:

Recently, I’ve been using Stata’s -shp2dta- command to convert some shapefiles to stata format, grabbing Lat/Lon data and merging into another dataset. There were several compressed shapefiles I wanted to download contained in a directory from the web. I could manually download each file and uncompress each one but that would be time consuming. Also, when the maps are updated, I’d have to do the download/uncompress all over again. I’ve found that the process can be automated from within Stata by using a combination of -shell- and some handy terminal commands. …

You should read the rest of his post. He goes on to show how you can script with Stata to automate shelling out to download and unzip a series of files from a website, and he introduces you to some cool Unix-like utilities for Windows.

We here at StataCorp use Stata for tasks like this all the time. In fact, we have built some tools into Stata to allow you to do much of what Andrew described without ever having to leave or shell out of Stata. Read more…
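For readers who want to see the shape of the workflow Andrew describes, the download-then-unzip loop can be sketched outside Stata as well. The following is a minimal illustration in Python using only the standard library; it is not Andrew's script and not the built-in Stata tooling mentioned above. The function name `download_and_unzip` and the example URL in the comment are hypothetical, chosen only for illustration.

```python
import zipfile
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve


def download_and_unzip(urls, dest_dir):
    """Download each zip archive in `urls` and extract it into `dest_dir`.

    Returns the paths of the extracted members, in archive order.
    (Hypothetical helper, written for illustration only.)
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    for url in urls:
        # Name the local copy after the last component of the URL path.
        archive = dest / Path(urlparse(url).path).name
        urlretrieve(url, str(archive))
        # Unpack the archive next to the downloaded copy.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
            extracted.extend(dest / name for name in zf.namelist())
    return extracted


# Example call with a made-up URL:
# download_and_unzip(["http://www.example.com/maps/region1.zip"], "maps")
```

Because the list of URLs is the only input, rerunning the script when the source files are updated repeats the whole download and extraction without any manual steps. Only extract archives from sources you trust.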

Categories: Programming

Competing risks in the Stata News

The fourth quarter Stata News came out today. Among other things, it contains an article by Bobby Gutierrez, StataCorp’s Director of Statistics, about competing risks survival analysis. If any of you are like me, conversant in survival analysis but not an expert, I think you will enjoy Bobby’s article. In a mere page and a half, I learned the primary differences between competing risks analysis and the Cox proportional hazards model and why I will sometimes prefer competing risks. Bobby’s article can be read at

2010 Italian Stata meeting recap

David Drukker and I just got back from the Italian Stata Users Group meeting in Bologna, arranged by TStat, the Stata distributor for Italy. It was wonderful, in part because of the beauty of Bologna, and the tasty food. The scientific committee and TStat did great jobs of selecting papers and organizing a smooth, interesting meeting.

The first day of the meeting had talks by users and StataCorp. There was good variety, with topics like investigating disease clustering, classification of prehistoric artifacts, small-area analysis, and the careful interpretation of marginal effects. This year, all the talks were in English — and it was once again amazing to see how well people can present in a second (or third) language. If you would like to see the slides which accompanied the talks, you can find them at

Recently, I have been thinking about how to interpret results from nonlinear models, so I found Maarten Buis’s talk on “Extracting effects from non-linear models” and David’s talk on “Estimating partial effects using margins in Stata 11” really useful. Both Maarten and David have thought carefully about this problem, and each of them presented great introductions and easy-to-apply solutions. What is interesting is that they favor different solutions. Maarten leaned more toward estimating and interpreting ratios that did not vary with the covariates. David recommended using the potential-outcome framework, which can be implemented using the margins command. The similarities and differences in these two talks made them even more informative.

As is typical for the Italian meetings, the second day had two training sessions, one given by David on programming your own estimation command in Stata (starting from the basics of Stata programming), and one given by Laura Antolini from the Università di Milano Bicocca on competing risks in survival analysis. Both courses were booked full.

I was a Stata user for 15 years before I started working at Stata, and the most fun parts of the meeting are the same now as when I was a user: the wishes and grumbles followed by the conference dinner. The wishes and grumbles session is always interesting; it shows the wide variety of approaches to using Stata. The conference dinner is always fun, because of the conversation over excellent food. In Italy, of course, the food is beyond excellent; strolling through Bologna on marble sidewalks under colonnades while talking statistics, programming and Stata made the evening, if in an intellectual fashion.

Mata, the missing manual, available at SSC

I gave a 1.5 hour talk on Mata at the 2010 UK Stata Users Group Meeting in September. The slides are available in pdf form here. The talk was well received, which of course pleased me. If you’re interested in Mata, I predict you will find the slides useful even if you didn’t attend the meeting. Read more…

Categories: Mata

Stata/MP — having fun with millions

I was reviewing some timings from the Stata/MP Performance Report this morning. (For those who don’t know, Stata/MP is the version of Stata that has been programmed to take advantage of multiprocessor and multicore computers. It is functionally equivalent to the largest version of Stata, Stata/SE, and it is faster on multicore computers.)

What was unusual this morning is that I was running Stata/MP interactively. We usually run MP for large batch jobs that run thousands of timings on large datasets — either to tune performance or to produce reports like the Performance Report. That is the type of work Stata/MP was designed for — big jobs on big datasets. Read more…

Connection string support added to odbc command

Stata’s odbc command allows you to import data from and export data to any ODBC data source on your computer. ODBC is a standardized way for applications to read data from and write data to different data sources such as databases and spreadsheets.

Until now, before you could use the odbc command, you had to add a named data source (DSN) to the computer via the ODBC Data Source Administrator. If you did not have administrator privileges on your computer, you could not do this. Read more…


I just want to take a moment to plug Statalist. I’m a member and I hope to convince you to join Statalist, too, but even if I don’t succeed, you need to know about the web-based Statalist Archives because they’re a great resource for finding answers to questions about Stata, and you don’t have to join Statalist to access them.

Statalist’s Archives are found at, or you can click on “Statalist archives” on the right of this blog page, under Links.

Once at the Archives page, you can click on a year and month to get an idea of the flavor of Statalist. More importantly, you can search the archives. The search is Powered by Google and works well for highly specific, directed inquiries. For generic searches such as random numbers or survival analysis, however, I prefer to go to Advanced Search and ask that the results be sorted by date instead of relevance. It’s usually the most recent postings that are the most interesting, and by-date results are listed in just that order.

Anyway, the next time you are puzzling over something in Stata, I suggest that Read more…

Categories: Resources