Advanced Mata: Pointers

I’m still recycling my talk called “Mata, The Missing Manual” at user meetings, a talk designed to make Mata more approachable. One of the things I say late in the talk is, “Unless you already know what pointers are and know you need them, ignore them. You don’t need them.” And here I am writing about, of all things, pointers. Well, I exaggerated a little in my talk, but just a little.

Before you take my previous advice and stop reading, let me explain: Mata serves a number of purposes and one of them is as the primary langugage we at StataCorp use to implement new features in Stata. I’m not referring to mock ups, toys, and experiments, I’m talking about ready-to-ship code. Stata 12’s Structural Equation Modeling features are written in Mata, so is Multiple Imputation, so is Stata’s optimizer that is used by nearly all estimation commands, and so are most features. Mata has a side to it that is exceedingly serious and intended for use by serious developers, and every one of those features are available to users just as they are to StataCorp developers. This is one of the reasons there are so many user-written commands are available for Stata. Even if you don’t use the serious features, you benefit. Read more…

Categories: Mata Tags: , ,

Use poisson rather than regress; tell a friend

Do you ever fit regressions of the form

ln(yj) = b0 + b1x1j + b2x2j + … + bkxkj + εj

by typing

. generate lny = ln(y)

. regress lny x1 x2 … xk

The above is just an ordinary linear regression except that ln(y) appears on the left-hand side in place of y. Read more…

2011 Stata Conference recap

The 2011 Stata Conference in Chicago ended last Friday, and a good time was had by all.

The two days had the usual wide array of talks, given by researchers in Econometrics, Sociology, Medicine, and Statistics, together with three of us from StataCorp—Bill Gould, David Drukker, and me.

The conference was held in the Gleacher center on the banks of the Chicago River in Chicago (of course), which is a fine facility. I know it sounds mundane, but the acoustics in the lecture hall were excellent, making it very easy for speakers and questions to be heard clearly. Read more…

Categories: Meetings Tags: ,

Stata 12 Announced

We are pleased to announce a new version of Stata: Stata 12. You can order it today, it starts shipping on July 25, and you can find out about it at www.stata.com/stata12/.

Here are the highlights of what’s new: Read more…

Precision (yet again), Part II

In part I, I wrote about precision issues in English. If you enjoyed that, you may want to stop reading now, because I’m about to go into the technical details. Actually, these details are pretty interesting.

For instance, I offered the following formula for calculating error due to float precision: Read more…

Categories: Numerical Analysis Tags: ,

Precision (yet again), Part I

I wrote about precision here and here, but they were pretty technical.

“Great,” coworkers inside StataCorp said to me, “but couldn’t you explain these issues in a way that doesn’t get lost in the details of how computers store binary and maybe, just maybe, write about floats and doubles from a user’s perspective instead of programmer’s perspective?”

“Mmmm,” I said clearly.

Later, when I tried, I liked the result. It contains new material, too. What follows is what I now wish I had written first. I’d would have still written the other two postings, but as technical appendices. Read more…

Categories: Numerical Analysis Tags: ,

Stata at JSM 2011 in Miami Beach, FL

StataCorp invites you to stop by our booth, 404, at JSM 2011, July 31 – August 3, in Miami Beach, FL. StataCorp staff and developers will be on hand to answer any questions you have about Stata, from statistics to programming to licensing. You can also register to win a copy of quad-core Stata/MP.

StataCorp is also presenting three continuing education technology workshops at JSM 2011: Read more…

Categories: Meetings Tags: , ,

Merging data, part 2: Multiple-key merges

Multiple-key merges arise when more than one variable is required to uniquely identify the observations in your data. In Merging data, part 1, I discussed single-key merges such as

        . merge 1:1 personid using ...

In that discussion, each observation in the dataset could be uniquely identified on the basis of a single variable. In panel or longitudinal datasets, there are multiple observations on each person or thing and to uniquely identify the observations, we need at least two key variables, such as Read more…

Categories: Data Management Tags: ,

Stata Conferences and Meetings Update

Between now and the end of the year, the annual Stata Conference in the United States will take place along with five other Stata meetings in countries around the world.

Stata conferences and meetings feature talks by both Stata users and Stata developers and provide an opportunity to help shape the future of Stata development by interacting with and providing feedback directly to StataCorp personnel.

The talks range from longer presentations by invited speakers to shorter talks demonstrating the use of Stata in a variety of fields. Some talks are statistical in nature while others focus on data management, graphics, or programming in Stata. New enhancements to Stata created both by users and by StataCorp are often featured in talks.

The full schedule of upcoming meetings is Read more…

Categories: Meetings Tags: ,

Merging data, part 1: Merges gone bad

Merging concerns combining datasets on the same observations to produce a result with more variables. We will call the datasets one.dta and two.dta.

When it comes to combining datasets, the alternative to merging is appending, which is combining datasets on the same variables to produce a result with more observations. Appending datasets is not the subject for today. But just to fix ideas, appending looks like this: Read more…

Categories: Data Management Tags: ,