
The book that Stata programmers have been waiting for

“The book that Stata programmers have been waiting for” is how the Stata Press describes my new book on Mata, the full title of which is

The Mata Book: A Book for Serious Programmers and Those Who Want to Be

The Stata Press took its cue from me in claiming that this is the book you have been waiting for, although I was less presumptuous in the introduction:

This book is for you if you have tried to learn Mata by reading the Mata Reference Manual and failed. You are not alone. Though the manual describes the parts of Mata, it never gets around to telling you what Mata is, what is special about Mata, what you might do with Mata, or even how Mata’s parts fit together. This book does that.

I’m excited about the book, but for a while I despaired of ever completing it. I started and stopped four times. I stopped because the drafts were boring.

I puzzled over how this could be. Programming and software development are not boring to me. There’s anxiety. “How am I ever going to write that?” you think. Once you find a way, there can be tedium. “Do I have to write yet another variation on the same routine?” You don’t, but the way to completion often seems shortest if you do. Don’t give in. If you do, you’ll produce code that is difficult to maintain. Eventually, there’s giddiness when the code works, but that’s often followed by depression when you discover that it doesn’t really work, and even if it does, it’s too slow. And when you finally finish and the code produces right answers quickly enough, if you ever get there, there’s satisfaction. There are all sorts of emotions along the way, and I’ve experienced them all. I have been a developer long enough that I usually complete the projects I start.

My drafts were boring, I decided, because I was writing about Mata when I should have been writing about using Mata. To write about using Mata, you have to tell the story, and that means writing about algorithm design, programming, workflow, numerical accuracy, validation, and certification. So I did that.

As for the use of the word “serious” in the subtitle, one explanation for it is that you must be serious to read 428 pages, although that’s not the explanation I had in mind. “A serious programmer,” I write in the book,

is someone who has a serious interest in sharpening their programming skills and broadening their knowledge of programming tools. There is an easy test to determine whether you are serious. If I tell you that I know of a new technique for programming interrelated equations and your response is, “Tell me about it,” then you are serious. Being serious is a matter of attitude, not current skill level or knowledge.

The book may be for serious programmers, but I tried to accommodate a range of skills. At one end of the spectrum, I assumed a reader having experience with at least one programming language, which could be Stata’s ado, Python, Java, C++, Fortran, or any other language you care to mention. I assumed a reader who can write programs containing conditional statements and loops. At the other end of the spectrum, I assumed a reader who cannot imagine writing code without structures and classes and who is facile with pointers to boot.

Writing for a broad audience is iffy. Early chapters have to cover the basics, and basics are dull regardless of skill level. If you are already advanced, they’re deadly. I made them interesting by choice of examples. In the section on looping statements, the example is an implementation of the Newton–Raphson method to calculate the square root of 2, implemented in one line:

: x = 1

: while (abs(x^2-2) > 1e-8) x = x - (x^2-2)/(2*x)

: x
  1.414213562

The one line is the one in the middle that iterates its way to the solution of the equation \(x^2 = 2\). The solution to the generic problem of finding \(x\) such that \(g(x)=c\) is to define \(f(x) = g(x)-c\) and then code

: while (abs(f(x)) > 1e-8) x = x - f(x)/f'(x)

In the square-root-of-2 problem, \(f(x) = x^2-2\) and its derivative is \(f'(x) = 2x\).
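
As a minimal sketch, not taken from the book, the generic loop can be wrapped in a Mata function that receives \(f\) and \(f'\) as pointers to functions. The names f_sq2(), df_sq2(), and nr_solve() are hypothetical, and the iteration cap maxiter is an added safeguard that the one-line version does not have.

```mata
mata:
// f(x) = x^2 - 2; its positive root is the square root of 2
real scalar f_sq2(real scalar x)
{
    return(x^2 - 2)
}

// analytic derivative f'(x) = 2x
real scalar df_sq2(real scalar x)
{
    return(2*x)
}

// Hypothetical helper: Newton-Raphson starting from x0.
// Stops when abs(f(x)) <= tol or after maxiter iterations.
real scalar nr_solve(pointer(real scalar function) scalar fn, pointer(real scalar function) scalar dfn, real scalar x0, real scalar tol, real scalar maxiter)
{
    real scalar x, i

    x = x0
    for (i = 1; i <= maxiter && abs((*fn)(x)) > tol; i++) {
        x = x - (*fn)(x) / (*dfn)(x)
    }
    return(x)
}

// same starting value and tolerance as the one-line loop
nr_solve(&f_sq2(), &df_sq2(), 1, 1e-8, 50)
end
```

Passing the functions as arguments means the same solver can be reused for a different \(f\) without editing the loop.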

I also interspersed discussions of serious issues, such as the minimum round-off error you can theoretically achieve if you use Newton–Raphson with numerically calculated derivatives such as \(f'(x) = (f(x+h)-f(x))/h\). And I discuss how to specify \(h\) to achieve that theoretical limit.
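
To make the trade-off concrete: a smaller \(h\) amplifies round-off in the difference \(f(x+h)-f(x)\), while a larger \(h\) increases the truncation error of the approximation. One common rule of thumb, which may or may not be the rule the book arrives at, sets \(h\) near the square root of machine precision, scaled by the magnitude of \(x\). The sketch below assumes that rule. The names fval() and fd_deriv() are hypothetical.

```mata
mata:
// f(x) = x^2 - 2; only function values are used, no analytic derivative
real scalar fval(real scalar x)
{
    return(x^2 - 2)
}

// Forward-difference approximation to the derivative of *fn at x.
// Assumed rule of thumb: h = sqrt(machine precision) * max(|x|, 1),
// where epsilon(1) is the machine precision of a double.
real scalar fd_deriv(pointer(real scalar function) scalar fn, real scalar x)
{
    real scalar h

    h = sqrt(epsilon(1)) * max((abs(x), 1))
    return(((*fn)(x + h) - (*fn)(x)) / h)
}

// Newton-Raphson as before, with the numerical derivative
x = 1
while (abs(fval(x)) > 1e-8) x = x - fval(x)/fd_deriv(&fval(), x)
x
end
```

With a step of this size, the forward difference is typically accurate to about half the digits of a double, which is enough for the iteration to converge here.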

The first 30% of the book is about Mata, with programming interspersed, and that programming is mostly hand waving about imagined code. The rest is about fully implemented programs, and this time it’s the details of Mata that are interspersed.

I do two other things not usually done in books like this. I write in the first person—I talk to you just as I would a new developer at StataCorp—and the projects we develop in the second part of the book do not always go well. Just as with real development, the code we write is sometimes inaccurate or its performance lousy. Discussing projects that do not go well is partly a trick to motivate subjects I wanted to talk about anyway—code encapsulation, how to time code to find performance bottlenecks, and how to develop new algorithms. The need for these solutions arises at the most inconvenient times in real life, however, and the structure of the book reflects that.

The book to me is about development, which we happen to be doing in Mata. It’s an ambitious book. I hope that it succeeds in all that it sets out to do. I can promise that it will turn you into an expert on Mata.

You can learn more about the book here.