In last week’s posting, I explained how computers store binary floating-point numbers, how Stata’s %21x display format displays those binary floating-point numbers with fidelity, how %21x can help you uncover bugs, and how %21x can help you understand behaviors that are not bugs even though they are surprising to us base-10 thinkers. The point is, it is sometimes useful to think in binary, and with %21x, thinking in binary is not difficult.

This week, I want to discuss double versus float precision. Read more…
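To make the float-versus-double distinction concrete, here is a minimal sketch in Python rather than Stata. It assumes IEEE 754 arithmetic, where a Python float is a 64-bit double; the helper name `to_float32` is hypothetical and simply rounds a double to the nearest 32-bit float and back.

```python
import struct

def to_float32(x: float) -> float:
    """Round a 64-bit double to the nearest 32-bit float, returned as a double."""
    return struct.unpack("f", struct.pack("f", x))[0]

x = 0.1                  # stored as the double nearest to 0.1
x32 = to_float32(x)      # the float nearest to 0.1, widened back to a double

print(x32)               # 0.10000000149011612
print(x == x32)          # False: float keeps fewer bits than double
print(to_float32(0.5) == 0.5)  # True: 0.5 is exactly representable in both
```

The comparison fails for 0.1 because 0.1 has no exact binary representation, so the float and the double round it at different places. Values such as 0.5 that are exact in binary survive the round trip.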

%21x is a Stata display format, just as are %f, %g, %9.2f, %td, and so on. You could put %21x on any variable in your dataset, but that is not its purpose. Rather, %21x is for use with Stata’s **display** command for those wanting to better understand the accuracy of the calculations they make. We use %21x frequently in developing Stata. Read more…
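For readers without Stata at hand, Python’s `float.hex()` offers a rough analogue of a hexadecimal floating-point display. The sketch below is not Stata’s %21x format, and the notation differs, but it shows the same idea: printing the exact binary value reveals why two results that look equal in decimal are not.

```python
a = 0.1 + 0.2
b = 0.3

print(a == b)    # False
print(a.hex())   # 0x1.3333333333334p-2
print(b.hex())   # 0x1.3333333333333p-2

# The hexadecimal form loses nothing: reading it back recovers the same double.
print(float.fromhex(a.hex()) == a)   # True
```

The two values differ only in the last hexadecimal digit of the mantissa, which is invisible in most decimal display formats.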

Excuse me, but I’m going to toot Stata’s horn.

I got an email from Nicholas Cox (an Editor of the Stata Journal) yesterday. He said he was writing something for the Stata Journal and wanted the details on how we calculated a^b. He was focusing on examples such as (-8)^(1/3), where Stata produces a missing value rather than -2, and he wanted to know if our calculation of that was exp((1/3)*ln(-8)). He didn’t say where he was going, but I answered his question.

I have rather a lot to say about this.

Nick’s supposition was correct. In this particular case, and for most values of a and b, Stata calculates a^b as exp(b*ln(a)). In the case of a=-8 and b=1/3, ln(-8) is missing (.), and thus (-8)^(1/3) is missing as well. Read more…
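The identity is easy to sketch outside Stata. The Python below illustrates a^b = exp(b*ln(a)) only; it is not Stata’s implementation. The function names `power_via_logs` and `real_cube_root` are hypothetical, NaN stands in for Stata’s missing value, and treating every a <= 0 as missing is a simplification made for this sketch.

```python
import math

def power_via_logs(a: float, b: float) -> float:
    """Compute a**b as exp(b*ln(a)); return NaN when ln(a) is undefined."""
    if a <= 0:
        return math.nan          # ln(a) has no real value, so neither does the result
    return math.exp(b * math.log(a))

def real_cube_root(a: float) -> float:
    """Real cube root: take the root of |a| and restore the sign of a."""
    return math.copysign(abs(a) ** (1.0 / 3.0), a)

print(math.isnan(power_via_logs(-8, 1 / 3)))      # True
print(math.isclose(power_via_logs(8, 1 / 3), 2))  # True
print(math.isclose(real_cube_root(-8), -2))       # True
```

The second helper shows one way to get -2 when that is what you want: handle the sign separately rather than asking the logarithm to cope with a negative argument.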

Most software stores dates and times numerically, as durations from some sentinel date, but packages differ on the sentinel date and on the units in which the duration is stored. Stata stores dates as the number of days since 01jan1960, and datetimes as the number of milliseconds since 01jan1960 00:00:00.000. January 3, 2011 is stored as 18,630, and 2pm on January 3, 2011 is stored as 1,609,682,400,000. Other packages use different choices for bases and units.
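The two figures above can be checked with a few lines of Python. This is a sketch of the encoding as described here (days and milliseconds since 01jan1960), using naive datetimes with no time zones or leap seconds; the function names `stata_date` and `stata_datetime` are hypothetical.

```python
from datetime import date, datetime

STATA_EPOCH_DATE = date(1960, 1, 1)
STATA_EPOCH_DATETIME = datetime(1960, 1, 1)

def stata_date(d: date) -> int:
    """Number of days from 01jan1960 to d."""
    return (d - STATA_EPOCH_DATE).days

def stata_datetime(dt: datetime) -> int:
    """Number of milliseconds from 01jan1960 00:00:00.000 to dt."""
    delta = dt - STATA_EPOCH_DATETIME
    # Integer arithmetic avoids any floating-point rounding.
    return delta.days * 86_400_000 + delta.seconds * 1000 + delta.microseconds // 1000

print(stata_date(date(2011, 1, 3)))                  # 18630
print(stata_datetime(datetime(2011, 1, 3, 14, 0)))   # 1609682400000
```

Converting from another package’s encoding follows the same pattern: shift by the difference between the two sentinel dates, then rescale to days or milliseconds.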

It sometimes happens that you need to process in Stata data imported from other software and end up with a numerical variable recording a date or datetime in the other software’s encoding. It is usually possible to adjust the numeric date or datetime values to the sentinel date and units that Stata uses. Below are conversion rules for SAS, SPSS, R, Excel, and Open Office. Read more…

As everyone knows, I am a big proponent of Statalist, and not just for selfish reasons, although those reasons play a role. Nearly every member of the technical staff at StataCorp, me included, is a member of Statalist. Even when we don’t participate in a particular thread, we do pay attention. The discussions on Statalist play an important role in Stata’s development.

Statalist is a discussion group, not just a question-and-answer forum. Nonetheless, new members often use it to obtain answers to questions, and that works because those questions sometimes become grist for subsequent discussions. In those cases, the questioners not only get answers, they get much more. Read more…

I gave a 1.5-hour talk on Mata at the 2010 UK Stata Users Group Meeting in September. The slides are available in PDF form here. The talk was well received, which of course pleased me. If you’re interested in Mata, I predict you will find the slides useful even if you didn’t attend the meeting. Read more…

I just want to take a moment to plug Statalist. I’m a member and I hope to convince you to join Statalist, too, but even if I don’t succeed, you need to know about the web-based Statalist Archives because they’re a great resource for finding answers to questions about Stata, and you don’t have to join Statalist to access them.

Statalist’s Archives are found at http://www.stata.com/statalist/archive/, or you can click on “Statalist archives” on the right of this blog page, under Links.

Once at the Archives page, you can click on a year and month to get an idea of the flavor of Statalist. More importantly, you can search the archives. The search is Powered by Google and works well for highly specific, directed inquiries. For generic searches such as random numbers or survival analysis, however, I prefer to go to Advanced Search and ask that the results be sorted by date instead of relevance. It’s usually the most recent postings that are the most interesting, and by-date results are listed in just that order.

Anyway, the next time you are puzzling over something in Stata, I suggest that Read more…

When Stata first started back in 1985, communicating with users (well, back then they were potential users, because we didn’t have any users yet) was nearly impossible.

From the beginning, we were very modern. Back in 1985, there were competing packages, but no one (not even me) expected personal computers to replace the mainframe. Back then, about the best that could be said about the available statistical packages was that they worked (sometimes) for some problems. What made Stata different was our belief that personal computers could actually be better than the mainframe for some problems. That in itself was a radical idea! In the mainstream, mainframe computer world, there was a popular saying: Little computers for little minds.

And we’ve stayed modern since then. Stata was (in 1999) the first statistical package to have online updating and an automated, modern, Internet way to handle user-written code. Modern Statas not only have that, but can use datasets directly off the web. But we have fallen behind! It’s 2010, and StataCorp doesn’t have a corporate blog!

Well, we do now.

Well, that may not be the most exciting announcement we’ve ever made. But our blog will be authored by the same people who develop Stata, support Stata, and yes, sell Stata. It will be useful, and it might be more entertaining than you suspect. If it is, that will be because of the people writing it.