Archive for February 2011

We’re turning on comments next week

25 February 2011

Our target date for turning on comments is Wednesday of next week. Don’t hold us to it, but we’re optimistic about meeting our deadline.

In the meantime, Bill Gould is so excited that he has put his latest blog entry on hold. It’s about linear algebra — matrices in particular — so it won’t exactly go out of date, and he’s looking forward to seeing the comments. Bill’s very proud of what he’s written, so be gentle. We’ll put up his posting right after comments are enabled.

Categories: Company

Positive log-likelihood values happen

From time to time, we get a question from a user who is puzzled about getting a positive log likelihood for a certain estimation. We are so used to seeing negative log-likelihood values that we may wonder what could cause them to be positive.

First, let me point out that there is nothing wrong with a positive log likelihood.

The likelihood is the product of the density evaluated at the observations. Usually, the density takes values that are smaller than one, so its logarithm will be negative. However, this is not true for every distribution.

For example, let’s think of the density of a normal distribution with a small standard deviation, let’s say 0.1.

. di normalden(0,0,.1)
3.9894228

This density concentrates its mass in a small area around zero, and therefore takes large values around this point. Naturally, the logarithm of this value is positive.

. di log(3.9894228)
1.3836466

In model estimation, the situation is a bit more complex. When you fit a model to a dataset, the log likelihood will be evaluated at every observation. Some of these evaluations may turn out to be positive, and some may turn out to be negative. The sum of all of them is reported. Let me show you an example.

I will start by simulating a dataset appropriate for a linear model.

clear
program drop _all
set seed 1357
set obs 100
gen x1 = rnormal()
gen x2 = rnormal()
gen y = 2*x1 + 3*x2 +1 + .06*rnormal()

I will borrow the code for mynormal_lf from the book Maximum Likelihood Estimation with Stata (W. Gould, J. Pitblado, and B. Poi, 2010, Stata Press) in order to fit my model via maximum likelihood.

program mynormal_lf
        version 11.1
        args lnf mu lnsigma
        quietly replace `lnf' = ln(normalden($ML_y1,`mu',exp(`lnsigma')))
end

ml model lf  mynormal_lf  (y = x1 x2) (lnsigma:)
ml max, nolog

The following table will be displayed:

.   ml max, nolog

                                                  Number of obs   =        100
                                                  Wald chi2(2)    =  456919.97
Log likelihood =  152.37127                       Prob > chi2     =     0.0000

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
eq1          |   
          x1 |   1.995834    .005117   390.04   0.000     1.985805    2.005863
          x2 |   3.014579   .0059332   508.08   0.000      3.00295    3.026208
       _cons |   .9990202   .0052961   188.63   0.000       .98864      1.0094
-------------+----------------------------------------------------------------
lnsigma      |  
       _cons |  -2.942651   .0707107   -41.62   0.000    -3.081242   -2.804061
------------------------------------------------------------------------------

We can see that the estimates are close enough to our original parameters, and also that the log likelihood is positive.

We can obtain the log likelihood for each observation by substituting the estimates in the log-likelihood formula:

. predict double xb

. gen double lnf = ln(normalden(y, xb, exp([lnsigma]_b[_cons])))

. summ lnf, detail

                             lnf
-------------------------------------------------------------
      Percentiles      Smallest
 1%    -1.360689      -1.574499
 5%    -.0729971       -1.14688
10%     .4198644      -.3653152       Obs                 100
25%     1.327405      -.2917259       Sum of Wgt.         100

50%     1.868804                      Mean           1.523713
                        Largest       Std. Dev.      .7287953
75%     1.995713       2.023528
90%     2.016385       2.023544       Variance       .5311426
95%     2.021751       2.023676       Skewness      -2.035996
99%     2.023691       2.023706       Kurtosis       7.114586

. di r(sum)
152.37127

. gen f = exp(lnf)

. summ f, detail

                              f
-------------------------------------------------------------
      Percentiles      Smallest
 1%     .2623688       .2071112
 5%     .9296673       .3176263
10%      1.52623       .6939778       Obs                 100
25%     3.771652       .7469733       Sum of Wgt.         100

50%     6.480548                      Mean           5.448205
                        Largest       Std. Dev.      2.266741
75%     7.357449       7.564968
90%      7.51112        7.56509       Variance       5.138117
95%     7.551539       7.566087       Skewness      -.8968159
99%     7.566199        7.56631       Kurtosis       2.431257

We can see that some values for the log likelihood are negative, but most are positive, and that the sum is the value we already know. In the same way, most of the values of the likelihood are greater than one.

As an exercise, try the commands above with a bigger standard deviation, say, 1. Now the density will be flatter, and there will be no values greater than one.
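
Here is a minimal sketch of that exercise; it reuses the mynormal_lf program defined above, and only the data-generating line changes. With a standard deviation near 1, the density stays well below one everywhere (its peak is normalden(0,0,1) = .39894228 for the standard normal), so the individual log likelihoods, and therefore their sum, come out negative.

clear
set seed 1357
set obs 100
gen x1 = rnormal()
gen x2 = rnormal()
gen y = 2*x1 + 3*x2 + 1 + rnormal()    // standard deviation 1 instead of .06
ml model lf mynormal_lf (y = x1 x2) (lnsigma:)
ml max, nolog                          // the reported log likelihood is now negative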

In short, there is nothing wrong with a positive log likelihood; if you get one and check your dispersion parameters, you will find that they are small.

Categories: Statistics

How to read the %21x format, part 2

In my previous posting last week, I explained how computers store binary floating-point numbers, how Stata’s %21x display format displays with fidelity those binary floating-point numbers, how %21x can help you uncover bugs, and how %21x can help you understand behaviors that are not bugs even though they are surprising to us base-10 thinkers. The point is, it is sometimes useful to think in binary, and with %21x, thinking in binary is not difficult.

This week, I want to discuss double versus float precision.

Double (8-byte) precision provides 53 binary digits. Float (4-byte) precision provides 24. Let me show you what float precision looks like.

. display %21x sqrt(2) _newline %21x float(sqrt(2))
+1.6a09e667f3bcdX+000
+1.6a09e60000000X+000

All those zeros in the floating-point result are not really there;
%21x merely padded them on. The display would be more honest if it were

+1.6a09e6       X+000

Of course, +1.6a09e60000000X+000 is a perfectly valid way of writing +1.6a09e6X+000 — just as 1.000 is a valid way of writing 1 — but you must remember that float has fewer digits than double.

Hexadecimal 1.6a09e6 is a rounded version of 1.6a09e667f3bcd, and you can think of this in one of two ways:

     double     =  float   + extra precision
1.6a09e667f3bcd = 1.6a09e6 + 0.00000067f3bcd

or

  float   =      double     -  lost precision
1.6a09e6  = 1.6a09e667f3bcd - 0.00000067f3bcd

Note that more digits are lost than appear in the float result! The float result provides six hexadecimal digits (ignoring the 1), and seven digits appear under the heading lost precision. Double precision is more than twice float precision: double provides 53 binary digits and float provides 24, so double precision is really 53/24 ≈ 2.21 times float precision.

The double of double precision refers to the total number of binary digits used to store the mantissa and the exponent in z=a*2^b, which is 64 versus 32. Precision is 53 versus 24.
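
If you want to see the two precisions side by side, here is a quick check you can type. The gap between 1 and the next representable number is 2^-52 in double precision and 2^-23 in float, and in the %21x exponents below, hex 34 is decimal 52 and hex 17 is decimal 23:

. display %21x 2^-52
+1.0000000000000X-034

. display %21x 2^-23
+1.0000000000000X-017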

In this case, we obtained the floating-point from float(sqrt(2)), meaning that we rounded a more accurate double-precision result. One usually rounds when producing a less precise representation. One of the rounding rules is to round up if the digits being omitted (with a decimal point in front) exceed 1/2, meaning 0.5 in decimal. The equivalent rule in base-16 is to round up if the digits being omitted (with a hexadecimal point in front) exceed 1/2, meaning 0.8 (base-16). The lost digits were .67f3bcd, which are less than 0.8, and therefore, the last digit of the rounded result was not adjusted.

Actually, rounding to float precision is more difficult than I make out, and seeing that numbers are rounded correctly when displayed in %21x can be difficult. These difficulties have to do with the relationship between base-2 — the base in which the computer works — and base-16 — a base similar but not identical to base-2 that we humans find more readable. The fact is that %21x was designed for double precision, so it only does an adequate job of showing single precision. When %21x displays a float-precision number, it shows you the exactly equal double-precision number, and that turns out to matter.

We use base-16 because it is easier to read. But why do we use base-16 and not base-15 or base-17? We use base-16 because it is an integer power of 2, the base the computer uses. One advantage of bases being powers of each other is that base conversion can be done more easily. In fact, conversion can be done almost digit by digit. Doing base conversion is usually a tedious process. Try converting 2394 (base-10) to base-11. Well, you say, 11^3=1331, and 2*1331 = 2662 > 2394, so the first digit is 1 and the remainder is 2394-1331 = 1063. Now, repeating the process with 1063, I observe that 11^2 = 121 and that 1063 lies between 8*121=968 and 9*121=1089, so the second digit is 8 and I have a remainder of …. And eventually you produce the answer 1887 (base-11).
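
If you would rather let the computer do the tedium, here is a minimal Mata sketch of that repeated-division algorithm. The function name tobase() is just something I made up for this example (if I recall correctly, Mata’s built-in inbase() function does the same job); it works for any base up to 16.

mata:
// convert a nonnegative integer n to a string of base-b digits (2 <= b <= 16)
// by repeated division, collecting remainders from least to most significant
string scalar tobase(real scalar n, real scalar b)
{
        string scalar digits, s
        real scalar   m

        digits = "0123456789abcdef"
        s = ""
        m = trunc(n)
        while (m > 0) {
                s = substr(digits, mod(m, b) + 1, 1) + s
                m = floor(m/b)
        }
        return(s == "" ? "0" : s)
}
tobase(2394, 11)      // displays 1887
tobase(2394, 16)      // displays 95a
end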

Converting between bases when one is the power of another not only is easier but also is so easy you can do it in your head. To convert from base-2 to base-16, group the binary digits into groups of four (because 2^4=16) and then translate each group individually.

For instance, to convert 011110100010, proceed as follows:

0111 1010 0010
--------------
   7    a    2

I’ve performed this process often enough that I hardly have to think. But here is how you should think: Divide the binary number into four-digit groups. The four columns of the binary number stand for 8, 4, 2, and 1. When you look at 0111, say to yourself 4+2+1 = 7. When you look at 1010, say to yourself 8+2 = 10, and remember that the digit for 10 in base-16 is a.

Converting back is nearly as easy:

   7    a    2
--------------
0111 1010 0010

Look at 7 and remember the binary columns 8-4-2-1. Though 7 does not contain an 8, it does contain a 4 (leaving 3), and 3 contains a 2 and a 1.

I admit that converting base-16 to base-2 is more tedious than converting base-2 to base-16, but eventually, you’ll have the four-digit binary table memorized; there are only 16 lines. Say 7 to me, and 0111 just pops into my head. Well, I’ve been doing this a long time, and anyway, I’m a geek. I suspect I carry the as-yet-undiscovered binary gene, which means I came into this world with the base-2-to-base-16 conversion table hardwired:

base-2 base-16
0000 0
0001 1
0010 2
0011 3
0100 4
0101 5
0110 6
0111 7
1000 8
1001 9
1010 a
1011 b
1100 c
1101 d
1110 e
1111 f

Now that you can convert base-2 to base-16 — convert from binary to hexadecimal — and you can convert back again, let’s return to floating-point numbers.

Remember how floating-point numbers are stored:

z = a * 2^b, 1<=a<2 or a==0

For example,

    0.0 = 0.0000000000000000000000000000000000000000000000000000 * 2^-big
    0.5 = 1.0000000000000000000000000000000000000000000000000000 * 2^-1
    1.0 = 1.0000000000000000000000000000000000000000000000000000 * 2^0
sqrt(2) = 1.0110101000001001111001100110011111110011101111001101 * 2^0
    1.5 = 1.1000000000000000000000000000000000000000000000000000 * 2^0
    2.0 = 1.0000000000000000000000000000000000000000000000000000 * 2^1
    2.5 = 1.0100000000000000000000000000000000000000000000000000 * 2^1
    3.0 = 1.1000000000000000000000000000000000000000000000000000 * 2^1
    _pi = 1.1001001000011111101101010100010001000010110100011000 * 2^1
    etc.

In double precision, there are 53 binary digits of precision. One of the digits is written to the left of binary point, and the remaining 52 are written to the right. Next observe that the 52 binary digits to the right of the binary point can be written in 52/4=13 hexadecimal digits. That is exactly what %21x does:

    0.0 = +0.0000000000000X-3ff
    0.5 = +1.0000000000000X-001
    1.0 = +1.0000000000000X+000
sqrt(2) = +1.6a09e667f3bcdX+000
    1.5 = +1.8000000000000X+000
    2.0 = +1.0000000000000X+001
    2.5 = +1.4000000000000X+001
    3.0 = +1.8000000000000X+001
    _pi = +1.921fb54442d18X+001

You could perform the binary-to-hexadecimal translation for yourself. Consider _pi. The first group of four binary digits after the binary point are 1001, and 9 appears after the binary point in the %21x result. The second group of four are 0010, and 2 appears in the %21x result.
The %21x result is an exact representation of the underlying binary, and thus you are equally entitled to think in either base.

In single precision, the rule is the same:

z = a * 2^b, 1<=a<2 or a==0

But this time, only 24 binary digits are provided for a, and so we have

    0.0 = 0.00000000000000000000000 * 2^-big
    0.5 = 1.00000000000000000000000 * 2^-1
    1.0 = 1.00000000000000000000000 * 2^0
sqrt(2) = 1.01101010000010011110011 * 2^0
    1.5 = 1.10000000000000000000000 * 2^0
    2.0 = 1.00000000000000000000000 * 2^1
    2.5 = 1.01000000000000000000000 * 2^1
    3.0 = 1.10000000000000000000000 * 2^1
    _pi = 1.10010010000111111011011 * 2^1
    etc.

In single precision, there are 24-1=23 binary digits of precision to the right of the binary point, and 23 is not divisible by 4. If we try to convert to base-16, we end up with

sqrt(2) = 1.0110 1010 0000 1001 1110 011   * 2^0
          1.   6    a    0    9    e    ?  * 2^0

To fill in the last digit, we could recognize that we can pad on an extra 0 because we are to the right of the binary point. For example, 1.101 == 1.1010. If we padded on the extra 0, we have

sqrt(2) = 1.0110 1010 0000 1001 1110 0110  * 2^0
          1.   6    a    0    9    e    6  * 2^0

That is precisely the result %21x shows us:

. display %21x float(sqrt(2))
+1.6a09e60000000X+000

although we might wish that %21x would omit the 0s that aren’t really there, and instead display this as +1.6a09e6X+000.

The problem with this solution is that it can be misleading because the last digit looks like it contains four binary digits when in fact it contains only three. To show how easily you can be misled, look at _pi in double and float precisions:

. display %21x _pi _newline %21x float(_pi)
+1.921fb54442d18X+001
+1.921fb60000000X+001
        ^
  digit incorrectly rounded?

The computer rounded the last digit up from 5 to 6. The digits after the rounded-up digit in the full-precision result, however, are 0.4442d18, and are clearly less than 0.8 (1/2). Shouldn’t the rounded result be 1.921fb5X+001? The answer is that yes, 1.921fb5X+001 would be a better result if we had 6*4=24 binary digits to the right of the binary point. But we have only 23 digits; correctly rounding to 23 binary digits and then translating into base-16 results in 1.921fb6X+001. Because of the missing binary digit, the last base-16 digit can only take on the values 0, 2, 4, 6, 8, a, c, and e.

The computer performs the rounding in binary. Look at the relevant piece of this double-precision number in binary:

+1.921f   b    5    4    4    42d18X+001      number
       1011 0101 0100 0100 0100               expansion into binary
       1011 01?x xxxx xxxx xxxxxxxx           thinking about rounding
       1011 011x xxxx xxxx xxxxxxxx           performing rounding
+1.921f   b    6                   X+001      convert to base-16

The part I have converted to binary in the second line is around the part to be rounded. In the third line, I’ve put x’s under the part we will have to discard to round this double into a float. The x’d out part — 10100… — is clearly greater than 1/2, so the last digit (where I put a question mark) must be rounded up. Thus, _pi in float precision rounds to 1.921fb6+X001, just as the computer said.

Float precision does not play much of a role in Stata despite the fact that most users store their data as floats. Regardless of how data are stored, Stata makes all calculations in double precision, and float provides more than enough precision for most data applications. The U.S. deficit in 2011 is projected to be $1.5 trillion. One hopes that a grand total of $26,624 — the error that would be introduced by storing this projected deficit in float precision — would not be a significant factor in any lawmaker’s decision concerning the issue. People in the U.S. are said to work about 40 hours per week, or roughly 0.238 of the hours in a week. I doubt that number is accurate to 0.4 milliseconds, the error that float would introduce in recording the fraction. A cancer survivor might live 350.1 days after a treatment, but we would introduce an error of roughly 1/2 second if we record the number as a float. One might question whether the instant of death could even conceptually be determined that accurately. The moon is said to be 384.401 thousand kilometers from the Earth. Record in 1,000s of kilometers in float, and the error is almost 1 meter. At its closest and farthest, the moon is 356,400 and 406,700 kilometers away. Most fundamental constants of the universe are known only to a few parts in a million, which is to say, to less than float precision, although we do know the speed of light in a vacuum to one decimal digit beyond float accuracy; it’s 299,792.458 kilometers per second. Round that to float and you’ll be off by 0.01 km/s.
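
If you want to check one of those figures yourself, here is a quick sketch for the deficit example. The float version differs from the double by exactly $26,624, and the trailing zeros in its %21x display are the giveaway that precision was lost:

. display %21x 1.5e12
+1.5d3ef79800000X+028

. display %21x float(1.5e12)
+1.5d3ef80000000X+028

. display float(1.5e12) - 1.5e12
26624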

The largest integer that can be recorded without rounding in float precision is 16,777,215. The largest integer that can be recorded without rounding in double precision is 9,007,199,254,740,991.
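
If you want to verify the float limit, here is a small check; 16,777,217 = 2^24 + 1 is the first integer that float cannot hold exactly, and it rounds back down to 2^24:

. display float(16777215) == 16777215
1

. display float(16777217) == 16777217
0

. display %21x float(16777217)
+1.0000000000000X+018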

People working with dollar-and-cent data in Stata usually find it best to use doubles, both to avoid rounding issues and because the total might exceed $167,772.15. Rounding issues of 0.01, 0.02, etc., are inherent when working with binary floating point, regardless of precision. To avoid those problems entirely, record the amounts in pennies and store them as doubles. You will then have no difficulty with sums up to $90,071,992,547,409.91, which is to say, about $90 trillion. That’s nine quadrillion pennies. In my childhood, I thought a quadrillion just meant a lot, but it has a formal definition.

All of which is a long way from where I started, but now you are an expert in understanding binary floating-point numbers the way a scientific programmer needs to understand them: z=a*2^b. You are nearly all the way to understanding the IEEE 754-2008 standard. That standard merely states how a and b are packed into 32 and 64 bits, and the entire point of %21x is to avoid those details because, packed together, the numbers are unreadable by humans.


How to read the %21x format

%21x is a Stata display format, just as are %f, %g, %9.2f, %td, and so on. You could put %21x on any variable in your dataset, but that is not its purpose. Rather, %21x is for use with Stata’s display command for those wanting to better understand the accuracy of the calculations they make. We use %21x frequently in developing Stata.

%21x produces output that looks like this:

. display %21x 1
+1.0000000000000X+000

. display %21x 2
+1.0000000000000X+001

. display %21x 10
+1.4000000000000X+003

. display %21x sqrt(2)
+1.6a09e667f3bcdX+000

All right, I admit that the result is pretty unreadable to the uninitiated. The purpose of %21x is to show floating-point numbers exactly as the computer stores them and thinks about them. In %21x’s defense, it is more readable than how the computer really records floating-point numbers, yet it loses none of the mathematical essence. Computers really record floating-point numbers like this:

      1 = 3ff0000000000000
      2 = 4000000000000000
     10 = 4024000000000000
sqrt(2) = 3ff6a09e667f3bcd

Or more correctly, they record floating-point numbers in binary, like this:

      1 = 0011111111110000000000000000000000000000000000000000000000000000
      2 = 0100000000000000000000000000000000000000000000000000000000000000
     10 = 0100000000100100000000000000000000000000000000000000000000000000
sqrt(2) = 0011111111110110101000001001111001100110011111110011101111001101

By comparison, %21x is a model of clarity.

The above numbers are 8-byte floating point, also known as double precision, encoded in IEEE 754-2008 binary64 format with the bytes written in big endian order. Big endian means that the bytes are ordered, left to right, from most significant to least significant. Some computers store floating-point numbers in little endian format — bytes ordered from least significant to most significant — and then the same numbers look like this:

      1 = 000000000000f03f
      2 = 0000000000000040
     10 = 0000000000002440
sqrt(2) = cd3b7f669ea0f63f

or:

      1 = 0000000000000000000000000000000000000000000000001111000000111111
      2 = 0000000000000000000000000000000000000000000000000000000001000000
     10 = 0000000000000000000000000000000000000000000000000010010001000000
sqrt(2) = 1100110100111011011111110110011010011110101000001111011000111111

Regardless of that, %21x produces the same output:

. display %21x 1
+1.0000000000000X+000

. display %21x 2
+1.0000000000000X+001

. display %21x 10
+1.4000000000000X+003

. display %21x sqrt(2)
+1.6a09e667f3bcdX+000

Binary computers store floating-point numbers as a number pair, (a, b); the desired number z is encoded

z = a * 2^b

For example,

 1 = 1.00 * 2^0
 2 = 1.00 * 2^1
10 = 1.25 * 2^3

The number pairs are encoded in the bit patterns, such as 00111111…01, above.

I’ve written the components a and b in decimal, but for reasons that will become clear, we need to preserve the essential binaryness of the computer’s number. We could write the numbers in binary, but they will be more readable if we represent them in base-16:

             base-16
base-10      floating point
      1  =   1.00 * 2^0
      2  =   1.00 * 2^1
     10  =   1.40 * 2^3

“1.40?”, you ask, looking at the last row, which indicates 1.40*2^3 for decimal 10.

The period in 1.40 is not a decimal point; it is a hexadecimal point. The first digit after the hexadecimal point is the number for 1/16ths, the next is for 1/(16^2)=1/256ths, and so on. Thus, 1.40 hexadecimal equals 1 + 4*(1/16) + 0*(1/256) = 1.25 in decimal.

And that is how you read %21x values +1.0000000000000X+000, +1.0000000000000X+001, and +1.4000000000000X+003. To wit,

             base-16
base-10      floating point          %21x
      1  =   1.00 * 2^0   =   +1.0000000000000X+000
      2  =   1.00 * 2^1   =   +1.0000000000000X+001
     10  =   1.40 * 2^3   =   +1.4000000000000X+003

The mantissa is shown to the left of the X, and, to the right of the X, the exponent for the 2. %21x is nothing more than a binary variation of the %e format with which we are all familiar, for example, 12 = 1.20000e+01 = 1.2*10^1. It’s such an obvious generalization that one would guess it has existed for a long time, so excuse me when I mention that we invented it at StataCorp. If I weren’t so humble, I would emphasize that this human-readable way of representing binary floating-point numbers preserves nearly every aspect of the IEEE floating-point number. Being humble, I will merely observe that 1.40x+003 is more readable than 4024000000000000.
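
By the way, Stata will also accept numbers typed in this notation — a fact I lean on again below when I suggest typing tolerances such as 1.0x-1b — so you can check the table in the other direction: type the mantissa and exponent just as %21x showed them (dropping the leading plus sign) and let Stata convert back to base 10 for you.

. display 1.0000000000000X+000
1

. display 1.0000000000000X+001
2

. display 1.4000000000000X+003
10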

Now that you know how to read %21x, let me show you how you might use it. %21x is particularly useful for examining precision issues.

For instance, the cube root of 8 is 2; 2*2*2 = 8. And yet, in Stata, 8^(1/3) is not equal to 2:

. display 8^(1/3)
2

. assert 8^(1/3) == 2
assertion is false
r(9);

. display %20.0g 8^(1/3)
1.99999999999999978

I blogged about that previously; see How Stata calculates powers. The error is not much:

. display 8^(1/3)-2
-2.220e-16

In %21x format, however, we can see that the error is only one bit:

. display %21x 8^(1/3)
+1.fffffffffffffX+000

. display %21x 2    
+1.0000000000000X+001

I wish the answer for 8^(1/3) had been +1.0000000000001X+001, because then the one-bit error would have been obvious to you. Instead, rather than being a bit too large, the actual answer is a bit too small — one bit too small to be exact — so we end up with +1.fffffffffffffX+000.

One bit off means being off by 2^(-52), which is 2.220e-16, and which is the number we saw when we displayed 8^(1/3)-2 in base 10. So %21x did not reveal anything we could not have figured out in other ways. The nature of the error, however, is more obvious in %21x format than it is in a base-10 format.

On Statalist, the point often comes up that 0.1, 0.2, …, 0.4, 0.6, …, 0.9, 0.11, 0.12, … have no exact representation in the binary base that computers use. That becomes obvious with %21x format:

. display %21x 0.1
+1.999999999999aX-004

. display %21x 0.2
+1.999999999999aX-003

. ...

0.5 does have an exact representation, of course, as do all the negative powers of 2:

. display %21x 0.5           //  1/2
+1.0000000000000X-001

. display %21x 0.25          //  1/4
+1.0000000000000X-002

. display %21x 0.125         //  1/8
+1.0000000000000X-003

. display %21x 0.0625        // 1/16
+1.0000000000000X-004

. ...

Integers have exact representations, too:

. display %21x 1
+1.0000000000000X+000

. display %21x 2
+1.0000000000000X+001

. display %21x 3
+1.8000000000000X+001

. ...

. display %21x 10
+1.4000000000000X+003

. ...

. display %21x 10786204
+1.492b380000000X+017

. ...

%21x is a great way of becoming familiar with base-16 (equivalently, base-2), which is worth doing if you program base-16 (equivalently, base-2) computers.

Let me show you something useful that can be done with %21x.

A programmer at StataCorp has implemented a new statistical command. In four examples, the program produces the following results:

41.8479499816895
 6.7744922637939
 0.1928647905588
 1.6006311178207

Without any additional information, I can tell you that the program has a bug, and that StataCorp will not be publishing the code until the bug is fixed!

How can I know that this program has a bug without even knowing what is being calculated? Let me show you the above results in %21x format:

+1.4ec89a0000000X+005
+1.b191480000000X+002
+1.8afcb20000000X-003
+1.99c2f60000000X+000

Do you see what I see? It’s all those zeros. In randomly drawn problems, it would be unlikely that there would be all zeros at the end of each result. What is likely is that the results were somehow rounded, and indeed they were. The rounding in this case was due to using float (4-byte) precision inadvertently. The programmer forgot to include a double in the ado-file.
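
Here is a small sketch of that telltale sign, simulating a result that passed through float somewhere along the way. Another way to phrase “all those zeros” is that a suspect value x satisfies x == float(x), which a full-precision result almost never does:

. scalar x = float(sqrt(2))     // pretend this came out of the buggy program

. display %21x x
+1.6a09e60000000X+000

. display x == float(x)
1

. display sqrt(2) == float(sqrt(2))
0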

And that’s one way %21x is used.

I am continually harping on programmers at StataCorp that if they are going to program binary computers, they need to think in binary. I go ballistic when I see a comparison that’s coded as “if (abs(x-y)<1e-8) …” in an attempt to deal with numerical inaccuracy. What kind of number is 1e-8? Well, it’s this kind of number:

. display %21x 1e-8
+1.5798ee2308c3aX-01b

Why put the computer to all that work, and exactly how many digits are you, the programmer, trying to ignore? Rather than 1e-8, why not use the “nice” numbers 7.451e-09 or 3.725e-09, which is to say, 1.0x-1b or 1.0x-1c? If you do that, then I can see exactly how many digits you are ignoring. If you code 1.0x-1b, I can see you are ignoring 1b=27 binary digits. If you code 1.0x-1c, I can see you are ignoring 1c=28 binary digits. Now, how many digits do you need to ignore? How imprecise do you really think your calculation is? By the way, Stata understands numbers such as 1.0x-1b and 1.0x-1c as input, so you can type the precise number you want.
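
To make that concrete, here is the cube-root comparison from above written with a “nice” tolerance. The choice of 1.0x-1b is arbitrary; the point is only that the number of ignored binary digits, 27 of them, is right there in the code:

. display %21x 1.0x-1b
+1.0000000000000X-01b

. assert abs(8^(1/3) - 2) < 1.0x-1b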

As another example of thinking in binary, a StataCorp programmer once described a calculation he was making. At one point, the programmer needed to normalize a number in a particular way, and so calculated x/10^trunc(log10(x)), and held onto the 10^trunc(log10(x)) for denormalization later. Dividing by 10, 100, etc., may be easy for us humans, but it’s not easy in binary, and it can result in very small amounts of dreaded round-off error. And why even bother to calculate the log, which is an expensive operation? “Remember,” I said, “how floating-point numbers are recorded on a computer: z = a*2^b, where 0 <= |a| < 2. Writing in C, it’s easy to extract components. In fact, isn’t a number normalized to be between 0 and 2 even better for your purposes?” Yes, it turned out it was.
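
In Stata terms, the substitution looks something like the following sketch. The numbers and names are mine, and I am ignoring the edge cases right at exact powers of 2 and 10, where the logs can land a hair under an integer; in C, you would pull b straight out of the stored (a, b) pair with no logs at all.

. scalar x = 1234.567

. scalar d = 10^trunc(log10(x))     // normalize by a power of 10, as described

. display x/d " * " d
1.234567 * 1000

. scalar b = 2^floor(ln(x)/ln(2))   // normalize by a power of 2 instead; the divide is exact

. display x/b " * " b
1.2056318 * 1024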

Even I sometimes forget to think in binary. Just last week I was working on a problem and Alan Riley suggested a solution. I thought a while. “Very clever,” I said. “Recasting the problem in powers of 2 will get rid of that divide that caused half the problem. Even so, there’s still the pesky subtraction.” Alan looked at me, imitating a look I so often give others. “In binary,” Alan patiently explained to me, “the difference you need is the last 19 bits of the original number. Just mask out the other digits.”

At this point, many of you may want to stop reading and go off and play with %21x. If you play with %21x long enough, you’ll eventually examine the relationship between numbers recorded as Stata floats and as Stata doubles, and you may discover something you think to be an error. I will discuss that next week in my next blog posting.