Programming an estimation command in Stata: Writing a Java plugin

This post is the fourth in a series that illustrates how to plug code written in another language (like C, C++, or Java) into Stata. This technique is known as writing a plugin or as writing a dynamic-link library (DLL) for Stata.

In this post, I write a plugin in Java that implements the calculations performed by mymean_work() in mymean11.ado, discussed in Programming an estimation command in Stata: Preparing to write a plugin, and I assume that you are familiar with that material.

This post is analogous to Programming an estimation command in Stata: Writing a C plugin and to Programming an estimation command in Stata: Writing a C++ plugin. The differences are due to the plugin code being in Java instead of C or C++. I do not assume that you are familiar with the material in those posts, and much of that material is repeated here.

This is the 32nd post in the series Programming an estimation command in Stata. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.

Writing a hello-world Java plugin

Before I do any computations, I illustrate how to write and to compile a Java plugin that communicates with Stata. Code block 1 contains the code for myhellojava.ado that calls a Java plugin that makes Stata display “Hello from Java”.

Code block 1: myhellojava.ado

*! version 1.0.0 14Feb2018
program myhellojava

       version 15.1

       javacall HelloFromJava helloJavaWork, jar(hellojavawork.jar)
end

In line 6, javacall executes the helloJavaWork method of the class HelloFromJava, which is in the JAR file hellojavawork.jar.

HelloFromJava.java in code block 2 contains the code for the HelloFromJava class.

Code block 2: HelloFromJava.java

// version 1.0.0 14Feb2018
import com.stata.sfi.*;
public class HelloFromJava {
        public static int helloJavaWork(String[] args) {
                SFIToolkit.displayln("Hello from Java") ;
                return(0) ;      // Stata return code
        }
}

Line 2 imports the Stata function interface (SFI) from sfi-api.jar, which I copied to the current directory from the Stata/utilities/jar directory distributed with Stata. You should copy the version installed with your Stata to the directory your Java compiler requires.

Line 3 defines the public class HelloFromJava, specified on line 6 of myhellojava.ado.

Line 4 defines the helloJavaWork() method, which is the entry point for the plugin. The signature of the entry method must be of this form. The method is public static. The method returns an int. The method accepts a String array.

Stata treats the returned int as a return code; zero means all went well, and a nonzero value identifies an error condition. Stata will exit with the error specified in the returned int if the returned int is not zero. The String array contains the arguments passed to the plugin by javacall.

Line 5 uses the SFI method SFIToolkit.displayln() to display the String “Hello from Java” with an additional line return.

Line 6 returns 0 to Stata, so Stata will not exit with an error code.

I now discuss how to create the JAR file hellojavawork.jar from HelloFromJava.java. I use the Java command-line tools to facilitate this discussion. See Working with Java plugins (Part 1) for details about using the Eclipse integrated development environment (IDE).

In the directory that contains myhellojava.ado and HelloFromJava.java, I also have the sfi-api.jar that I copied from the Stata/utilities/jar directory. On my OS X Mac that has the command-line developer tools installed, I use javac to create HelloFromJava.class from HelloFromJava.java and sfi-api.jar by typing

javac --release 8 -classpath sfi-api.jar HelloFromJava.java

At the time of this writing, Stata works with Java 8, even though Java 9 has had its initial release. I had to specify --release 8, because the command-line tools on my machine work with Java 9 by default. You can omit this option if javac defaults to Java 8 on your machine.

To create the JAR file hellojavawork.jar from HelloFromJava.class, I typed

jar cf hellojavawork.jar HelloFromJava.class

These commands for javac and jar work on all platforms, and you can distribute the jar file made on one platform to other platforms. This cross-platform compatibility is a major advantage of Java.

To ensure that the Stata command discard drops all the Java classes currently loaded into Stata, I also delete the .class file compiled by javac before I run the ado-command that uses a Java class. On my OS X Mac, I type

rm HelloFromJava.class

Having created hellojavawork.jar and deleted HelloFromJava.class, I can execute myhellojava.ado in Stata.

Example 1: myhellojava

. myhellojava
Hello from Java

If I change HelloFromJava.java, recompile, remake the JAR file, delete the .class file, and type discard in Stata before running myhellojava, Stata will find the new version of the Java class. discard works because Stata’s Java plugin interface uses a custom class loader instead of the Java-system class loader to load a plugin’s JAR file. A problem occurs when you leave your .class files in Stata’s current working directory, because the Java-system class loader will find and load the .class files before Stata’s custom class loader can act. This problem prevents Stata’s discard command from unloading the classes, which means that you must restart Stata to unload old class definitions and to load new versions. To prevent this problem, delete the .class files before calling your Java plugin. (Alternatively, you could work with your Java code outside of Stata’s current working directory, but I prefer deleting the .class files, because they are superfluous once I have the JAR files.)

For simplicity, I have sfi-api.jar, HelloFromJava.java, myhellojava.ado, and hellojavawork.jar in the same directory. For large projects, I would put the .ado and .jar files in directories on Stata’s ADOPATH and use my IDE to manage where I put sfi-api.jar and the Java source files. For the examples in this post, I put sfi-api.jar, all my .ado files, all my Java source files, and the created .jar files into a single directory.

Getting access to the Stata data in your plugin

helloJavaWork() makes Stata display something created inside the plugin. The next step is giving the plugin access to the data in Stata. To illustrate this process, I discuss mylistjava.ado, which uses a plugin to list out observations of the specified variables.

Let’s look at the ado-code first.

Code block 3: mylistjava.ado

*! version 1.0.0  14Feb2018
program define mylistjava

    version 15.1

    syntax varlist(numeric max=3) [if] [in]
    marksample touse

    display "Variables listed:  `varlist'"
    javacall MyListJava myListJW `varlist' if `touse' `in',  jar(mylistjw.jar)

end

In line 6, syntax creates three local macros. It puts the variables specified by the user into the local macro varlist. It puts any if condition specified by the user into the local macro if. It puts any in range specified by the user into the local macro in. I specified max=3 to syntax to limit the number of variables to 3. This limitation is silly, and I would not need it for an example Stata/Mata program, but it simplifies the example Java plugin.

In line 7, marksample creates a sample-inclusion variable, and it puts the name of the sample-inclusion variable in the local macro touse. The sample-inclusion variable is zero for each excluded observation, and it is one for each included observation. marksample uses the variables in the local macro varlist, the if condition in the local macro if, and the range in the local macro in to create the sample-inclusion variable. (All three local macros were created by syntax.) An observation is excluded if any of the variables in the local macro varlist contain a missing value, if it was excluded by the condition in the local macro if, or if it was excluded by the range in the local macro in. The sample-inclusion variable is one for observations that were not excluded.

In line 9, I further simplified the Java plugin by having the ado-code display the names of the variables whose values are listed by the plugin.

In line 10, javacall calls the plugin. The entry point is the method myListJW() in the class MyListJava, which is defined in the JAR file mylistjw.jar. Because `varlist' is specified, SFI methods will be able to access the variables contained in the local macro varlist. Because if `touse' is specified, the SFI method Data.isParsedIfTrue() will return false if the sample-inclusion variable in `touse' is zero, and it will return true if the sample-inclusion variable is one. Because `in' is specified, the SFI methods Data.getObsParsedIn1() and Data.getObsParsedIn2() respectively return the first and the last observations in any user-specified in range.

Specifying `in' is not necessary to identify the sample specified by the user, because if `touse' already specifies this sample-inclusion information. However, specifying `in' can dramatically reduce the range of observations in the loop over the data, thereby speeding up the code.

The code for MyListJava is in code block 4. In a directory that contains MyListJava.java and sfi-api.jar, I created mylistjw.jar on my Mac by typing the following three lines.

javac --release 8 -classpath sfi-api.jar MyListJava.java

jar cf mylistjw.jar MyListJava.class

rm MyListJava.class

Code block 4: MyListJava.java

// version 1.0.0 14Feb2018
import com.stata.sfi.*;
public class MyListJava {
    public static int myListJW(String[] args) {

// line will be displayed by Stata
        String line  ;

// Get number of variables in varlist specified to javacall
        int  nVariables = Data.getParsedVarCount();
// Get first observation specified by an in restriction
        long firstObs   = Data.getObsParsedIn1();
// Get last observation specified by an in restriction
        long lastObs    = Data.getObsParsedIn2();

// counter for number of obs in sample
        long nObs            = 0 ;
// Loop over observations
        for (long obs = firstObs; obs <= lastObs; obs++) {
            if (!Data.isParsedIfTrue(obs)) {
                continue;
            }
// Increment counter
            ++nObs ;
            line = "" ;
// Loop over variables
            for (int j = 1; j <= nVariables; j++) {
                int varIndex = Data.mapParsedVarIndex(j);
                double value = Data.getNum(varIndex, obs);
                if (Data.isValueMissing(value)) {
                    line = "missing values encountered" ;
                    SFIToolkit.errorln(line);
                    return(416) ;
                }
                line += String.format("   %9s",
                    SFIToolkit.formatValue(value,  "%9.0g") );
            }
            SFIToolkit.displayln(line);
        }
        SFIToolkit.displayln("First observation was             " + firstObs) ;
        SFIToolkit.displayln("Last observation was              " + lastObs) ;
        SFIToolkit.displayln("Number of observations listed was " + nObs) ;

        return(0) ;
    }
}

If you are reading this post, you can read standard Java. I explain how MyListJava.java illustrates the structure of Java plugins for Stata, and I discuss the SFI methods used in the code. Complete details about the SFI are available at https://www.stata.com/java/api15/, which builds on the [P] java manual entry and the [P] javacall manual entry.

myListJW() returns zero to Stata if all went well, and it returns a nonzero error code if something went wrong. Because none of the methods called can fail, the only error condition addressed is encountering missing values, which is handled in lines 30–34. In the case of an error, line 32 uses SFIToolkit.errorln() to ensure that the error message is displayed by Stata and that it is displayed in red. SFIToolkit.displayln() is the standard display method used elsewhere in the code.

Java plugins read from or write to Stata objects using methods defined in the SFI. myListJW() does not return any results, so it has a simple structure.

  • It uses SFI methods to read from the specified sample of the data in Stata.
  • It uses standard Java and SFI methods to make Stata display observations on variables for the specified sample, and it keeps a counter of how many observations are in the specified sample.
  • It uses standard Java and SFI methods to display which was the first observation in the sample, which was the last observation in the sample, and how many observations were in the specified sample.

Now, I discuss specific parts of MyListJava.java.

Lines 10, 12, and 14 use methods of the SFI Data class. Data.getParsedVarCount() puts the number of variables specified in the varlist into nVariables. Data.getObsParsedIn1() puts the first observation specified by an in range into firstObs. Data.getObsParsedIn2() puts the last observation specified by an in range into lastObs. If an in range was not specified to javacall, firstObs will contain 1, and lastObs will contain the number of observations in the dataset.

firstObs, lastObs, and all Java variables that hold Stata observation numbers are of type long, because Stata datasets can contain more observations than would fit into a Java variable of type int.
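
The need for long can be seen in a small sketch that does not use the SFI at all. The observation count below is hypothetical; it was chosen only because it is larger than Integer.MAX_VALUE, and the class and method names are illustrative.

```java
// Sketch: why observation numbers are held in a long, not an int.
// The observation count used in main() is a made-up value for illustration.
class ObsCounterDemo {

    // Narrow a long observation number to int, as a careless cast would
    static int narrow(long obs) {
        return (int) obs;
    }

    // True if the observation number survives a round trip through int
    static boolean fitsInInt(long obs) {
        return obs == (long) narrow(obs);
    }

    public static void main(String[] args) {
        long hypotheticalObs = 3_000_000_000L;   // larger than Integer.MAX_VALUE
        System.out.println("largest int      = " + Integer.MAX_VALUE);
        System.out.println("as long          = " + hypotheticalObs);
        System.out.println("after (int) cast = " + narrow(hypotheticalObs));
        System.out.println("fits in int?       " + fitsInInt(hypotheticalObs));
    }
}
```

The cast silently wraps around to a negative number instead of raising an error, which is why the loop counters in the plugin are declared long from the start.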

Lines 20–22 ensure that we skip over observations that were excluded by the if restriction specified to javacall in line 10 of mylistjava.ado. To illustrate some details, consider example 2.

Example 2: mylistjava

. sysuse auto, clear
(1978 Automobile Data)

. mylistjava mpg trunk rep78 if trunk < 21 in 2/10
Variables listed:  mpg trunk rep78
          17          11           3
          20          16           3
          15          20           4
          20          16           3
          16          17           3
          19          13           3
First observation was             2
Last observation was              10
Number of observations listed was 6

In line 20, Data.isParsedIfTrue(obs) returns true when the if restriction specified to javacall is one for observation obs, and it returns false otherwise. In line 10 of mylistjava.ado, we see that the if restriction passed to javacall is if `touse'. As discussed above, the sample-inclusion variable in the local macro touse is zero for excluded observations, and it is one for the included observations.

The in range on line 10 of mylistjava.ado was included so that the loop over the observations in line 19 of MyListJava.java would only go from the beginning to the end of any specified in range. In example 2, instead of looping over all 74 observations in the auto dataset, the loop on line 19 of MyListJava.java only goes from 2 to 10.

In example 2, the sample-inclusion variable is 1 for 6 observations, and it is 0 for the other 68 observations. The in 2/10 range excludes observation 1 and observations 11–74. Of the 9 observations from 2 through 10, 2 are excluded because rep78 is missing, and 1 is excluded because trunk is 21.
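
The interplay between the in range and the if restriction can be mimicked outside Stata. The sketch below is only an illustration: a boolean array stands in for the sample-inclusion variable, and the class and method names are hypothetical. The included observations (2, 4, 5, 8, 9, and 10 out of 74) are the ones listed in example 2.

```java
// Sketch: how the in range shortens the loop while the if restriction
// decides which observations are used. A plain array stands in for the SFI.
class InRangeLoopDemo {

    // Returns {iterations, listed} for a loop over observations first..last
    // (1-based, inclusive), skipping observations where touse is false.
    static long[] scan(boolean[] touse, long first, long last) {
        long iterations = 0;
        long listed = 0;
        for (long obs = first; obs <= last; obs++) {
            iterations++;
            if (!touse[(int) (obs - 1)]) {
                continue;           // excluded by the if restriction
            }
            listed++;
        }
        return new long[] {iterations, listed};
    }

    // Sample-inclusion flags matching example 2: 74 observations, with
    // only observations 2, 4, 5, 8, 9, and 10 in the sample.
    static boolean[] example2Touse() {
        boolean[] touse = new boolean[74];
        int[] included = {2, 4, 5, 8, 9, 10};
        for (int obs : included) {
            touse[obs - 1] = true;
        }
        return touse;
    }

    public static void main(String[] args) {
        boolean[] touse = example2Touse();
        long[] narrow = scan(touse, 2, 10);   // in range passed along
        long[] wide   = scan(touse, 1, 74);   // in range omitted
        System.out.println("with in range:    " + narrow[0]
            + " iterations, " + narrow[1] + " listed");
        System.out.println("without in range: " + wide[0]
            + " iterations, " + wide[1] + " listed");
    }
}
```

Both loops list the same 6 observations; the only difference is that the first one visits 9 observations instead of 74.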

For comparison, all 9 observations between 2 and 10 are listed in example 3.

Example 3: list

. list mpg trunk rep78 in 2/10, separator(0)

     +---------------------+
     | mpg   trunk   rep78 |
     |---------------------|
  2. |  17      11       3 |
  3. |  22      12       . |
  4. |  20      16       3 |
  5. |  15      20       4 |
  6. |  18      21       3 |
  7. |  26      10       . |
  8. |  20      16       3 |
  9. |  16      17       3 |
 10. |  19      13       3 |
     +---------------------+

Returning to MyListJava, we see that lines 28–29 illustrate how to put the value of a Stata numeric variable into a Java variable. Note that Data.getNum() returns a double for all Stata numeric variable types. In example 2, mpg, trunk, and rep78 are all of type int in Stata.

Lines 30–34 cause myListJW() to exit with error 416 if any observation in one of the variables contains a missing value. These lines are redundant, because the sample-inclusion variable in touse specified to javacall excluded observations containing missing values. I included these lines to illustrate how I would safely exclude missing values from inside the plugin and to reiterate that Java code must carefully deal with missing values. Stata missing values are valid double precision numbers in Java. You will get wrong results if you include Stata missing values in calculations.
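
To make the last point concrete, here is a minimal sketch that runs without Stata. It assumes that the system missing value (.) is stored as the double 2^1023; that constant, the helper looksMissing(), and the class name are assumptions made for illustration. Inside a plugin, Data.isValueMissing() is the method to use.

```java
// Sketch: a Stata missing value is an ordinary, very large double in Java.
// ASSUMPTION: the system missing value "." is stored as 2^1023; inside a
// plugin, call Data.isValueMissing() rather than hard-coding this constant.
class MissingValueDemo {

    // Assumed value of Stata's system missing value
    static final double ASSUMED_MISSING = Math.scalb(1.0, 1023);

    // Hypothetical stand-in for Data.isValueMissing(): anything at or
    // above the assumed threshold is treated as missing.
    static boolean looksMissing(double value) {
        return value >= ASSUMED_MISSING;
    }

    // Naive mean that does not check for missing values
    static double naiveMean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    // Mean over nonmissing values only (NaN if every value is missing)
    static double safeMean(double[] x) {
        double sum = 0.0;
        long n = 0;
        for (double v : x) {
            if (looksMissing(v)) {
                continue;
            }
            sum += v;
            n++;
        }
        return sum / n;
    }

    public static void main(String[] args) {
        // rep78 for observations 2-4 in example 3: 3, missing, 3
        double[] rep78 = {3.0, ASSUMED_MISSING, 3.0};
        System.out.println("naive mean = " + naiveMean(rep78));
        System.out.println("safe mean  = " + safeMean(rep78));
    }
}
```

The naive mean is neither NaN nor infinite; it is just an enormous finite number, so nothing in Java signals that the result is wrong.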

Estimating the mean in a Java plugin

I now discuss the ado-command mymeanjava, which uses the myWork() method in the MyCalcs class to implement the calculations performed by mymean_work() in mymean11.ado, discussed in Programming an estimation command in Stata: Preparing to write a plugin.

The code for mymeanjava is in mymeanjava.ado, which is in code block 5.

Code block 5: mymeanjava.ado

*! version 1.0.0  14Feb2018
program define mymeanjava, eclass

    version 15.1

    syntax varlist(numeric) [if] [in]
    marksample touse
    tempname b V N

    javacall MyCalcs myWork `varlist' if `touse' `in',  ///
        jar(mycalcs.jar) args(`b' `V' `N')

    matrix colnames `b'  = `varlist'
    matrix colnames `V'  = `varlist'
    matrix rownames `V'  = `varlist'
    ereturn post `b' `V', esample(`touse')
    ereturn scalar   N   = `N'
    ereturn scalar df_r  = `N'-1
    ereturn display

end

The general structure of this program is similar to mymean10.ado and mymean11.ado, discussed in Programming an estimation command in Stata: Preparing to write a plugin.

From a bird's-eye view, mymeanjava.ado

  • parses the user input;
  • creates a sample-inclusion variable;
  • creates temporary names for objects that will hold the results;
  • calls a work program to do the calculations;
  • stores the results returned by the work program in e(); and
  • displays the results.

The main difference between mymeanjava.ado and mymean11.ado is that the work program is a Java plugin instead of a Mata function.

Lines 6 and 7 match those in mylistjava.ado, except that syntax no longer limits the number of variables to three. For a description of how these lines create the local macro varlist, the sample-inclusion variable contained in the local macro touse, and the local macro in that contains any user-specified in range, see the discussion of mylistjava.ado in Getting access to the Stata data in your plugin.

Line 8 puts temporary names into the local macros b, V, and N. We can use these names for results computed by the Java plugin and know that we will not overwrite any results that a user has stored in global Stata memory. (Recall that Stata matrices and scalars are global objects in Stata; see Using temporary names for global objects in Programming an estimation command in Stata: A first ado-command for a discussion of this topic.) In addition, Stata will drop the objects in the temporary names created by tempname when mymeanjava terminates.

Line 10 of mymeanjava.ado is similar to line 10 of mylistjava.ado. In this case, myWork() is the entry method defined in the class MyCalcs, which is in the JAR file mycalcs.jar. The details of `varlist', if `touse', and `in' were discussed above. What is new is that we use args(`b' `V' `N') to pass the temporary names to myWork().

The myWork() method

  • does the calculations;
  • puts the estimated means into a new Stata matrix whose name is in the local macro b;
  • puts the estimated variance–covariance of the estimator (VCE) into a new Stata matrix whose name is in the local macro V; and
  • puts the number of observations in the sample into the Stata scalar whose name is in the local macro N.

Lines 13–15 put the variable names on the column stripe of the vector of estimated means and on the row and column stripes of the VCE matrix. Lines 16–18 store the results in e(). Line 19 displays the results.

Before discussing the details of myWork(), let's create the plugin and run an example.

In a directory that contains MyCalcs.java, MyCalcsW.java, MyMatrix.java, MyLong.java, and sfi-api.jar, I created mycalcs.jar on my Mac by typing

javac --release 8 -classpath sfi-api.jar MyCalcs.java MyCalcsW.java MyMatrix.java MyLong.java

jar cf mycalcs.jar MyCalcs.class MyCalcsW.class MyMatrix.class MyLong.class

rm MyCalcs.class MyCalcsW.class MyMatrix.class MyLong.class

Having created mycalcs.jar, I ran example 4.

Example 4: mymeanjava

. mymeanjava mpg trunk rep78 in 1/60
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |     20.125   .6659933    30.22   0.000     18.79032    21.45968
       trunk |   14.42857   .5969931    24.17   0.000     13.23217    15.62497
       rep78 |   3.160714    .118915    26.58   0.000     2.922403    3.399025
------------------------------------------------------------------------------

I now discuss some aspects of the Java code, beginning with the class MyCalcs, which is defined in MyCalcs.java in code block 6.

Code block 6: MyCalcs.java

// version 1.0.0 14Feb2018
import com.stata.sfi.*;
public class MyCalcs {
    public static int myWork(String args[]) {
        int      rc ;
        MyMatrix bmat, vmat ;
        String   bname, vname, nname ;
        MyLong   nObs ;

        if (args.length < 3) {
            SFIToolkit.errorln("Too few arguments") ;
            return 198 ;
        }
        else {
            bname = args[0] ;
            vname = args[1] ;
            nname = args[2] ;
        }

        int  nVariables = Data.getParsedVarCount();
        long firstObs   = Data.getObsParsedIn1();
        long lastObs    = Data.getObsParsedIn2();

// create and initialize vector for sample averages
        bmat = new MyMatrix(1, nVariables);
// create and initialize matrix for VCE
        vmat = new MyMatrix(nVariables, nVariables);
// create and initialize MyLong for sample size
        nObs    = new MyLong(0) ;

// Put sample averages in bmat
        rc = MyCalcsW.myAve(bmat, firstObs, lastObs, nVariables, nObs) ;
        if (rc>0) return(rc) ;
// Put VCE in vmat
        rc = MyCalcsW.myV(bmat, vmat, firstObs, lastObs, nVariables, nObs) ;
        if (rc>0) return(rc) ;

// Copy sample average from bmat to Stata matrix bname
        rc = bmat.copyJavatoStataMatrix(bname) ;
        if (rc>0) return(rc) ;
// Copy VCE from vmat to Stata matrix vname
        rc = vmat.copyJavatoStataMatrix(vname) ;
        if (rc>0) return(rc) ;
// Copy sample size from nObs to Stata scalar nname
        rc = Scalar.setValue(nname, (double) nObs.getValue()) ;
        if (rc>0) return(rc) ;

        return(rc);
    }
}

MyCalcs.java only contains the entry method myWork(). In summary, myWork() performs the following tasks.

  1. It puts the names passed in as arguments into instances of Java String objects that can be passed to SFI methods.
  2. It puts the number of specified Stata variables into a Java variable used to loop over the variables.
  3. It puts the range of sample observations into Java variables used to loop over the observations.
  4. It creates the bmat and vmat instances of the MyMatrix class, which will hold the sample averages and the VCE.
  5. It creates the nObs instance of the MyLong class, which will hold the number of sample observations.
  6. It uses the methods MyCalcsW.myAve() and MyCalcsW.myV() to compute the results that are stored in bmat, vmat, and nObs.
  7. It uses the method copyJavatoStataMatrix() of the MyMatrix class to copy the results from bmat and vmat to new Stata matrices. The names of the new Stata matrices are the first and second arguments passed to myWork().
  8. It uses the SFI method Scalar.setValue() to copy the result from nObs to the new Stata scalar whose name was the third argument passed to myWork().

MyCalcs.java is easy to read, because I put all the details into the MyMatrix, MyCalcsW, and MyLong classes, which I discuss below.

Like all Java plugins for Stata, myWork() uses the return code rc to handle error conditions. Each method called returns zero if all went well, and it returns a nonzero error code if it could not perform the requested job. If the code returned is not zero, myWork() returns it immediately to Stata. The error messages associated with the error conditions are displayed by the methods.
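
The return-code convention can be sketched without the SFI. Everything in the block below is a hypothetical stand-in: the step methods are placeholders for calls like MyCalcsW.myAve() and copyJavatoStataMatrix(), and System.err stands in for SFIToolkit.errorln().

```java
// Sketch of the return-code convention: each step returns 0 on success or
// a nonzero Stata error code, and the caller stops at the first failure.
// All method names here are hypothetical stand-ins.
class ReturnCodeDemo {

    // Stand-in for a computation step; fails with 416 when told to
    static int computeStep(boolean missingFound) {
        if (missingFound) {
            System.err.println("missing values encountered");
            return 416;
        }
        return 0;
    }

    // Stand-in for a step that stores a result under a name;
    // fails with 198 when the name is absent
    static int storeStep(String name) {
        if (name == null || name.isEmpty()) {
            System.err.println("no name supplied");
            return 198;
        }
        return 0;
    }

    // Mirrors the shape of myWork(): run each step, return on first error
    static int work(boolean missingFound, String name) {
        int rc;

        rc = computeStep(missingFound);
        if (rc != 0) return rc;

        rc = storeStep(name);
        if (rc != 0) return rc;

        return 0;
    }

    public static void main(String[] args) {
        System.out.println("all good:       rc = " + work(false, "b"));
        System.out.println("missing values: rc = " + work(true, "b"));
        System.out.println("no name:        rc = " + work(false, ""));
    }
}
```

Because each step reports its own error message, the caller only has to pass the code along, which keeps the entry method short.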

In (4), I noted that bmat and vmat are instances of the MyMatrix class. The sample averages and the VCE are best stored in matrices. To keep things simple and self-contained, I defined a bare-bones matrix class MyMatrix that uses row-major storage and only the methods I needed. Except for the method copyJavatoStataMatrix(), the code for MyMatrix is standard Java, as can be seen in code block 7.

Code block 7: MyMatrix.java

// version 1.0.0 14Feb2018
// Notes: matrices are long vectors with row-major storage
//    The i,j element of an r x c matrix is
//    the (i-1)*c + (j-1) element of the vector
//    under zero-based indexing
import com.stata.sfi.*;
public class MyMatrix {
    int        r, c, TotalSize ;
    double[]   mat ;

    public MyMatrix(int rows, int cols) {
        r         = rows ;
        c         = cols ;
        TotalSize = rows*cols ;
        mat       = new double[TotalSize] ;

        for(int i = 0; i<TotalSize; i++) {
            mat[i] = 0.0 ;
        }
    }

    void divideByScalar(double val) {
        int  i, j ;

        for(i=0; i<r; i++) {
            for(j=0; j<c; j++) {
                mat[i*c + j] /= val ;
            }
        }
    }

// Copy matrix object values to Stata matrix smname
    int copyJavatoStataMatrix(String smname) {
        int      i, j;
        int      rc_st ;
        double   val ;
        String   msg ;

// create new Stata matrix
        rc_st = Matrix.createMatrix(smname,r,c,0) ;
        if (rc_st>0) {
            SFIToolkit.errorln("cannot create Stata matrix " + smname) ;
            return(rc_st) ;
        }
        for(i=0; i<r; i++) {
            for(j=0; j<c; j++) {
                val = mat[i*c+j] ;
// Put values in new Stata matrix
                rc_st = Matrix.storeMatrixAt(smname, i, j, val) ;
                if(rc_st>0) {
                    msg =  "{err}cannot access Stata matrix " + smname ;
                    SFIToolkit.errorln(msg) ;
                    return(rc_st) ;
                }
            }
        }
        return(rc_st) ;
    }

    double getValue(int i, int j) {
        return( mat[i*c+j]) ;
    }
// Store val into (i,j)th element
    void storeValue(int i, int j, double val) {
        mat[i*c+j] = val ;
    }
// Increment (i,j)th element by val
    void incrementByValue(int i, int j, double val) {
        mat[i*c+j] += val ;
    }

}

Lines 33–58 contain the code for copyJavatoStataMatrix(). Lines 40 and 49 use SFI methods that I have not yet discussed. Matrix.createMatrix(String sname, int rows, int cols, double val) creates a new Stata matrix with rows rows and cols columns. Each element of this matrix is initialized to value val. sname contains the name of this Stata matrix.

Matrix.storeMatrixAt(String sname, int i, int j, double val) stores the value val in row i and column j of the Stata matrix whose name is contained in sname. The row i and column j are given in zero-based indexing.
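
Row-major storage places element (i, j) of a matrix with c columns at offset i*c + j under zero-based indexing. The sketch below, with a hypothetical class name and a made-up 2 x 3 matrix, shows that the stride is the number of columns; for the row vectors and square matrices used in this post, striding by rows or by columns gives the same offsets.

```java
// Sketch: row-major storage of a rows x cols matrix in a flat array.
// The class name and the fill pattern are illustrative only.
class RowMajorDemo {

    // Offset of element (i, j), zero-based, in a matrix with cols columns
    static int offset(int i, int j, int cols) {
        return i * cols + j;
    }

    // Fill a rows x cols matrix so that element (i, j) holds 10*i + j
    static double[] fill(int rows, int cols) {
        double[] mat = new double[rows * cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                mat[offset(i, j, cols)] = 10 * i + j;
            }
        }
        return mat;
    }

    public static void main(String[] args) {
        int rows = 2, cols = 3;
        double[] mat = fill(rows, cols);
        for (int i = 0; i < rows; i++) {
            StringBuilder line = new StringBuilder();
            for (int j = 0; j < cols; j++) {
                line.append(String.format("%6.1f", mat[offset(i, j, cols)]));
            }
            System.out.println(line);
        }
    }
}
```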

In (5), I noted that I used an instance of the MyLong class to hold the number of sample observations. The primitive types in Java cannot be passed by reference, and the standard wrapper types are immutable, so I created the MyLong class to pass a long counter, nObs, to MyCalcsW.myAve(). When MyCalcsW.myAve() finishes, nObs contains the number of sample observations. The code for MyLong is standard Java, and it is given in code block 8.

Code block 8: MyLong.java

// version 1.0.0 14Feb2018
public class MyLong {
    private long value ;

    public MyLong(long j) {
        value = j ;
    }

    public long getValue() {
        return value ;
    }

    public void setValue(long j) {
        value = j;
    }

    public void incrementValue() {
        ++(value) ;
    }
}
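
The reason a holder class is needed can be seen in a short self-contained sketch. The nested Counter class below is a hypothetical stand-in written in the spirit of MyLong so that the example compiles on its own; the other names are illustrative as well.

```java
// Sketch: a primitive long is passed by value, so a method cannot update
// the caller's counter; a small mutable holder object can be updated.
class PassByValueDemo {

    // Minimal mutable holder, in the spirit of MyLong
    static final class Counter {
        private long value;
        Counter(long start) { value = start; }
        long getValue() { return value; }
        void incrementValue() { ++value; }
    }

    // Increments only the method's local copy of n
    static void bumpPrimitive(long n) {
        n++;
    }

    // Increments the object that the caller also refers to
    static void bumpHolder(Counter n) {
        n.incrementValue();
    }

    // Caller's primitive after calling bumpPrimitive() the given number of times
    static long primitiveAfter(int times) {
        long n = 0;
        for (int i = 0; i < times; i++) {
            bumpPrimitive(n);
        }
        return n;
    }

    // Caller's holder after calling bumpHolder() the given number of times
    static long holderAfter(int times) {
        Counter n = new Counter(0);
        for (int i = 0; i < times; i++) {
            bumpHolder(n);
        }
        return n.getValue();
    }

    public static void main(String[] args) {
        System.out.println("primitive after 3 calls: " + primitiveAfter(3));
        System.out.println("holder after 3 calls:    " + holderAfter(3));
    }
}
```

The primitive stays at its starting value, while the holder reflects every increment, which is the behavior the plugin relies on when it counts sample observations in one method and reads the count in another.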

In (6), I noted that the methods MyCalcsW.myAve() and MyCalcsW.myV() compute the sample averages and the VCE. These are methods in the class MyCalcsW, whose code is given in code block 9.

Code block 9: MyCalcsW.java

// version 1.0.0 14Feb2018
import com.stata.sfi.*;
public class MyCalcsW {

    public static int myAve( MyMatrix bmat, long firstObs, long lastObs,
        int nVariables, MyLong nObs) {

        int    rc, varIndex ;
        double value ;
        String msg ;

        rc = 0 ;
// Loop over observations
        for(long obs=firstObs; obs<=lastObs; obs++) {
            if (!Data.isParsedIfTrue(obs)) {
                continue;
            }
            nObs.incrementValue() ;

// Loop over variables
            for(int var = 1; var<=nVariables; var++) {
// get the real variable index for parsed variable -var-
                varIndex = Data.mapParsedVarIndex(var);
// Put value of observation obs on variable varIndex into value
                value    = Data.getNum(varIndex, obs);

// Exit with error 
                if (Data.isValueMissing(value)) {
                    msg = "{err}missing values encountered" ;
                    SFIToolkit.errorln(msg);
                    return(416) ;
                }
// Increment sample average vector
                bmat.incrementByValue(0, var-1, value) ;
            }
        }
// Divide sample average vector by nObs
        bmat.divideByScalar((double) nObs.getValue()) ;

        return (rc) ;
    }

    public static int myV( MyMatrix bmat, MyMatrix vmat, long firstObs,
        long lastObs, int nVariables, MyLong nObs) {

        int      rc, varIndex  ;
        MyMatrix emat ;
        double   value ;
        String   msg ;

        rc = 0 ;
// Create and initialize vector for observation level errors
        emat = new MyMatrix(1, nVariables);
// Loop over observations
        for(long obs=firstObs; obs<=lastObs; obs++) {
            if (!Data.isParsedIfTrue(obs)) {
                continue;
            }

// Loop over variables
            for(int var = 1; var<=nVariables; var++) {
// get the real variable index for parsed variable -var-
                varIndex = Data.mapParsedVarIndex(var);
// Put value of observation obs on variable varIndex into value
                value    = Data.getNum(varIndex, obs);

                if (Data.isValueMissing(value)) {
                    msg = "{err}missing values encountered" ;
                    SFIToolkit.errorln(msg);
                    return(416) ;
                }
                emat.storeValue(0, (var-1), bmat.getValue(0,(var-1)) - value) ;
            }

            for(int j = 0; j<nVariables; j++) {
                for(int j2 = 0; j2<=j; j2++) {
                    vmat.incrementByValue(j, j2,
                        (emat.getValue(0,j)*emat.getValue(0,j2)));
                }
            }
        }

        for(int j = 0; j<nVariables; j++) {
            for(int j2 = j+1 ; j2<nVariables; j2++) {
                vmat.storeValue(j, j2, vmat.getValue(j2, j) ) ;
            }
        }

        double n2 = (double) nObs.getValue() ;
        n2        = n2*(n2-1)  ;
        vmat.divideByScalar( n2 ) ;

        return(rc) ;
    }

}

MyCalcsW.myAve() is a Java implementation of the Mata function MyAve(), discussed in Programming an estimation command in Stata: Preparing to write a plugin. It puts the sample averages into the bmat instance of the MyMatrix class, and it puts the number of observations in the sample into nObs. Most of the code for this method is standard Java or uses SFI methods that I have already discussed. Lines 18, 34, and 38 deserve comment.

Line 18 of MyCalcsW.java uses the method incrementValue() of MyLong to increment the number of observations stored in nObs. It increments the current value of nObs by one.

Line 34 uses the incrementByValue() method of MyMatrix. When calculating the sample average and storing it in the jth element of a vector named b, one needs to store b[j] + value into b[j]. In other words, one increments the amount of the jth element in b by value. bmat.incrementByValue(0,var-1, value) increments the element var-1 in bmat by value.

Line 38 uses the divideByScalar() method of MyMatrix. bmat.divideByScalar(z) replaces each element of bmat with that element divided by the amount z.

MyCalcsW.myV() is a Java implementation of the Mata function MyV(), discussed in Programming an estimation command in Stata: Preparing to write a plugin. It puts the VCE into the vmat instance of the MyMatrix class. Most of the code for this method is standard Java or uses methods that I have already discussed. Lines 72, 77, and 85 use the MyMatrix methods storeValue() and getValue(). vmat.storeValue(i, j, z) stores the value z into element (i, j) of the vmat instance of MyMatrix. vmat.getValue(i, j) returns the value stored in element (i, j) of the vmat instance of MyMatrix.
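
The arithmetic performed by the two methods can be checked on a tiny in-memory dataset. The sketch below is not the plugin code: it replaces the SFI calls and MyMatrix with plain Java arrays, and the class name and the three observations on two variables are made up for illustration.

```java
// Sketch: sample means and the VCE of the sample means, following the
// two-pass logic described above but on an in-memory array.
class MeanVceDemo {

    // Column means of data (rows are observations)
    static double[] means(double[][] data) {
        int k = data[0].length;
        double[] b = new double[k];
        for (double[] row : data) {
            for (int j = 0; j < k; j++) {
                b[j] += row[j];
            }
        }
        for (int j = 0; j < k; j++) {
            b[j] /= data.length;
        }
        return b;
    }

    // VCE of the means: sum over observations of the outer product of the
    // deviations from the means, divided by n*(n-1)
    static double[][] vce(double[][] data, double[] b) {
        int k = b.length;
        double n = data.length;
        double[][] v = new double[k][k];
        double[] e = new double[k];
        for (double[] row : data) {
            for (int j = 0; j < k; j++) {
                e[j] = b[j] - row[j];
            }
            // accumulate the lower triangle only
            for (int j = 0; j < k; j++) {
                for (int j2 = 0; j2 <= j; j2++) {
                    v[j][j2] += e[j] * e[j2];
                }
            }
        }
        // mirror the lower triangle into the upper triangle
        for (int j = 0; j < k; j++) {
            for (int j2 = j + 1; j2 < k; j2++) {
                v[j][j2] = v[j2][j];
            }
        }
        for (int j = 0; j < k; j++) {
            for (int j2 = 0; j2 < k; j2++) {
                v[j][j2] /= n * (n - 1);
            }
        }
        return v;
    }

    public static void main(String[] args) {
        double[][] data = {{1, 2}, {2, 4}, {3, 9}};   // hypothetical data
        double[] b = means(data);
        double[][] v = vce(data, b);
        System.out.println("means: " + b[0] + " " + b[1]);
        System.out.println("VCE row 1: " + v[0][0] + " " + v[0][1]);
        System.out.println("VCE row 2: " + v[1][0] + " " + v[1][1]);
    }
}
```

For these data the means are 2 and 5, the sums of products of deviations are 2, 7, and 26, and dividing by n(n-1) = 6 gives a VCE with 1/3 and 13/3 on the diagonal and 7/6 off the diagonal.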

Done and undone

I showed how to implement a Java plugin that does the calculations performed by Mata work functions in mymean10.ado and mymean11.ado, as discussed in Programming an estimation command in Stata: Preparing to write a plugin.

Thanks

Thanks to James Hassell of StataCorp for sharing some of his Java knowledge and experience.