## Programming an estimation command in Stata: Handling factor variables in optimize()

\(\newcommand{\xb}{{\bf x}}
\newcommand{\betab}{\boldsymbol{\beta}}\)I discuss a method for handling factor variables when performing nonlinear optimization using **optimize()**. After illustrating the issue caused by factor variables, I present a method and apply it to an example using **optimize()**.

This is the twentieth post in the series **Programming an estimation command in Stata**. I recommend that you start at the beginning. See Programming an estimation command in Stata: A map to posted entries for a map to all the posts in this series.

**How poisson handles factor variables**

Consider the Poisson regression in which I include a full set of indicator variables created from the categorical variable **kids** and a constant term.

**Example 1: Collinear factor variables**

```
. clear all

. use accident3

. poisson accidents cvalue ibn.kids traffic, coeflegend
note: 3.kids omitted because of collinearity
Iteration 0:   log likelihood = -546.35782
Iteration 1:   log likelihood = -545.11016
Iteration 2:   log likelihood = -545.10898
Iteration 3:   log likelihood = -545.10898

Poisson regression                              Number of obs     =        505
                                                LR chi2(5)        =     361.62
                                                Prob > chi2       =     0.0000
Log likelihood = -545.10898                     Pseudo R2         =     0.2491

------------------------------------------------------------------------------
   accidents |      Coef.   Legend
-------------+----------------------------------------------------------------
      cvalue |  -.6582924   _b[cvalue]
             |
        kids |
          0  |   3.233932   _b[0bn.kids]
          1  |   1.571582   _b[1.kids]
          2  |   1.659241   _b[2.kids]
          3  |          0   _b[3o.kids]
             |
     traffic |   .1383977   _b[traffic]
       _cons |  -2.518175   _b[_cons]
------------------------------------------------------------------------------
```

The full set of indicator variables is collinear with the constant term. The output shows that no variables were dropped, but the **o** in the name **3o.kids** indicates that **3.kids** was omitted. Omitted variables are not dropped; instead, their coefficients are constrained to be zero.

Specifying variables as omitted instead of dropping them allows postestimation features such as **margins** to work properly.

For the case in example 1, **poisson** maximizes the log-likelihood function subject to the constraint that \(\beta_{3.kids}=0\). In terms of the parameter vector \(\betab\), I represent this constraint as

\[
\left[\begin{matrix}
0&0&0&0&1&0&0
\end{matrix}\right]
\betab' = 0
\]

where \(\betab=(\beta_{cvalue}, \beta_{0.kids}, \beta_{1.kids}, \beta_{2.kids}, \beta_{3.kids}, \beta_{traffic}, \beta_{\_cons})\).

In general, I can represent \(q\) linear equality constraints on a \(1\times k\) parameter vector as

\[
{\bf C}\betab' = {\bf c}
\]

where \({\bf C}\) is a \(q\times k\) matrix and \({\bf c}\) is a \(q\times 1\) vector. These constraints are conveniently represented as \(\widetilde{\bf C}=\left[{\bf C},{\bf c}\right]\).
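To make the augmented-matrix notation concrete, here is a minimal sketch in plain Python (rather than Mata, so it stands alone) that builds \(\widetilde{\bf C}=\left[{\bf C},{\bf c}\right]\) for the single constraint of example 1; the variable names are mine, not Stata's.

```python
# Build the augmented constraint matrix Ct = [C, c] for the single
# constraint beta_{3.kids} = 0 of example 1: C is the 1 x 7 row vector
# with a 1 in position 5 (the 3.kids coefficient), and c is the 1 x 1
# zero vector.
k = 7              # number of parameters
omit_position = 5  # 1-based position of the 3.kids coefficient

C = [[1.0 if j == omit_position else 0.0 for j in range(1, k + 1)]]
c = [[0.0]]

# Augment each row of C with the corresponding element of c
Ct = [row + crow for row, crow in zip(C, c)]
print(Ct)  # [[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]]
```

In Mata, the same one-row augmented matrix is produced by `Ct = e(5, 7), 0`, because Mata's **e(i, n)** returns a \(1\times n\) unit row vector.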

I now show how to use **optimize()** to solve optimization problems subject to linear equality constraints by putting \(\widetilde{\bf C}\) into the **optimize** object. In code block 1, I use **optimize()** to maximize the Poisson log-likelihood function for the problem in example 1. Code block 1 augments example 3 in Programming an estimation command in Stata: Using optimize() to estimate Poisson parameters by using **optimize_init_constraints()** to impose a linear equality constraint on the coefficient vector.

**Code block 1: Linear equality constraints in optimize()**

```
mata:
void plleval3(real scalar todo, real vector b,  ///
              real vector y, real matrix X,     ///
              val, grad, hess)
{
    real vector  xb

    xb  = X*b'
    val = -exp(xb) + y:*xb - lnfactorial(y)
}

y = st_data(., "accidents")
X = st_data(., "cvalue ibn.kids traffic")
X = X,J(rows(X), 1, 1)

C  = e(5, 7)
c  = 0
Ct = C,c

S = optimize_init()
optimize_init_argument(S, 1, y)
optimize_init_argument(S, 2, X)
optimize_init_evaluator(S, &plleval3())
optimize_init_evaluatortype(S, "gf0")
optimize_init_params(S, J(1, 7, .01))
optimize_init_constraints(S, Ct)

bh = optimize(S)
optimize_result_params(S)
end
```

Only line 13, lines 16–18, and line 26 differ from the code in example 3 in Programming an estimation command in Stata: Using optimize() to estimate Poisson parameters. Line 13 illustrates that **st_data()** can create the indicator variables from the factor-variable specification **ibn.kids**. Lines 16–18 define the constraint matrix \(\widetilde{\bf C}\) for this problem. Line 26 puts \(\widetilde{\bf C}\) into the **optimize()** object **S**, which causes **optimize()** to maximize the Poisson log-likelihood function computed by the evaluator **plleval3()** subject to the constraints specified in the matrix **Ct**.

Example 2 illustrates that code block 1 reproduces the point estimates reported in example 1.

**Example 2: Linear equality constraints in optimize()**

```
. do pc

. mata:
------------------------------------------------- mata (type end to exit) -----
: void plleval3(real scalar todo, real vector b,  ///
>               real vector y, real matrix X,     ///
>               val, grad, hess)
> {
>     real vector  xb
>
>     xb  = X*b'
>     val = -exp(xb) + y:*xb - lnfactorial(y)
> }
note: argument todo unused
note: argument grad unused
note: argument hess unused

: y = st_data(., "accidents")

: X = st_data(., "cvalue ibn.kids traffic")

: X = X,J(rows(X), 1, 1)

: C  = e(5, 7)

: c  = 0

: Ct = C,c

: S = optimize_init()

: optimize_init_argument(S, 1, y)

: optimize_init_argument(S, 2, X)

: optimize_init_evaluator(S, &plleval3())

: optimize_init_evaluatortype(S, "gf0")

: optimize_init_params(S, J(1, 7, .01))

: optimize_init_constraints(S, Ct)

: bh = optimize(S)
Iteration 0:   f(p) = -845.47138
Iteration 1:   f(p) = -572.68676
Iteration 2:   f(p) = -545.68381
Iteration 3:   f(p) = -545.11241
Iteration 4:   f(p) = -545.10898
Iteration 5:   f(p) = -545.10898

: optimize_result_params(S)
                 1              2              3              4
    +-------------------------------------------------------------
  1 |  -.6582923624    3.233932519    1.571581623     1.65924145
    +-------------------------------------------------------------
                 5              6              7
     ----------------------------------------------+
  1              0      .13839766   -2.518174926    |
     ----------------------------------------------+

: end
--------------------------------------------------------------------------------

.
end of do-file
```

Code block 1 shows how to use a linear equality constraint to handle collinear variables when we know which variables are omitted. In the code for an estimation command, we must

- find which variables will be omitted, and
- create the constraint matrix \(\widetilde{\bf C}\) that imposes the constraints implied by omitting these variables.

Example 3 illustrates that **_rmcoll** stores in **r(varlist)** a variable list identifying which variables will be omitted, thereby solving problem 1.

**Example 3: Using _rmcoll to identify omitted variables**

```
. _rmcoll cvalue ibn.kids traffic, expand
note: 3.kids omitted because of collinearity

. return list

scalars:
              r(k_omitted) =  1

macros:
                r(varlist) : "cvalue 0bn.kids 1.kids 2.kids 3o.kids traffic"

. local cnames "`r(varlist)' _cons"
```

I specified the option **expand** so that **_rmcoll** would expand any factor variables. The expanded variable list returned in **r(varlist)** identifies **3.kids** as a variable that must be omitted. I then put this expanded variable list, augmented by the name **_cons**, in the local macro **cnames**.

Here is an outline for the solution to problem 2 that I present in examples 4–6.

- In example 4, I create the Stata vector **bt**, whose column names are contained in **cnames**.
- In example 5, I use **_ms_omit_info** to create the Stata vector **bto**, which indicates which variables will be omitted from **bt**.
- In example 6, I create a Mata matrix specifying the constraints from **bto**.

Now for the details, beginning with example 4.

**Example 4: Putting the coefficient names on a Stata vector**

```
. matrix bt = J(1, 7, 0)

. matrix colnames bt = `cnames'

. matrix list bt

bt[1,7]
                    0.     1.     2.    3o.
    cvalue   kids   kids   kids   kids  traffic  _cons
r1       0      0      0      0      0       0      0
```

**cnames** contains the names of the coefficients for this problem, so I create a conformable row vector **bt**, make **cnames** the column names on **bt**, and display **bt**. The values in **bt** do not matter; the column names are the important information.

In example 5, **_ms_omit_info** uses the column names on a Stata vector to create the vector **r(omit)**, which specifies which variables are omitted.

**Example 5: Creating a vector that indicates omitted variables**

```
. matrix bt = J(1, 7, 0)

. matrix colnames bt = `cnames'

. _ms_omit_info bt

. return list

scalars:
                 r(k_omit) =  1

matrices:
                   r(omit) :  1 x 7

. matrix bto = r(omit)

. matrix list bto

bto[1,7]
    c1  c2  c3  c4  c5  c6  c7
r1   0   0   0   0   1   0   0
```

An element of **r(omit)** is 1 if the corresponding variable is omitted and 0 if it is not. I put a copy of **r(omit)** in **bto**.

In example 6, I create a constraint matrix from **bto**. The loop in example 6 will create the constraint matrix implied by any **r(omit)** vector created by **_ms_omit_info**.

**Example 6: Creating a constraint matrix from r(omit)**

```
. mata:
------------------------------------------------- mata (type end to exit) -----
: mo = st_matrix("bto")

: ko = sum(mo)

: p  = cols(mo)

: if (ko>0) {
>     Cm = J(0, p, .)
>     for(j=1; j<=p; j++) {
>         if (mo[j]==1) {
>             Cm = Cm \ e(j, p)
>         }
>     }
>     Cm = Cm, J(ko, 1, 0)
> }
> else {
>     Cm = J(0, p+1, .)
> }

: "Constraint matrix is "
  Constraint matrix is

: Cm
       1   2   3   4   5   6   7   8
    +---------------------------------+
  1 |  0   0   0   0   1   0   0   0  |
    +---------------------------------+

: end
--------------------------------------------------------------------------------
```

After copying **bto** to the Mata vector **mo**, I put the number of constraints in the scalar **ko** and the number of parameters in **p**. If there are constraints, I initialize **Cm** to be a matrix with zero rows and **p** columns, use a **for** loop to iteratively append a row for each constraint identified in **mo**, and finish by appending a **ko** \(\times\) 1 column of zeros onto **Cm**. If there are no constraints, I put a matrix with zero rows and **p+1** columns in **Cm**.

Regardless of whether there are any omitted variables, I can put the **Cm** matrix created by the method in example 6 into an **optimize()** object. If there are no omitted variables, **Cm** will have zero rows, and no constraints will be imposed. If there are omitted variables, **Cm** will have **ko** rows, and the constraints for the omitted variables will be imposed.
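The construction can be condensed into a short language-neutral sketch. The following Python analogue of the Mata loop (the helper name `constraints_from_omit` is mine, not a Stata or Mata function) maps an omit indicator vector to the rows of \(\widetilde{\bf C}\), returning zero rows when nothing is omitted:

```python
# Python analogue of the Mata constraint-building loop: one unit row
# e(j, p), augmented with a 0, per omitted coefficient. An empty list
# plays the role of Mata's zero-row constraint matrix.
def constraints_from_omit(mo):
    p = len(mo)
    return [[1.0 if i == j else 0.0 for i in range(p)] + [0.0]
            for j in range(p) if mo[j] == 1]

print(constraints_from_omit([0, 0, 0, 0, 1, 0, 0]))
# [[0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]]
print(constraints_from_omit([0, 0, 0]))  # no omitted variables -> []
```

One difference worth noting: Mata's zero-row matrix still carries **p+1** columns, so **optimize_init_constraints()** can always accept it, whereas the Python empty list has no column dimension.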

Code block 2 combines these pieces into a coherent example.

**Code block 2: Putting it all together**

```
clear all
use accident3
local depvar "accidents"
local indepvars "cvalue ibn.kids traffic"
_rmcoll `indepvars', expand
local cnames "`r(varlist)' _cons"
local p : word count `cnames'
matrix bt = J(1, `p', 0)
matrix colnames bt = `cnames'
_ms_omit_info bt
matrix bto = r(omit)

mata:
void plleval3(real scalar todo, real vector b,  ///
              real vector y, real matrix X,     ///
              val, grad, hess)
{
    real vector  xb

    xb  = X*b'
    val = -exp(xb) + y:*xb - lnfactorial(y)
}

y = st_data(., "`depvar'")
X = st_data(., "`indepvars'")
X = X,J(rows(X), 1, 1)

mo = st_matrix("bto")
ko = sum(mo)
p  = cols(mo)
if (ko>0) {
    Ct = J(0, p, .)
    for(j=1; j<=p; j++) {
        if (mo[j]==1) {
            Ct = Ct \ e(j, p)
        }
    }
    Ct = Ct, J(ko, 1, 0)
}
else {
    Ct = J(0, p+1, .)
}

S = optimize_init()
optimize_init_argument(S, 1, y)
optimize_init_argument(S, 2, X)
optimize_init_evaluator(S, &plleval3())
optimize_init_evaluatortype(S, "gf0")
optimize_init_params(S, J(1, 7, .01))
optimize_init_constraints(S, Ct)

bh = optimize(S)
optimize_result_params(S)
end
```

Lines 1–2 drop all the objects that I created in previous examples and read the **accident3** dataset into memory. Lines 3–4 create locals to hold the dependent variable and the independent variables.

Lines 5–6 use **_rmcoll** to identify which variables should be omitted and put the list of names in the local macro **cnames**, as in example 3. Lines 7–9 create **bt**, whose column names specify which variables should be omitted, as in example 4. Lines 10–11 create **bto**, whose entries specify which variables should be omitted, as in example 5. Lines 28–42 create the constraint matrix **Ct** from **bto**, as in example 6.

Line 50 puts **Ct** into the **optimize()** object.

Example 7 illustrates that the code in code block 2 reproduces the previously obtained results.

**Example 7: Putting it all together**

```
. do pc2

. clear all

. use accident3

. local depvar "accidents"

. local indepvars "cvalue ibn.kids traffic"

. _rmcoll `indepvars', expand
note: 3.kids omitted because of collinearity

. local cnames "`r(varlist)' _cons"

. local p : word count `cnames'

. matrix bt = J(1, `p', 0)

. matrix colnames bt = `cnames'

. _ms_omit_info bt

. matrix bto = r(omit)

.
. mata:
------------------------------------------------- mata (type end to exit) -----
: void plleval3(real scalar todo, real vector b,  ///
>               real vector y, real matrix X,     ///
>               val, grad, hess)
> {
>     real vector  xb
>
>     xb  = X*b'
>     val = -exp(xb) + y:*xb - lnfactorial(y)
> }
note: argument todo unused
note: argument grad unused
note: argument hess unused

: y = st_data(., "`depvar'")

: X = st_data(., "`indepvars'")

: X = X,J(rows(X), 1, 1)

: mo = st_matrix("bto")

: ko = sum(mo)

: p  = cols(mo)

: if (ko>0) {
>     Ct = J(0, p, .)
>     for(j=1; j<=p; j++) {
>         if (mo[j]==1) {
>             Ct = Ct \ e(j, p)
>         }
>     }
>     Ct = Ct, J(ko, 1, 0)
> }
> else {
>     Ct = J(0, p+1, .)
> }

: S = optimize_init()

: optimize_init_argument(S, 1, y)

: optimize_init_argument(S, 2, X)

: optimize_init_evaluator(S, &plleval3())

: optimize_init_evaluatortype(S, "gf0")

: optimize_init_params(S, J(1, 7, .01))

: optimize_init_constraints(S, Ct)

: bh = optimize(S)
Iteration 0:   f(p) = -845.47138
Iteration 1:   f(p) = -572.68676
Iteration 2:   f(p) = -545.68381
Iteration 3:   f(p) = -545.11241
Iteration 4:   f(p) = -545.10898
Iteration 5:   f(p) = -545.10898

: optimize_result_params(S)
                 1              2              3              4
    +-------------------------------------------------------------
  1 |  -.6582923624    3.233932519    1.571581623     1.65924145
    +-------------------------------------------------------------
                 5              6              7
     ----------------------------------------------+
  1              0      .13839766   -2.518174926    |
     ----------------------------------------------+

: end
--------------------------------------------------------------------------------

.
end of do-file
```

**Done and undone**

I discussed a method for handling factor variables when performing nonlinear optimization using **optimize()**. In my next post, I implement these methods in an estimation command for Poisson regression.