## Doctors versus policy analysts: Estimating the effect of interest

\(\newcommand{\Eb}{{\bf E}}\)The change in a regression function that results from an everything-else-held-equal change in a covariate defines an effect of a covariate. I am interested in estimating and interpreting effects that are conditional on the covariates and averages of effects that vary over the individuals. I illustrate that these two types of effects answer different questions. Doctors, parents, and consultants frequently ask individuals for their covariate values to make individual-specific recommendations. Policy analysts use a population-averaged effect that accounts for the variation of the effects over the individuals.

__Conditional on covariate effects after regress__

I have simulated data on a college-success index (**csuccess**) on 1,000 students that entered an imaginary university in the same year. Before starting his or her first year, each student took a short course that taught study techniques and new material; **iexam** records each student grade on the final for this course. I am interested in the effect of the **iexam** score on the mean of **csuccess** when I also condition on high-school grade-point average **hgpa** and SAT score **sat**. I include an interaction term, **it=iexam/(hgpa^2)**, in the regression to allow for the possibility that **iexam** has a smaller effect for students with a higher **hgpa**.

The regression below estimates the parameters of the conditional mean function that gives the mean of **csuccess** as a linear function of **hgpa**, **sat**, and **iexam**.

**Example 1: mean of csuccess given hgpa, sat, and iexam**

. regress csuccess hgpa sat iexam it, vce(robust) Linear regression Number of obs = 1,000 F(4, 995) = 384.34 Prob > F = 0.0000 R-squared = 0.5843 Root MSE = 1.3737 ---------------------------------------------------------------------------- | Robust csuccess | Coef. Std. Err. t P>|t| [95% Conf. Interval] -----------+---------------------------------------------------------------- hgpa | .7030099 .178294 3.94 0.000 .3531344 1.052885 sat | 1.011056 .0514416 19.65 0.000 .9101095 1.112002 iexam | .1779532 .0715848 2.49 0.013 .0374788 .3184276 it | 5.450188 .3731664 14.61 0.000 4.717904 6.182471 _cons | -1.434994 1.059799 -1.35 0.176 -3.514692 .644704 ----------------------------------------------------------------------------

The estimates imply that

\begin{align*}

\widehat{\Eb}[{\bf csuccess}| {\bf hgpa}, {\bf sat}, {\bf iexam}]

&=.70{\bf hgpa} + 1.01 {\bf sat} + 0.18 {\bf iexam} \\

&\quad + 5.45 {\bf iexam}/{(\bf hgpa^2)} – 1.43

\end{align*}

where \(\widehat{\Eb}[{\bf csuccess}| {\bf hgpa}, {\bf sat}, {\bf iexam}]\) denotes the estimated conditional mean function.

Because **sat** is measured in hundreds of points, the effect of a 100-point increase in **sat** is estimated to be

\begin{align*}

\widehat{\Eb}[{\bf csuccess}&| {\bf hgpa}, ({\bf sat}+1), {\bf iexam}]

–

\widehat{\Eb}[{\bf csuccess}| {\bf hgpa}, {\bf sat}, {\bf iexam}]\\

&=.70{\bf hgpa} + 1.01 ({\bf sat}+1) + 0.18 {\bf iexam} + 5.45 {\bf iexam}/{\bf hgpa^2} – 1.43 \\

&\hspace{1cm}- \left[.70{\bf hgpa} + 1.01 {\bf sat} + 0.18 {\bf iexam} + 5.45 {\bf iexam}/{\bf

hgpa^2} – 1.43 \right] \\

& = 1.01

\end{align*}

Note that the estimated effect of a 100-point increase in **sat** is a constant. The effect is also large, because the success index has a mean of 20.76 and a variance of 4.52; see example 2.

**Example 2: Marginal distribution of college-success index**

. summarize csuccess, detail csuccess ------------------------------------------------------------- Percentiles Smallest 1% 16.93975 16.16835 5% 17.71202 16.36104 10% 18.19191 16.53484 Obs 1,000 25% 19.25535 16.5457 Sum of Wgt. 1,000 50% 20.55144 Mean 20.76273 Largest Std. Dev. 2.126353 75% 21.98584 27.21029 90% 23.53014 27.33765 Variance 4.521379 95% 24.99978 27.78259 Skewness .6362449 99% 26.71183 28.43473 Kurtosis 3.32826

Because **iexam** is measured in tens of points, the effect of a 10-point increase in the **iexam** is estimated to be

\begin{align*}

\widehat{\Eb}[{\bf csuccess}&| {\bf hgpa}, {\bf sat}, ({\bf iexam}+1)]

–

\widehat{\Eb}[{\bf csuccess}| {\bf hgpa}, {\bf sat}, {\bf iexam}] \\

& =.70{\bf hgpa} + 1.01 {\bf sat} + 0.18 ({\bf iexam}+1) + 5.45 ({\bf iexam}+1)/{(\bf hgpa^2)} – 1.43 \\

&\hspace{1cm}

-\left[.70{\bf hgpa} + 1.01 {\bf sat} + 0.18 {\bf iexam} + 5.45 {\bf iexam})/{(\bf hgpa^2)} – 1.43 \right] \\

& = .18 + 5.45 /{\bf hgpa^2}

\end{align*}

The effect varies with a student’s high-school grade-point average, so the conditional-on-covariate interpretation differs from the population-averaged interpretation. For example, suppose that I am a counselor who believes that only increases of 0.7 or more in **csuccess** matter, and a student with an **hgpa** of 4.0 asks me if a 10-point increase on the **iexam** will significantly affect his or her college success.

After using **margins** in example 3 to estimate the effect of a 10-point increase in **iexam** for someone with an **hgpa=40**, I tell the student “probably not”. (The estimated effect is 0.52, and the estimated upper bound of the 95% confidence interval is 0.64.)

**Example 3: The effect of a 10-point increase in iexam when hgpa=4**

. margins, expression(_b[iexam] + _b[it]/(hgpa^2)) at(hgpa=4) Warning: expression() does not contain predict() or xb(). Predictive margins Number of obs = 1,000 Model VCE : Robust Expression : _b[iexam] + _b[it]/(hgpa^2) at : hgpa = 4 ---------------------------------------------------------------------------- | Delta-method | Margin Std. Err. z P>|z| [95% Conf. Interval] -----------+---------------------------------------------------------------- _cons | .51859 .0621809 8.34 0.000 .3967176 .6404623 ----------------------------------------------------------------------------

After the student leaves, I run example 4 to estimate the effect of a 10-point increase in **iexam** when **hgpa** is 2, 2.5, 3, 3.5, and 4.

**Example 4: The effect of a 10-point increase in iexam when hgpa is 2, 2.5, 3, 3.5, and 4**

. margins, expression(_b[iexam] + _b[it]/(hgpa^2)) at(hgpa=(2 2.5 3 3.5 4)) Warning: expression() does not contain predict() or xb(). Predictive margins Number of obs = 1,000 Model VCE : Robust Expression : _b[iexam] + _b[it]/(hgpa^2) 1._at : hgpa = 2 2._at : hgpa = 2.5 3._at : hgpa = 3 4._at : hgpa = 3.5 5._at : hgpa = 4 ---------------------------------------------------------------------------- | Delta-method | Margin Std. Err. z P>|z| [95% Conf. Interval] -----------+---------------------------------------------------------------- _at | 1 | 1.5405 .0813648 18.93 0.000 1.381028 1.699972 2 | 1.049983 .0638473 16.45 0.000 .9248449 1.175122 3 | .7835297 .0603343 12.99 0.000 .6652765 .9017828 4 | .6228665 .0608185 10.24 0.000 .5036645 .7420685 5 | .51859 .0621809 8.34 0.000 .3967176 .6404623 ----------------------------------------------------------------------------

I use **marginsplot** to further clarify these results.

**Example 5: marginsplot**

. marginsplot, yline(.7) ylabel(.5 .7 1 1.5 2) Variables that uniquely identify margins: hgpa

I could not rule out the possibility that a 10-point increase in **iexam** would cause an increase of 0.7 in the average **csuccess** for a student with an **hgpa** of 3.5.

Consider the case in which \(\Eb[y|x,{\bf z}]\) is my regression model for the outcome \(y\) as a function of \(x\), whose effect I want to estimate, and \({\bf z}\), which are other variables on which I condition. The regression function \(\Eb[y|x,{\bf z}]\) tells me the mean of \(y\) for given values of \(x\) and \({\bf z}\).

The difference between the mean of \(y\) given \(x_1\) and \({\bf z}\) and the mean of \(y\) given \(x_0\) and \({\bf z}\) is an effect of \(x\), and it is given by \(\Eb[y|x=x_1,{\bf z}] – \Eb[y|x=x_0,{\bf z}]\). This effect can vary with \({\bf z}\); it might be scientifically and statistically significant for some values of \({\bf z}\) and not for others.

Under the usual assumption of correct specification, I can estimate the parameters of \(\Eb[y|x,{\bf z}]\) using **regress** or another command. I can then use **margins** and **marginsplot** to estimate effects of \(x\). (I also frequently use **lincom**, **nlcom**, and **predictnl** to estimate effects of \(x\) for given \({\bf z}\) values.)

**Population-averaged effects after regress**

Returning to the example, instead of being a counselor, suppose that I am a university administrator who believes that assigning enough tutors to the course will raise each student’s **iexam** score by 10 points. I begin by using **margins** to estimate the average college-success score that is observed when each student gets his or her current **iexam** score and to estimate the average college-success score that would be observed when each student gets an extra 10 points on his or her **iexam** score.

**Example 5: The average of csuccess with current iexam scores and when each student gets an extra 10 points**

. margins, at(iexam = generate(iexam)) > at(iexam = generate(iexam+1) it = generate((iexam+1)/(hgpa^2))) Predictive margins Number of obs = 1,000 Model VCE : Robust Expression : Linear prediction, predict() 1._at : iexam = iexam 2._at : iexam = iexam+1 it = (iexam+1)/(hgpa^2) ---------------------------------------------------------------------------- | Delta-method | Margin Std. Err. t P>|t| [95% Conf. Interval] -----------+---------------------------------------------------------------- _at | 1 | 20.76273 .0434416 477.95 0.000 20.67748 20.84798 2 | 21.48141 .0744306 288.61 0.000 21.33535 21.62747 ----------------------------------------------------------------------------

Just to make sure that I understand what **margins** is doing, I compute the average of the predicted values when each student gets his or her current **iexam** score and when each student gets an extra 10 points on his or her **iexam** score.

**Example 6: The average of csuccess with current iexam scores and when each student gets an extra 10 points (hand calculations)**

. preserve . predict double yhat0 (option xb assumed; fitted values) . replace iexam = iexam + 1 (1,000 real changes made) . replace it = (iexam)/(hgpa^2) (1,000 real changes made) . predict double yhat1 (option xb assumed; fitted values) . summarize yhat0 yhat1 Variable | Obs Mean Std. Dev. Min Max -------------+------------------------------------------------------- yhat0 | 1,000 20.76273 1.625351 17.33157 26.56351 yhat1 | 1,000 21.48141 1.798292 17.82295 27.76324 . restore

As expected, the average of the predictions for **yhat0** match those reported by **margins** for **_at.1**, and the average of the predictions for **yhat1** match those reported by **margins** for **_at.2**.

Now that I understand what **margins** is doing, I use the **contrast** option to estimate the difference between the average of **csuccess** when each student gets an extra 10 points and the average of **csuccess** when each student gets his or her original score.

**Example 7: The difference in the averages of csuccess when each student gets an extra 10 points and with current scores**

. margins, at(iexam = generate(iexam)) > at(iexam = generate(iexam+1) it = generate((iexam+1)/(hgpa^2))) > contrast(atcontrast(r._at) nowald) Contrasts of predictive margins Model VCE : Robust Expression : Linear prediction, predict() 1._at : iexam = iexam 2._at : iexam = iexam+1 it = (iexam+1)/(hgpa^2) -------------------------------------------------------------- | Delta-method | Contrast Std. Err. [95% Conf. Interval] -------------+------------------------------------------------ _at | (2 vs 1) | .7186786 .0602891 .6003702 .836987 --------------------------------------------------------------

The standard error in example 7 is labeled as “Delta-method”, which means that it takes the covariate observations as fixed and accounts for the parameter estimation error. Holding the covariate observations as fixed gets me inference for this particular batch of students. I add the option **vce(unconditional)** in example 8, because I want inference for the population from which I can repeatedly draw samples of students.

**Example 8: The difference in the averages of csuccess with an unconditional standard error**

. margins, at(iexam = generate(iexam)) > at(iexam = generate(iexam+1) it = generate((iexam+1)/(hgpa^2))) > contrast(atcontrast(r._at) nowald) vce(unconditional) Contrasts of predictive margins Expression : Linear prediction, predict() 1._at : iexam = iexam 2._at : iexam = iexam+1 it = (iexam+1)/(hgpa^2) -------------------------------------------------------------- | Unconditional | Contrast Std. Err. [95% Conf. Interval] -------------+------------------------------------------------ _at | (2 vs 1) | .7186786 .0609148 .5991425 .8382148 --------------------------------------------------------------

In this case, the standard error for the sample effect reported in example 7 is about the same as the standard error for the population effect reported in example 8. With real data, the difference in these standard errors tends to be greater.

Recall the case in which \(\Eb[y|x,{\bf z}]\) is my regression model for the outcome \(y\) as a function of \(x\), whose effect I want to estimate, and \({\bf z}\), which are other variables on which I condition. The difference between the mean of \(y\) given \(x_1\) and the mean of \(y\) given \(x_0\) is an effect of \(x\) that has been averaged over the distribution of \({\bf z}\),

\[

\Eb[y|x=x_1] – \Eb[y|x=x_0] = \Eb_{\bf Z}\left[ \Eb[y|x=x_1,{\bf z}]\right] –

\Eb_{\bf Z}\left[ \Eb[y|x=x_0,{\bf z}]\right]

\]

Under the usual assumptions of correct specification, I can estimate the parameters of \(\Eb[y|x,{\bf z}]\) using **regress** or another command. I can then use **margins** and **marginsplot** to estimate a mean of these effects of \(x\). The sample must be representative, perhaps after weighting, in order for the estimated mean of the effects to converge to a population mean.

__Done and undone__

The change in a regression function that results from an everything-else-held-equal change in a covariate defines an effect of a covariate. I illustrated that when a covariate enters the regression function nonlinearly, the effect varies over covariate values, causing the conditional-on-covariate effect to differ from the population-averaged effect. I also showed how to estimate and interpret these conditional-on-covariate and population-averaged effects.