Home > Statistics > Doctors versus policy analysts: Estimating the effect of interest

Doctors versus policy analysts: Estimating the effect of interest

\(\newcommand{\Eb}{{\bf E}}\)The change in a regression function that results from an everything-else-held-equal change in a covariate defines an effect of a covariate. I am interested in estimating and interpreting effects that are conditional on the covariates and averages of effects that vary over the individuals. I illustrate that these two types of effects answer different questions. Doctors, parents, and consultants frequently ask individuals for their covariate values to make individual-specific recommendations. Policy analysts use a population-averaged effect that accounts for the variation of the effects over the individuals.

Conditional on covariate effects after regress

I have simulated data on a college-success index (csuccess) on 1,000 students that entered an imaginary university in the same year. Before starting his or her first year, each student took a short course that taught study techniques and new material; iexam records each student grade on the final for this course. I am interested in the effect of the iexam score on the mean of csuccess when I also condition on high-school grade-point average hgpa and SAT score sat. I include an interaction term, it=iexam/(hgpa^2), in the regression to allow for the possibility that iexam has a smaller effect for students with a higher hgpa.

The regression below estimates the parameters of the conditional mean function that gives the mean of csuccess as a linear function of hgpa, sat, and iexam.

Example 1: mean of csuccess given hgpa, sat, and iexam

. regress csuccess hgpa sat iexam it, vce(robust)

Linear regression                             Number of obs     =      1,000
                                              F(4, 995)         =     384.34
                                              Prob > F          =     0.0000
                                              R-squared         =     0.5843
                                              Root MSE          =     1.3737

----------------------------------------------------------------------------
           |               Robust
  csuccess |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------+----------------------------------------------------------------
      hgpa |   .7030099    .178294     3.94   0.000     .3531344    1.052885
       sat |   1.011056   .0514416    19.65   0.000     .9101095    1.112002
     iexam |   .1779532   .0715848     2.49   0.013     .0374788    .3184276
        it |   5.450188   .3731664    14.61   0.000     4.717904    6.182471
     _cons |  -1.434994   1.059799    -1.35   0.176    -3.514692     .644704
----------------------------------------------------------------------------

The estimates imply that

\begin{align*}
\widehat{\Eb}[{\bf csuccess}| {\bf hgpa}, {\bf sat}, {\bf iexam}]
&=.70{\bf hgpa} + 1.01 {\bf sat} + 0.18 {\bf iexam} \\
&\quad + 5.45 {\bf iexam}/{(\bf hgpa^2)} – 1.43
\end{align*}

where \(\widehat{\Eb}[{\bf csuccess}| {\bf hgpa}, {\bf sat}, {\bf iexam}]\) denotes the estimated conditional mean function.

Because sat is measured in hundreds of points, the effect of a 100-point increase in sat is estimated to be

\begin{align*}
\widehat{\Eb}[{\bf csuccess}&| {\bf hgpa}, ({\bf sat}+1), {\bf iexam}]

\widehat{\Eb}[{\bf csuccess}| {\bf hgpa}, {\bf sat}, {\bf iexam}]\\
&=.70{\bf hgpa} + 1.01 ({\bf sat}+1) + 0.18 {\bf iexam} + 5.45 {\bf iexam}/{\bf hgpa^2} – 1.43 \\
&\hspace{1cm}- \left[.70{\bf hgpa} + 1.01 {\bf sat} + 0.18 {\bf iexam} + 5.45 {\bf iexam}/{\bf
hgpa^2} – 1.43 \right] \\
& = 1.01
\end{align*}

Note that the estimated effect of a 100-point increase in sat is a constant. The effect is also large, because the success index has a mean of 20.76 and a variance of 4.52; see example 2.

Example 2: Marginal distribution of college-success index

. summarize csuccess, detail

                          csuccess
-------------------------------------------------------------
      Percentiles      Smallest
 1%     16.93975       16.16835
 5%     17.71202       16.36104
10%     18.19191       16.53484       Obs               1,000
25%     19.25535        16.5457       Sum of Wgt.       1,000

50%     20.55144                      Mean           20.76273
                        Largest       Std. Dev.      2.126353
75%     21.98584       27.21029
90%     23.53014       27.33765       Variance       4.521379
95%     24.99978       27.78259       Skewness       .6362449
99%     26.71183       28.43473       Kurtosis        3.32826

Because iexam is measured in tens of points, the effect of a 10-point increase in the iexam is estimated to be

\begin{align*}
\widehat{\Eb}[{\bf csuccess}&| {\bf hgpa}, {\bf sat}, ({\bf iexam}+1)]

\widehat{\Eb}[{\bf csuccess}| {\bf hgpa}, {\bf sat}, {\bf iexam}] \\
& =.70{\bf hgpa} + 1.01 {\bf sat} + 0.18 ({\bf iexam}+1) + 5.45 ({\bf iexam}+1)/{(\bf hgpa^2)} – 1.43 \\
&\hspace{1cm}
-\left[.70{\bf hgpa} + 1.01 {\bf sat} + 0.18 {\bf iexam} + 5.45 {\bf iexam})/{(\bf hgpa^2)} – 1.43 \right] \\
& = .18 + 5.45 /{\bf hgpa^2}
\end{align*}

The effect varies with a student’s high-school grade-point average, so the conditional-on-covariate interpretation differs from the population-averaged interpretation. For example, suppose that I am a counselor who believes that only increases of 0.7 or more in csuccess matter, and a student with an hgpa of 4.0 asks me if a 10-point increase on the iexam will significantly affect his or her college success.

After using margins in example 3 to estimate the effect of a 10-point increase in iexam for someone with an hgpa=40, I tell the student “probably not”. (The estimated effect is 0.52, and the estimated upper bound of the 95% confidence interval is 0.64.)

Example 3: The effect of a 10-point increase in iexam when hgpa=4

. margins, expression(_b[iexam] + _b[it]/(hgpa^2)) at(hgpa=4)
Warning: expression() does not contain predict() or xb().

Predictive margins                            Number of obs     =      1,000
Model VCE    : Robust

Expression   : _b[iexam] + _b[it]/(hgpa^2)
at           : hgpa            =           4

----------------------------------------------------------------------------
           |            Delta-method
           |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------+----------------------------------------------------------------
     _cons |     .51859   .0621809     8.34   0.000     .3967176    .6404623
----------------------------------------------------------------------------

After the student leaves, I run example 4 to estimate the effect of a 10-point increase in iexam when hgpa is 2, 2.5, 3, 3.5, and 4.

Example 4: The effect of a 10-point increase in iexam when hgpa is 2, 2.5, 3, 3.5, and 4

. margins, expression(_b[iexam] + _b[it]/(hgpa^2)) at(hgpa=(2 2.5 3 3.5 4))
Warning: expression() does not contain predict() or xb().

Predictive margins                            Number of obs     =      1,000
Model VCE    : Robust

Expression   : _b[iexam] + _b[it]/(hgpa^2)

1._at        : hgpa            =           2

2._at        : hgpa            =         2.5

3._at        : hgpa            =           3

4._at        : hgpa            =         3.5

5._at        : hgpa            =           4

----------------------------------------------------------------------------
           |            Delta-method
           |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------+----------------------------------------------------------------
       _at |
        1  |     1.5405   .0813648    18.93   0.000     1.381028    1.699972
        2  |   1.049983   .0638473    16.45   0.000     .9248449    1.175122
        3  |   .7835297   .0603343    12.99   0.000     .6652765    .9017828
        4  |   .6228665   .0608185    10.24   0.000     .5036645    .7420685
        5  |     .51859   .0621809     8.34   0.000     .3967176    .6404623
----------------------------------------------------------------------------

I use marginsplot to further clarify these results.

Example 5: marginsplot

. marginsplot, yline(.7) ylabel(.5 .7 1 1.5 2)

  Variables that uniquely identify margins: hgpa

graph1

I could not rule out the possibility that a 10-point increase in iexam would cause an increase of 0.7 in the average csuccess for a student with an hgpa of 3.5.

Consider the case in which \(\Eb[y|x,{\bf z}]\) is my regression model for the outcome \(y\) as a function of \(x\), whose effect I want to estimate, and \({\bf z}\), which are other variables on which I condition. The regression function \(\Eb[y|x,{\bf z}]\) tells me the mean of \(y\) for given values of \(x\) and \({\bf z}\).

The difference between the mean of \(y\) given \(x_1\) and \({\bf z}\) and the mean of \(y\) given \(x_0\) and \({\bf z}\) is an effect of \(x\), and it is given by \(\Eb[y|x=x_1,{\bf z}] – \Eb[y|x=x_0,{\bf z}]\). This effect can vary with \({\bf z}\); it might be scientifically and statistically significant for some values of \({\bf z}\) and not for others.

Under the usual assumption of correct specification, I can estimate the parameters of \(\Eb[y|x,{\bf z}]\) using regress or another command. I can then use margins and marginsplot to estimate effects of \(x\). (I also frequently use lincom, nlcom, and predictnl to estimate effects of \(x\) for given \({\bf z}\) values.)

Population-averaged effects after regress

Returning to the example, instead of being a counselor, suppose that I am a university administrator who believes that assigning enough tutors to the course will raise each student’s iexam score by 10 points. I begin by using margins to estimate the average college-success score that is observed when each student gets his or her current iexam score and to estimate the average college-success score that would be observed when each student gets an extra 10 points on his or her iexam score.

Example 5: The average of csuccess with current iexam scores and when each student gets an extra 10 points

. margins, at(iexam = generate(iexam))   
>         at(iexam = generate(iexam+1) it = generate((iexam+1)/(hgpa^2)))

Predictive margins                            Number of obs     =      1,000
Model VCE    : Robust

Expression   : Linear prediction, predict()

1._at        : iexam           = iexam

2._at        : iexam           = iexam+1
               it              = (iexam+1)/(hgpa^2)

----------------------------------------------------------------------------
           |            Delta-method
           |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------+----------------------------------------------------------------
       _at |
        1  |   20.76273   .0434416   477.95   0.000     20.67748    20.84798
        2  |   21.48141   .0744306   288.61   0.000     21.33535    21.62747
----------------------------------------------------------------------------

Just to make sure that I understand what margins is doing, I compute the average of the predicted values when each student gets his or her current iexam score and when each student gets an extra 10 points on his or her iexam score.

Example 6: The average of csuccess with current iexam scores and when each student gets an extra 10 points (hand calculations)

. preserve

. predict double yhat0
(option xb assumed; fitted values)

. replace iexam       = iexam + 1
(1,000 real changes made)

. replace it          = (iexam)/(hgpa^2)
(1,000 real changes made)

. predict double yhat1
(option xb assumed; fitted values)

. summarize yhat0 yhat1

    Variable |      Obs        Mean    Std. Dev.       Min        Max
-------------+-------------------------------------------------------
       yhat0 |    1,000    20.76273    1.625351   17.33157   26.56351
       yhat1 |    1,000    21.48141    1.798292   17.82295   27.76324

. restore

As expected, the average of the predictions for yhat0 match those reported by margins for _at.1, and the average of the predictions for yhat1 match those reported by margins for _at.2.

Now that I understand what margins is doing, I use the contrast option to estimate the difference between the average of csuccess when each student gets an extra 10 points and the average of csuccess when each student gets his or her original score.

Example 7: The difference in the averages of csuccess when each student gets an extra 10 points and with current scores

. margins, at(iexam = generate(iexam))   
>         at(iexam = generate(iexam+1) it = generate((iexam+1)/(hgpa^2))) 
>         contrast(atcontrast(r._at) nowald)

Contrasts of predictive margins
Model VCE    : Robust

Expression   : Linear prediction, predict()

1._at        : iexam           = iexam

2._at        : iexam           = iexam+1
               it              = (iexam+1)/(hgpa^2)

--------------------------------------------------------------
             |            Delta-method
             |   Contrast   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
         _at |
   (2 vs 1)  |   .7186786   .0602891      .6003702     .836987
--------------------------------------------------------------

The standard error in example 7 is labeled as “Delta-method”, which means that it takes the covariate observations as fixed and accounts for the parameter estimation error. Holding the covariate observations as fixed gets me inference for this particular batch of students. I add the option vce(unconditional) in example 8, because I want inference for the population from which I can repeatedly draw samples of students.

Example 8: The difference in the averages of csuccess with an unconditional standard error

. margins, at(iexam = generate(iexam))   
>         at(iexam = generate(iexam+1) it = generate((iexam+1)/(hgpa^2))) 
>         contrast(atcontrast(r._at) nowald) vce(unconditional)

Contrasts of predictive margins

Expression   : Linear prediction, predict()

1._at        : iexam           = iexam

2._at        : iexam           = iexam+1
               it              = (iexam+1)/(hgpa^2)

--------------------------------------------------------------
             |            Unconditional
             |   Contrast   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
         _at |
   (2 vs 1)  |   .7186786   .0609148      .5991425    .8382148
--------------------------------------------------------------

In this case, the standard error for the sample effect reported in example 7 is about the same as the standard error for the population effect reported in example 8. With real data, the difference in these standard errors tends to be greater.

Recall the case in which \(\Eb[y|x,{\bf z}]\) is my regression model for the outcome \(y\) as a function of \(x\), whose effect I want to estimate, and \({\bf z}\), which are other variables on which I condition. The difference between the mean of \(y\) given \(x_1\) and the mean of \(y\) given \(x_0\) is an effect of \(x\) that has been averaged over the distribution of \({\bf z}\),

\[
\Eb[y|x=x_1] – \Eb[y|x=x_0] = \Eb_{\bf Z}\left[ \Eb[y|x=x_1,{\bf z}]\right] –
\Eb_{\bf Z}\left[ \Eb[y|x=x_0,{\bf z}]\right]
\]

Under the usual assumptions of correct specification, I can estimate the parameters of \(\Eb[y|x,{\bf z}]\) using regress or another command. I can then use margins and marginsplot to estimate a mean of these effects of \(x\). The sample must be representative, perhaps after weighting, in order for the estimated mean of the effects to converge to a population mean.

Done and undone

The change in a regression function that results from an everything-else-held-equal change in a covariate defines an effect of a covariate. I illustrated that when a covariate enters the regression function nonlinearly, the effect varies over covariate values, causing the conditional-on-covariate effect to differ from the population-averaged effect. I also showed how to estimate and interpret these conditional-on-covariate and population-averaged effects.