## Exact matching on discrete covariates is the same as regression adjustment

I illustrate that exact matching on discrete covariates and regression adjustment (RA) with fully interacted discrete covariates perform the same nonparametric estimation.

**Comparing exact matching with RA**

A well-known example from the causal inference literature estimates the average treatment effect (ATE) of a mother's smoking during pregnancy on her baby's birth weight. Cattaneo (2010) discusses this example, and I use an extract of his data. (My extract is not representative, and the results below only illustrate the methods I discuss.) See Wooldridge (2010, chap. 21) for an introduction to estimating an ATE.

Each baby's birth weight is recorded in **bweight**. **mbsmoke** is the binary treatment indicating whether the mother smoked while she was pregnant. I also control for the mother's education (**medu**), a binary indicator for whether this was her first baby (**fbaby**), and a binary indicator for whether she was married (**mmarried**).

As is frequently the case, one of my control variables has too many categories for exact matching or for inclusion as a categorical variable in a fully interacted regression. In example 1, I impose a priori knowledge that allows me to combine 0–8 years of schooling into the “before HS” category, 9–11 years into “in HS”, 12 into “HS”, and more than 12 into “HS+”, where HS stands for high school.

**Example 1: Cutting medu into four categories**

```
. generate medu2 = irecode(medu, 8, 11, 12)

. label define ed2l 0 "before HS" 1 "in HS" 2 "HS" 3 "HS+"

. label values medu2 ed2l
```
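The cut points passed to `irecode()` can be checked against explicit comparisons. The following is a minimal sketch, not part of the original analysis: `medu2_check` is a hypothetical variable name, and the `///` line continuations assume the commands are run from a do-file. It relies on `irecode(x, 8, 11, 12)` returning 0 when x ≤ 8, 1 when 8 < x ≤ 11, 2 when 11 < x ≤ 12, and 3 otherwise.

```stata
* Rebuild the four education categories with nested cond() calls.
* medu2_check is a hypothetical name used only for this check.
generate medu2_check = cond(medu <= 8,  0,     ///
                       cond(medu <= 11, 1,     ///
                       cond(medu <= 12, 2, 3))) if !missing(medu)

* assert stops with an error if the two codings ever disagree
assert medu2 == medu2_check

* Cross-tabulate years of schooling against the new categories
tabulate medu medu2, missing
```

The cross-tabulation makes it easy to see which years of schooling fall into each category.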

Exact matching requires that none of the cells formed by the treatment variable and the values for the discrete variables be empty. In example 2, I create **case**, which enumerates the set of possible covariate values, and then tabulate **case** over the treatment levels.

**Example 2: Tabulating covariate patterns by treatment level**

```
. egen case = group(medu2 fbaby mmarried) , label

. tab case mbsmoke

    group(medu2 fbaby |  1 if mother smoked
            mmarried) | nonsmoker     smoker |     Total
----------------------+----------------------+----------
before HS No notmarri |        29         18 |        47
 before HS No married |        63          4 |        67
before HS Yes notmarr |        29         12 |        41
before HS Yes married |        17          3 |        20
  in HS No notmarried |       106        103 |       209
     in HS No married |        76         53 |       129
 in HS Yes notmarried |       173         62 |       235
    in HS Yes married |        28         18 |        46
     HS No notmarried |       197        119 |       316
        HS No married |       706        163 |       869
    HS Yes notmarried |       233         90 |       323
       HS Yes married |       502         69 |       571
    HS+ No notmarried |        77         25 |       102
       HS+ No married |       812         58 |       870
   HS+ Yes notmarried |        95         26 |       121
      HS+ Yes married |       635         41 |       676
----------------------+----------------------+----------
                Total |     3,778        864 |     4,642
```
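The no-empty-cells requirement can also be checked programmatically rather than by reading the table. This is a minimal sketch under stated assumptions: `n_treated` and `n_untreated` are hypothetical variable names, and **mbsmoke** is assumed to be coded 1 for smokers and 0 for nonsmokers.

```stata
* Count treated and not-treated observations within each covariate pattern.
* n_treated and n_untreated are hypothetical names; mbsmoke is assumed 0/1.
bysort case: egen n_treated   = total(mbsmoke == 1)
bysort case: egen n_untreated = total(mbsmoke == 0)

* assert stops with an error if any pattern lacks one of the two groups
assert n_treated > 0 & n_untreated > 0

* The minimums show how thin the sparsest cells are
summarize n_treated n_untreated
```

A covariate pattern with no observations at all never appears in **case**, so this check only covers patterns that occur in the data.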

Some further consolidation might be required, because so few smokers with “before HS” education were married. There are only 4 treated cases with “before HS” education, not first baby, and married; there are only 3 treated cases with “before HS” education, first baby, and married. As I discuss in **Done and undone**, how I combine the categories is critical to obtaining consistent estimates. For this example, I leave the categories as previously defined and proceed to estimate the ATE by matching exactly on the covariates.

**Example 3: ATE estimated by exact matching on discrete covariates**

```
. teffects nnmatch (bweight ) (mbsmoke), ematch(medu2 fbaby mmarried)

Treatment-effects estimation                   Number of obs      =      4,642
Estimator      : nearest-neighbor matching     Matches: requested =          1
Outcome model  : matching                                     min =          3
Distance metric: Mahalanobis                                  max =        812
------------------------------------------------------------------------------
             |              AI Robust
     bweight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ATE          |
     mbsmoke |
    (smoker  |
         vs  |
 nonsmoker)  |  -227.3809   26.99005    -8.42   0.000    -280.2804   -174.4813
------------------------------------------------------------------------------
```

Exact matching with replacement compares each treated case with the mean of the not-treated cases with the same covariate pattern, and it compares each not-treated case with the mean of the treated cases with the same covariate pattern. The mean of the case-level comparisons estimates the ATE.

RA estimates the ATE by the difference between the sample averages of the outcomes predicted under treatment and under no treatment, with each average taken over all cases. With fully interacted discrete covariates, the predicted values are the outcome averages within each covariate pattern and treatment level.
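A few lines of algebra show why the two descriptions lead to the same number. The notation here is introduced only for this sketch and does not appear elsewhere in the post: $c$ indexes the covariate patterns, $t_i$ is the treatment indicator, $N_{1c}$ and $N_{0c}$ are the numbers of treated and not-treated cases with pattern $c$, $N_c = N_{1c} + N_{0c}$, and $\bar{y}_{1c}$ and $\bar{y}_{0c}$ are the corresponding mean outcomes. Summing the case-level matching comparisons within pattern $c$ gives

$$
\sum_{i \in c,\; t_i = 1} \left( y_i - \bar{y}_{0c} \right)
+ \sum_{i \in c,\; t_i = 0} \left( \bar{y}_{1c} - y_i \right)
= N_{1c}\left( \bar{y}_{1c} - \bar{y}_{0c} \right) + N_{0c}\left( \bar{y}_{1c} - \bar{y}_{0c} \right)
= N_c \left( \bar{y}_{1c} - \bar{y}_{0c} \right),
$$

so averaging over all $N$ cases yields

$$
\widehat{\mathrm{ATE}}_{\text{match}} = \frac{1}{N} \sum_{c} N_c \left( \bar{y}_{1c} - \bar{y}_{0c} \right).
$$

RA with fully interacted discrete covariates predicts $\bar{y}_{1c}$ and $\bar{y}_{0c}$ for every case with pattern $c$, so the average of the predicted differences over all cases is the same weighted sum.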

Example 4 illustrates that exact matching with replacement produces the same point estimates as RA with fully interacted discrete covariates.

**Example 4: ATE estimated by RA on discrete covariates**

```
. regress bweight ibn.mbsmoke#ibn.case,
>     noconstant vce(robust) vsquish

Linear regression                               Number of obs     =      4,642
                                                F(32, 4610)       =    5472.14
                                                Prob > F          =     0.0000
                                                R-squared         =     0.9731
                                                Root MSE          =     561.89

-------------------------------------------------------------------------------
              |               Robust
      bweight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
 mbsmoke#case |
   nonsmoker #|
before HS ..  |   3412.345   85.26789    40.02   0.000     3245.179    3579.511
   nonsmoker #|
before HS ..  |   3382.048   64.77681    52.21   0.000     3255.054    3509.041
   nonsmoker #|
before HS ..  |   3095.897   121.4719    25.49   0.000     2857.753     3334.04
   nonsmoker #|
before HS ..  |   3213.588   108.5406    29.61   0.000     3000.797     3426.38
   nonsmoker #|
in HS No n..  |   3219.255    66.9732    48.07   0.000     3087.955    3350.554
   nonsmoker #|
in HS No m..  |   3454.434   57.21777    60.37   0.000      3342.26    3566.608
   nonsmoker #|
in HS Yes ..  |   3227.977   49.20252    65.61   0.000     3131.516    3324.437
   nonsmoker #|
in HS Yes ..  |   3467.286   95.52026    36.30   0.000      3280.02    3654.551
   nonsmoker #|
HS No notm..  |   3327.249   45.20513    73.60   0.000     3238.625    3415.872
   nonsmoker #|
HS No marr..  |   3498.307   20.41325   171.37   0.000     3458.288    3538.327
   nonsmoker #|
HS Yes not..  |   3258.069   38.79208    83.99   0.000     3182.018     3334.12
   nonsmoker #|
HS Yes mar..  |   3382.054   24.69261   136.97   0.000     3333.644    3430.463
   nonsmoker #|
HS+ No not..  |   3227.597   80.73945    39.98   0.000     3069.309    3385.885
   nonsmoker #|
HS+ No mar..  |   3514.036   18.78391   187.08   0.000      3477.21    3550.861
   nonsmoker #|
HS+ Yes no..  |   3248.295   64.86602    50.08   0.000     3121.126    3375.463
   nonsmoker #|
HS+ Yes ma..  |   3441.787   21.05667   163.45   0.000     3400.506    3483.069
      smoker #|
before HS ..  |   3181.111   105.5454    30.14   0.000     2974.192    3388.031
      smoker #|
before HS ..  |    3373.75   229.6108    14.69   0.000     2923.603    3823.897
      smoker #|
before HS ..  |   2924.333   139.0673    21.03   0.000     2651.695    3196.972
      smoker #|
before HS ..  |   2863.333   93.69532    30.56   0.000     2679.646    3047.021
      smoker #|
in HS No n..  |    3038.68   59.37928    51.17   0.000     2922.268    3155.091
      smoker #|
in HS No m..  |   3115.698   58.70879    53.07   0.000     3000.601    3230.795
      smoker #|
in HS Yes ..  |   3147.097   62.21084    50.59   0.000     3025.134     3269.06
      smoker #|
in HS Yes ..  |   3353.889   111.5621    30.06   0.000     3135.174    3572.604
      smoker #|
HS No notm..  |   3061.437   60.37705    50.71   0.000     2943.069    3179.805
      smoker #|
HS No marr..  |   3184.221   47.77988    66.64   0.000     3090.549    3277.892
      smoker #|
HS Yes not..  |   3131.533   44.98026    69.62   0.000     3043.351    3219.716
      smoker #|
HS Yes mar..  |   3199.174   63.82476    50.12   0.000     3074.047    3324.301
      smoker #|
HS+ No not..  |    3002.36   89.60639    33.51   0.000     2826.689    3178.031
      smoker #|
HS+ No mar..  |   3199.707   82.92361    38.59   0.000     3037.137    3362.277
      smoker #|
HS+ Yes no..  |   3161.923   79.54319    39.75   0.000      3005.98    3317.866
      smoker #|
HS+ Yes ma..  |   3271.293   90.92146    35.98   0.000     3093.043    3449.542
-------------------------------------------------------------------------------

. margins r.mbsmoke , vce(unconditional) contrast(nowald)

Contrasts of predictive margins

Expression   : Linear prediction, predict()

------------------------------------------------------------------------
                       |            Unconditional
                       |   Contrast   Std. Err.     [95% Conf. Interval]
-----------------------+------------------------------------------------
               mbsmoke |
(smoker vs nonsmoker)  |  -227.3809   26.82888     -279.9783   -174.7834
------------------------------------------------------------------------
```

The 32 parameters estimated by **regress** are the means of the outcome for the 32 cells in the table in example 2 (16 covariate patterns at each of the two treatment levels). The standard errors reported by exact matching and RA are asymptotically equivalent but differ in finite samples.
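Because the regression coefficients are just cell means, the point estimate can be rebuilt without any estimation command. The following is a minimal sketch, not part of the original analysis: `ybar1`, `ybar0`, and `cell_diff` are hypothetical variable names, and **mbsmoke** is assumed to be coded 1 for smokers and 0 for nonsmokers.

```stata
* Mean outcome of treated and of not-treated cases within each covariate
* pattern, copied to every observation that shares the pattern.
* ybar1, ybar0, and cell_diff are hypothetical names; mbsmoke is assumed 0/1.
bysort case: egen double ybar1 = mean(cond(mbsmoke == 1, bweight, .))
bysort case: egen double ybar0 = mean(cond(mbsmoke == 0, bweight, .))

* Within-pattern difference in means, one copy per observation
generate double cell_diff = ybar1 - ybar0

* Averaging over observations weights each pattern by its sample size
summarize cell_diff
display "ATE from cell means = " r(mean)
```

Averaging `cell_diff` over observations weights each covariate pattern by its share of the sample, so the displayed mean should reproduce the -227.3809 reported by both estimators.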

The regression underlying RA with fully interacted discrete covariates interacts the treatment factor with the full interaction of all the discrete covariates. Example 5 illustrates that this regression produces the same results as example 4.

**Example 5: RA estimated with interactions**

```
. regress bweight ibn.mbsmoke#ibn.medu2#ibn.fbaby#ibn.mmarried,
>     noconstant vce(robust) vsquish

Linear regression                               Number of obs     =      4,642
                                                F(32, 4610)       =    5472.14
                                                Prob > F          =     0.0000
                                                R-squared         =     0.9731
                                                Root MSE          =     561.89

------------------------------------------------------------------------------
             |               Robust
     bweight |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     mbsmoke#|
 medu2#fbaby#|
   mmarried  |
  nonsmoker #|
  before HS #|
         No #|
 notmarried  |   3412.345   85.26789    40.02   0.000     3245.179    3579.511
  nonsmoker #|
  before HS #|
         No #|
    married  |   3382.048   64.77681    52.21   0.000     3255.054    3509.041
  nonsmoker #|
  before HS #|
        Yes #|
 notmarried  |   3095.897   121.4719    25.49   0.000     2857.753     3334.04
  nonsmoker #|
  before HS #|
        Yes #|
    married  |   3213.588   108.5406    29.61   0.000     3000.797     3426.38
  nonsmoker #|
      in HS #|
         No #|
 notmarried  |   3219.255    66.9732    48.07   0.000     3087.955    3350.554
  nonsmoker #|
      in HS #|
         No #|
    married  |   3454.434   57.21777    60.37   0.000      3342.26    3566.608
  nonsmoker #|
      in HS #|
        Yes #|
 notmarried  |   3227.977   49.20252    65.61   0.000     3131.516    3324.437
  nonsmoker #|
      in HS #|
        Yes #|
    married  |   3467.286   95.52026    36.30   0.000      3280.02    3654.551
  nonsmoker #|
         HS #|
         No #|
 notmarried  |   3327.249   45.20513    73.60   0.000     3238.625    3415.872
  nonsmoker #|
         HS #|
         No #|
    married  |   3498.307   20.41325   171.37   0.000     3458.288    3538.327
  nonsmoker #|
         HS #|
        Yes #|
 notmarried  |   3258.069   38.79208    83.99   0.000     3182.018     3334.12
  nonsmoker #|
         HS #|
        Yes #|
    married  |   3382.054   24.69261   136.97   0.000     3333.644    3430.463
  nonsmoker #|
        HS+ #|
         No #|
 notmarried  |   3227.597   80.73945    39.98   0.000     3069.309    3385.885
  nonsmoker #|
        HS+ #|
         No #|
    married  |   3514.036   18.78391   187.08   0.000      3477.21    3550.861
  nonsmoker #|
        HS+ #|
        Yes #|
 notmarried  |   3248.295   64.86602    50.08   0.000     3121.126    3375.463
  nonsmoker #|
        HS+ #|
        Yes #|
    married  |   3441.787   21.05667   163.45   0.000     3400.506    3483.069
     smoker #|
  before HS #|
         No #|
 notmarried  |   3181.111   105.5454    30.14   0.000     2974.192    3388.031
     smoker #|
  before HS #|
         No #|
    married  |    3373.75   229.6108    14.69   0.000     2923.603    3823.897
     smoker #|
  before HS #|
        Yes #|
 notmarried  |   2924.333   139.0673    21.03   0.000     2651.695    3196.972
     smoker #|
  before HS #|
        Yes #|
    married  |   2863.333   93.69532    30.56   0.000     2679.646    3047.021
     smoker #|
      in HS #|
         No #|
 notmarried  |    3038.68   59.37928    51.17   0.000     2922.268    3155.091
     smoker #|
      in HS #|
         No #|
    married  |   3115.698   58.70879    53.07   0.000     3000.601    3230.795
     smoker #|
      in HS #|
        Yes #|
 notmarried  |   3147.097   62.21084    50.59   0.000     3025.134     3269.06
     smoker #|
      in HS #|
        Yes #|
    married  |   3353.889   111.5621    30.06   0.000     3135.174    3572.604
     smoker #|
         HS #|
         No #|
 notmarried  |   3061.437   60.37705    50.71   0.000     2943.069    3179.805
     smoker #|
         HS #|
         No #|
    married  |   3184.221   47.77988    66.64   0.000     3090.549    3277.892
     smoker #|
         HS #|
        Yes #|
 notmarried  |   3131.533   44.98026    69.62   0.000     3043.351    3219.716
     smoker #|
         HS #|
        Yes #|
    married  |   3199.174   63.82476    50.12   0.000     3074.047    3324.301
     smoker #|
        HS+ #|
         No #|
 notmarried  |    3002.36   89.60639    33.51   0.000     2826.689    3178.031
     smoker #|
        HS+ #|
         No #|
    married  |   3199.707   82.92361    38.59   0.000     3037.137    3362.277
     smoker #|
        HS+ #|
        Yes #|
 notmarried  |   3161.923   79.54319    39.75   0.000      3005.98    3317.866
     smoker #|
        HS+ #|
        Yes #|
    married  |   3271.293   90.92146    35.98   0.000     3093.043    3449.542
------------------------------------------------------------------------------

. margins r.mbsmoke , vce(unconditional) contrast(nowald)

Contrasts of predictive margins

Expression   : Linear prediction, predict()

------------------------------------------------------------------------
                       |            Unconditional
                       |   Contrast   Std. Err.     [95% Conf. Interval]
-----------------------+------------------------------------------------
               mbsmoke |
(smoker vs nonsmoker)  |  -227.3809   26.82888     -279.9783   -174.7834
------------------------------------------------------------------------
```

Finally, I illustrate that **teffects ra** produces the same point estimates.

**Example 6: RA estimated by teffects**

```
. teffects ra (bweight bn.medu2#ibn.fbaby#ibn.mmarried, noconstant) (mbsmoke)

Iteration 0:   EE criterion =  2.010e-25
Iteration 1:   EE criterion =  5.818e-26

Treatment-effects estimation                    Number of obs     =      4,642
Estimator      : regression adjustment
Outcome model  : linear
Treatment model: none
------------------------------------------------------------------------------
             |               Robust
     bweight |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
ATE          |
     mbsmoke |
    (smoker  |
         vs  |
 nonsmoker)  |  -227.3809   26.73625    -8.50   0.000     -279.783   -174.9788
-------------+----------------------------------------------------------------
POmean       |
     mbsmoke |
  nonsmoker  |   3402.793    9.59059   354.81   0.000     3383.995     3421.59
------------------------------------------------------------------------------
```

The standard errors are asymptotically equivalent but differ in finite samples because **teffects** does not adjust for the number of parameters estimated in the regression, as **regress** does.
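The size of the gap can be checked with one line of arithmetic. As a sketch, assuming the only difference is the usual N/(N − k) degrees-of-freedom factor that **regress** applies to its robust variance estimate, scaling the standard error from example 6 by the square root of that factor should land on the standard error that **margins** reports in examples 4 and 5.

```stata
* Arithmetic check using only numbers reported in the examples above:
* N = 4,642 observations, k = 32 estimated cell means, and
* 26.73625 is the standard error reported by teffects ra.
display 26.73625 * sqrt(4642 / (4642 - 32))
```

The product is about 26.8289, which agrees with the 26.82888 reported by **margins** to the displayed precision.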

**Done and undone**

I illustrated that exact matching on discrete covariates is the same as RA with fully interacted discrete covariates. Key to both methods is that the covariates are in fact discrete. If some collapsing of categories is performed as above, or if a discrete covariate is formed by cutting up a continuous covariate, all the results require that this combining step be performed correctly.

Exact matching on discrete covariates and RA with fully interacted discrete covariates perform the same nonparametric estimation. Collapsing categories or cutting up continuous covariates performs the same function as a bandwidth in nonparametric kernel regression: it determines which observations are comparable with each other. Just as with kernel regression, the bandwidth must be properly chosen to obtain consistent estimates.
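One way to see the bandwidth analogy at work is to re-estimate the ATE under a coarser grouping and compare the result with example 3. This is a sketch only: `medu3`, the label name `ed3l`, and the cut points are illustrative assumptions, not a recommended specification.

```stata
* Illustrative alternative coarsening (hypothetical names and cut points):
* merge "before HS" and "in HS" into one category for 0-11 years of schooling.
generate medu3 = irecode(medu, 11, 12)
label define ed3l 0 "less than HS" 1 "HS" 2 "HS+"
label values medu3 ed3l

* Re-estimate the ATE by exact matching on the coarser covariate
teffects nnmatch (bweight) (mbsmoke), ematch(medu3 fbaby mmarried)
```

A coarser grouping fills the sparse cells with more observations at the cost of treating more dissimilar mothers as comparable, which is the same tradeoff a wider bandwidth makes in kernel regression.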

**References**

Cattaneo, M. 2010. Efficient semiparametric estimation of multi-valued treatment effects under ignorability. *Journal of Econometrics* 155: 138–154.

Wooldridge, J. M. 2010. *Econometric Analysis of Cross Section and Panel Data*. 2nd ed. Cambridge, Massachusetts: MIT Press.