
Bayesian binary item response theory models using bayesmh

This post was written jointly with Yulia Marchenko, Executive Director of Statistics, StataCorp.

Table of Contents

Overview
1PL model
2PL model
3PL model
4PL model
5PL model
Conclusion

Overview

Item response theory (IRT) is used for modeling the relationship between the latent abilities of a group of subjects and the examination items used for measuring their abilities. Stata 14 introduced a suite of commands for fitting IRT models using maximum likelihood; see, for example, the blog post Spotlight on irt by Rafal Raciborski and the [IRT] Item Response Theory manual for more details. In this post, we demonstrate how to fit Bayesian binary IRT models by using the redefine() option introduced for the bayesmh command in Stata 14.1. We also use the dbernoulli() likelihood option, available as of the 03 Mar 2016 update, for fitting the Bernoulli distribution. If you are not familiar with the concepts and jargon of Bayesian statistics, you may want to watch the introductory videos on the Stata YouTube channel before proceeding.

Introduction to Bayesian analysis, part 1: The basic concepts
Introduction to Bayesian analysis, part 2: MCMC and the Metropolis-Hastings algorithm

We use the abridged version of the mathematics and science data from De Boeck and Wilson (2004), masc1. The dataset includes 800 student responses to 9 test questions intended to measure mathematical ability.

The irt suite fits IRT models using data in the wide form – one observation per subject, with items recorded in separate variables. To fit IRT models using bayesmh, we need data in the long form, where items are recorded as multiple observations per subject. We thus reshape the dataset into long form: we have a single binary response variable, y, and two index variables, item and id, which identify the items and subjects, respectively. This allows us to formulate our IRT models as multilevel models. The following commands load and prepare the dataset.

. webuse masc1
(Data from De Boeck & Wilson (2004))

. generate id = _n

. quietly reshape long q, i(id) j(item)

. rename q y

To ensure that all levels of item and id are included in our models, we use fvset base none so that no base levels are omitted.

. fvset base none id item

In what follows, we present eight Bayesian binary IRT models increasing in complexity and explanatory power. We perform Bayesian model comparison to gain insight into what would be the more appropriate model for the data at hand.

For high-dimensional models such as IRT models, you may see differences in the estimation results between different platforms or different flavors of Stata because of the nature of the Markov chain Monte Carlo (MCMC) sampling and finite numerical precision. These differences are not a source of concern; they will be within the range of the MCMC variability and will lead to similar inferential conclusions. The differences will diminish as the MCMC sample size increases. The results in this post are obtained from Stata/SE on the 64-bit Linux platform using the default 10,000 MCMC sample size.

Let the items be indexed by \(i=1,\dots,9\) and the subjects by \(j=1,\dots,800\). Let \(\theta_j\) be the latent mathematical ability of subject \(j\), and let \(Y_{ij}\) be the response of subject \(j\) to item \(i\).

Back to table of contents

1PL model

In the one-parameter logistic (1PL) model, the probability of getting a correct response is modeled as an inverse-logit function of location parameters \(b_i\), also called item difficulties, and a common slope parameter \(a\), also called item discrimination:

\[
P(Y_{ij}=1) = {\rm InvLogit}\{a(\theta_j-b_i)\} =
\frac{\exp\{a(\theta_j-b_i)\}}{1+\exp\{a(\theta_j-b_i)\}}
\]

Typically, the abilities are assumed to be normally distributed:
\[
\theta_j \sim {\rm N}(0,1)
\]
In a multilevel framework, the \(\theta_j\)’s represent random effects. In a Bayesian framework, we use the term “random effects” to refer to the parameters corresponding to levels of grouping variables identifying the hierarchy of the data.

A Bayesian formulation of the 1PL model also requires prior specification for the model parameters \(a\) and \(b_i\). The discrimination parameter \(a\) is assumed to be positive and is often modeled on the log scale. Because we have no prior knowledge about the discrimination and difficulty parameters, we assume that the prior distributions of \(\ln(a)\) and \(b_i\) have support on the whole real line, are symmetric, and are centered at 0. A normal prior distribution is thus a natural choice. We furthermore assume that \(\ln(a)\) and \(b_i\) are close to 0 and have a prior variance of 1, which is an entirely subjective decision. We thus assign \(\ln(a)\) and \(b_i\) standard normal prior distributions:

\[\ln(a) \sim {\rm N}(0, 1)\] \[b_i \sim {\rm N}(0, 1) \]

To specify the likelihood function of the 1PL model in bayesmh, we use a nonlinear equation specification for the response variable y. The direct nonlinear specification for this model is

bayesmh y = ({discrim}*({subj:i.id}-{diff:i.item})), likelihood(logit) ...

where {discrim} is the discrimination parameter \(a\), {subj:i.id} are the latent abilities \(\theta_j\), and {diff:i.item} are the item difficulties \(b_i\). The logit model is used for the probability of success, \(P(Y_{ij}=1)\). The {subj:i.id} specification in the above nonlinear expression is treated as a substitutable expression for a linear combination of the indicator variables associated with the id variable and the parameters \(\theta_j\). This specification may be computationally prohibitive with a large number of subjects. A more efficient solution is to use the redefine() option to include the subject random effects \(\theta_j\) in the model. The same argument applies to the {diff:i.item} specification when there are many items, so it may be computationally convenient to treat the \(b_i\) parameters as “random effects” as well and use the redefine() option to include them in the model.

A more efficient specification is thus

bayesmh y = ({discrim}*({subj:}-{diff:})), likelihood(logit) ///
               redefine(subj:i.id) redefine(diff:i.item) ...

where {subj:} and {diff:} in the nonlinear specification now represent the \(\theta_j\) and \(b_i\) parameters, respectively, without using expansions into linear combinations of indicator variables.

Below, we show the full bayesmh specification of the 1PL model and the output summary. In our examples, we treat the abilities {subj:i.id} as nuisance parameters and exclude them from the final results. The discrimination model parameter {discrim} must be positive and is thus initialized with 1. A longer burn-in period, burnin(5000), allows for longer adaptation of the MCMC sampler, which is needed given the large number of parameters in the model. Finally, the estimation results are stored for later model comparison.

. set seed 14

. bayesmh y = ({discrim}*({subj:}-{diff:})), likelihood(logit) ///
>         redefine(diff:i.item) redefine(subj:i.id)            ///
>         prior({subj:i.id},    normal(0, 1))                  ///
>         prior({discrim},      lognormal(0, 1))               ///
>         prior({diff:i.item},  normal(0, 1))                  ///
>         init({discrim} 1) exclude({subj:i.id})               ///
>         burnin(5000) saving(sim1pl, replace)
  
Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood: 
  y ~ logit({discrim}*(xb_subj-xb_diff))

Priors: 
  {diff:i.item} ~ normal(0,1)                                              (1)
    {subj:i.id} ~ normal(0,1)                                              (2)
      {discrim} ~ lognormal(0,1)
------------------------------------------------------------------------------
(1) Parameters are elements of the linear form xb_diff.
(2) Parameters are elements of the linear form xb_subj.

Bayesian logistic regression                     MCMC iterations  =     15,000
Random-walk Metropolis-Hastings sampling         Burn-in          =      5,000
                                                 MCMC sample size =     10,000
                                                 Number of obs    =      7,200
                                                 Acceptance rate  =      .3074
                                                 Efficiency:  min =     .02691
                                                              avg =     .06168
Log marginal likelihood =          .                          max =     .09527
 
------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
diff         |
        item |
          1  | -.6934123   .0998543   .003576  -.6934789  -.8909473  -.4917364
          2  | -.1234553   .0917187   .002972  -.1241642  -.3030341   .0597863
          3  | -1.782762   .1323252    .00566  -1.781142   -2.05219  -1.534451
          4  |  .3152835   .0951978   .003289   .3154714   .1279147   .4981263
          5  |  1.622545    .127213   .005561   1.619388   1.377123   1.883083
          6  |  .6815517   .0978777   .003712   .6788345   .4911366    .881128
          7  |  1.303482   .1173994   .005021   1.302328   1.084295   1.544913
          8  | -2.353975   .1620307   .008062  -2.351207  -2.672983  -2.053112
          9  | -1.168668   .1120243   .004526  -1.163922  -1.392936  -.9549209
-------------+----------------------------------------------------------------
     discrim |  .8644787   .0439804   .002681   .8644331   .7818035   .9494433
------------------------------------------------------------------------------

file sim1pl.dta saved

. estimates store est1pl

The sampling efficiency is acceptable, about 6% on average, with no indication of convergence problems. A detailed convergence inspection of all parameters is outside the scope of this post, but we recommend that you perform one by using, for example, the bayesgraph diagnostics command.
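
A quick visual check of the discrimination parameter and, say, the first difficulty parameter could look like this (the command produces trace, histogram, autocorrelation, and density plots; output not shown):

. bayesgraph diagnostics {discrim} {diff:1.item}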

Though we used informative priors for the model parameters, the estimation results from our Bayesian model are not that different from the maximum likelihood estimates obtained using the irt 1pl command (see example 1 in [IRT] irt 1pl). For example, the posterior mean estimate for {discrim} is 0.86 with an MCMC standard error of 0.003, whereas irt 1pl reports 0.85 with a standard error of 0.05.
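
If you would like to reproduce the maximum likelihood estimates for comparison, one way to do so (a sketch, assuming the long-form data created above are still in memory) is to temporarily reshape the data back to wide form and refit the model with irt; note that this replaces the active bayesmh results, but we have already stored them as est1pl.

. preserve

. quietly reshape wide y, i(id) j(item)

. irt 1pl y1-y9

. restore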

The log-marginal likelihood is reported missing because we have excluded the {subj:i.id} parameters from the simulation results and the Laplace-Metropolis estimator of the log-marginal likelihood is not available in such cases. This estimator requires simulation results for all model parameters to compute the log-marginal likelihood.
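
If you do need the log-marginal likelihood, you could rerun the model without the exclude() option so that the {subj:i.id} parameters are retained in the simulation results. A sketch of such a rerun is below; the notable option suppresses the long output table, and keep in mind that the Laplace-Metropolis approximation may be less reliable with this many parameters.

. set seed 14

. bayesmh y = ({discrim}*({subj:}-{diff:})), likelihood(logit) ///
>         redefine(diff:i.item) redefine(subj:i.id)            ///
>         prior({subj:i.id},    normal(0, 1))                  ///
>         prior({discrim},      lognormal(0, 1))               ///
>         prior({diff:i.item},  normal(0, 1))                  ///
>         init({discrim} 1) burnin(5000) notable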

Back to table of contents

2PL model

The two-parameter logistic (2PL) model extends the 1PL model by allowing for item-specific discrimination. The probability of correct response is now modeled as a function of item-specific slope parameters \(a_i\):
\[
P(Y_{ij}=1) = {\rm InvLogit}\{a_i(\theta_j-b_i)\} =
\frac{\exp\{a_i(\theta_j-b_i)\}}{1+\exp\{a_i(\theta_j-b_i)\}}
\]

The prior specification for \(\theta_j\) stays the same as in the 1PL model. We will, however, use more elaborate prior specifications for the \(a_i\)’s and \(b_i\)’s. It is good practice to use proper priors that do not overwhelm the evidence from the data. The impact of the priors can be controlled by introducing additional hyperparameters. For example, Kim and Bolt (2007) proposed using a normal prior with unknown mean and variance for the difficulty parameters. Extending this approach to the discrimination parameters as well, we apply a hierarchical Bayesian model in which the \(\ln(a_i)\) and \(b_i\) parameters have the following prior specifications:

\[ \ln(a_i) \sim {\rm N}(\mu_a, \sigma_a^2) \] \[ b_i \sim {\rm N}(\mu_b, \sigma_b^2) \]

The mean hyperparameters, \(\mu_a\) and \(\mu_b\), and variance hyperparameters, \(\sigma_a^2\) and \(\sigma_b^2\), require informative prior specifications. We assume that the means are centered at 0 with a variance of 0.1:
\[
\mu_a, \mu_b \sim {\rm N}(0, 0.1)
\]

To lower the variability of the \(\ln(a_i)\) and \(b_i\) parameters, we apply an inverse-gamma prior with shape 10 and scale 1 for the variance parameters:

\[
\sigma_a^2, \sigma_b^2 \sim {\rm InvGamma}(10, 1)
\]

Thus, the prior mean of \(\sigma_a^2\) and \(\sigma_b^2\) is about 0.1.

In the bayesmh specification, the hyperparameters \(\mu_a\), \(\mu_b\), \(\sigma_a^2\), and \(\sigma_b^2\) are denoted as {mu_a}, {mu_b}, {var_a}, and {var_b}, respectively. We use the redefine(discrim:i.item) option to include the discrimination parameters \(a_i\) in the model; they are referred to as {discrim:} in the likelihood specification.

Regarding the MCMC simulation, we change some of the default options. The hyperparameters {mu_a}, {mu_b}, {var_a}, and {var_b} are placed in separate blocks to improve the simulation efficiency. The discrimination parameters {discrim:i.item} must be positive and are thus initialized with 1s.

. set seed 14

. bayesmh y = ({discrim:}*({subj:}-{diff:})), likelihood(logit) ///
>         redefine(discrim:i.item) redefine(diff:i.item)        ///
>         redefine(subj:i.id)                                   ///
>         prior({subj:i.id},      normal(0, 1))                 ///
>         prior({discrim:i.item}, lognormal({mu_a}, {var_a}))   ///
>         prior({diff:i.item},    normal({mu_b}, {var_b}))      ///
>         prior({mu_a} {mu_b},    normal(0, 0.1))               ///
>         prior({var_a} {var_b},  igamma(10, 1))                ///
>         block({mu_a mu_b var_a var_b}, split)                 ///
>         init({discrim:i.item} 1)                              ///
>         exclude({subj:i.id}) burnin(5000) saving(sim2pl, replace)
  
Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood: 
  y ~ logit(xb_discrim*(xb_subj-xb_diff))

Priors: 
  {discrim:i.item} ~ lognormal({mu_a},{var_a})                             (1)
     {diff:i.item} ~ normal({mu_b},{var_b})                                (2)
       {subj:i.id} ~ normal(0,1)                                           (3)

Hyperpriors: 
    {mu_a mu_b} ~ normal(0,0.1)
  {var_a var_b} ~ igamma(10,1)
------------------------------------------------------------------------------
(1) Parameters are elements of the linear form xb_discrim.
(2) Parameters are elements of the linear form xb_diff.
(3) Parameters are elements of the linear form xb_subj.

Bayesian logistic regression                     MCMC iterations  =     15,000
Random-walk Metropolis-Hastings sampling         Burn-in          =      5,000
                                                 MCMC sample size =     10,000
                                                 Number of obs    =      7,200
                                                 Acceptance rate  =      .3711
                                                 Efficiency:  min =     .01617
                                                              avg =     .04923
Log marginal likelihood =          .                          max =      .1698
 
------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
discrim      |
        item |
          1  |  1.430976   .1986011   .010953   1.413063   1.089405   1.850241
          2  |  .6954823   .1081209   .004677   .6897267   .4985004   .9276975
          3  |  .9838528   .1343908   .009079   .9780275   .7506566   1.259427
          4  |  .8167792   .1169157   .005601   .8136229   .5992495   1.067578
          5  |  .9402715   .1351977   .010584   .9370298   .6691103   1.214885
          6  |  .9666747   .1420065   .008099   .9616285   .7038868   1.245007
          7  |  .5651287   .0864522   .006201   .5617302   .3956216   .7431265
          8  |  1.354053   .2048404   .015547   1.344227   .9791096   1.761437
          9  |  .7065096   .1060773   .006573   .6999745   .5102749   .9271799
-------------+----------------------------------------------------------------
diff         |
        item |
          1  | -.5070314   .0784172   .003565   -.507922   -.671257  -.3596057
          2  | -.1467198    .117422   .003143  -.1456633  -.3895978   .0716841
          3  | -1.630259   .1900103   .013494  -1.612534  -2.033169  -1.304171
          4  |  .3273735   .1073891   .003565   .3231703   .1248782   .5492114
          5  |  1.529584   .1969554    .01549   1.507982   1.202271   1.993196
          6  |  .6325194    .115724   .005613   .6243691   .4272131   .8851649
          7  |  1.827013   .2884057   .019582    1.79828   1.349654   2.490633
          8  | -1.753744   .1939559   .014743  -1.738199  -2.211475  -1.438146
          9  | -1.384486   .2059005   .012105  -1.361195  -1.838918  -1.059687
-------------+----------------------------------------------------------------
        mu_a | -.1032615   .1148176   .003874   -.102376  -.3347816   .1277031
       var_a |  .1129835   .0356735   .001269   .1056105    .063403   .1981331
        mu_b | -.0696525   .2039387   .004949   -.072602  -.4641566   .3298393
       var_b |  .6216005   .2023137   .008293   .5843444   .3388551   1.101153
------------------------------------------------------------------------------

file sim2pl.dta saved

. estimates store est2pl

The average simulation efficiency is about 5%, but some of the parameters converge more slowly than others; for example, {diff:7.item} has the largest MCMC standard error (0.02) among the difficulty parameters. If this were a rigorous study, we would recommend longer simulations with MCMC sample sizes of at least 50,000 to lower the MCMC standard errors.

We can compare the 1PL and 2PL models by using the deviance information criterion (DIC) available with the bayesstats ic command.

. bayesstats ic est1pl est2pl, diconly

Deviance information criterion

------------------------
             |       DIC
-------------+----------
      est1pl |  8122.428
      est2pl |  8055.005
------------------------

DIC is often used in Bayesian model selection as an alternative to the AIC and BIC criteria and can be easily obtained from an MCMC sample. Larger MCMC samples produce more reliable DIC estimates. Because different MCMC samples produce different DIC values and the approximation error in calculating DIC is not known, one should not rely solely on DIC when choosing a model.

Lower DIC values indicate better fit. The DIC of the 2PL model (8,055) is markedly lower than the DIC of the 1PL model (8,122), implying better fit of the 2PL model.

Back to table of contents

3PL model

The three-parameter logistic (3PL) model introduces lower asymptote parameters \(c_i\), also called guessing parameters. The probability of giving a correct response is given by

\[
P(Y_{ij}=1) = c_i + (1-c_i){\rm InvLogit}\{a_i(\theta_j-b_i)\} ,\ c_i > 0
\]

The guessing parameters may be difficult to estimate using maximum likelihood. Indeed, the irt 3pl command with the sepguessing option fails to converge, as you can verify by typing

. irt 3pl q1-q9, sepguessing

on the original dataset.

It is thus important to specify an informative prior for \(c_i\). We assume that the prior mean of the guessing parameters is about 0.1 and thus apply
\[
c_i \sim {\rm InvGamma}(10, 1)
\]

Similarly to the discrimination and difficulty parameters, the \(c_i\)’s are introduced as random-effects parameters in the bayesmh specification and are referred to as {gues:} in the likelihood specification.

Unlike in the 1PL and 2PL models, we cannot use the likelihood(logit) option to model the probability of success because the probability of a correct response is no longer an inverse-logit transformation of the parameters. Instead, we use likelihood(dbernoulli()) to model the probability of success of the Bernoulli outcome directly.
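
To see the connection between the two specifications, note that the 2PL model from the previous section could equivalently have been specified with dbernoulli() by applying the inverse-logit transformation explicitly; a schematic sketch, with the priors, blocking, and simulation options elided, is

bayesmh y, likelihood(dbernoulli(invlogit({discrim:}*({subj:}-{diff:})))) ///
        redefine(discrim:i.item) redefine(diff:i.item) redefine(subj:i.id) ...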

To have a valid initialization of the MCMC sampler, we assign the \(c_i\)’s positive starting values, 0.1.

. set seed 14

. bayesmh y, likelihood(dbernoulli({gues:}+(1-{gues:})*                     ///
>                                  invlogit({discrim:}*({subj:}-{diff:})))) ///
>         redefine(discrim:i.item) redefine(diff:i.item)                    ///
>         redefine(gues:i.item)    redefine(subj:i.id)                      ///
>         prior({subj:i.id},      normal(0, 1))                             ///
>         prior({discrim:i.item}, lognormal({mu_a}, {var_a}))               ///
>         prior({diff:i.item},    normal({mu_b}, {var_b}))                  ///
>         prior({gues:i.item},    igamma(10, 1))                            ///
>         prior({mu_a} {mu_b},    normal(0, 0.1))                           ///
>         prior({var_a} {var_b},  igamma(10, 1))                            ///
>         block({mu_a mu_b var_a var_b}, split)                             ///
>         init({discrim:i.item} 1 {gues:i.item} 0.1)                        ///
>         exclude({subj:i.id}) burnin(5000) saving(sim3pls, replace)
  
Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood: 
  y ~ binomial(xb_gues+(1-xb_gues)*invlogit(xb_discrim*(xb_subj-xb_diff)),1)

Priors: 
  {discrim:i.item} ~ lognormal({mu_a},{var_a})                             (1)
     {diff:i.item} ~ normal({mu_b},{var_b})                                (2)
     {gues:i.item} ~ igamma(10,1)                                          (3)
       {subj:i.id} ~ normal(0,1)                                           (4)

Hyperpriors: 
    {mu_a mu_b} ~ normal(0,0.1)
  {var_a var_b} ~ igamma(10,1)
------------------------------------------------------------------------------
(1) Parameters are elements of the linear form xb_discrim.
(2) Parameters are elements of the linear form xb_diff.
(3) Parameters are elements of the linear form xb_gues.
(4) Parameters are elements of the linear form xb_subj.

Bayesian Bernoulli model                         MCMC iterations  =     15,000
Random-walk Metropolis-Hastings sampling         Burn-in          =      5,000
                                                 MCMC sample size =     10,000
                                                 Number of obs    =      7,200
                                                 Acceptance rate  =      .3496
                                                 Efficiency:  min =      .0148
                                                              avg =     .03748
Log marginal likelihood =          .                          max =      .2044
 
------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
discrim      |
        item |
          1  |  1.712831   .2839419   .018436   1.681216   1.232644   2.351383
          2  |  .8540871   .1499645   .008265   .8414399   .6058463   1.165732
          3  |  1.094723   .1637954    .01126   1.081756    .817031   1.454845
          4  |  1.090891   .2149095   .013977   1.064651   .7488589   1.588164
          5  |  1.363236   .2525573   .014858   1.338075   .9348136   1.954695
          6  |  1.388325   .3027436   .024245   1.336303   .9466695   2.068181
          7  |  .9288217   .2678741   .021626   .8750048   .5690308   1.603375
          8  |  1.457763   .2201065    .01809   1.438027   1.068937   1.940431
          9  |  .7873631    .127779   .007447   .7796568    .563821    1.06523
-------------+----------------------------------------------------------------
diff         |
        item |
          1  | -.2933734   .0976177   .006339  -.2940499  -.4879558  -.0946848
          2  |  .2140365    .157158   .008333   .2037788  -.0553537   .5550411
          3  | -1.326351   .1981196   .013101  -1.326817  -1.706671  -.9307443
          4  |  .6367877   .1486799   .007895   .6277349   .3791045   .9509913
          5  |  1.616056   .1799378    .00966   1.606213   1.303614   2.006817
          6  |  .8354059    .124184    .00656   .8191839    .614221   1.097801
          7  |  2.066205   .3010858   .018377   2.034757   1.554484   2.709601
          8  | -1.555583   .1671435   .012265   -1.54984   -1.89487  -1.267001
          9  | -.9775626   .2477279   .016722  -.9936727  -1.431964  -.4093629
-------------+----------------------------------------------------------------
gues         |
        item |
          1  |  .1078598   .0337844     .0019   .1020673   .0581353   .1929404
          2  |  .1128113   .0372217   .002162   .1065996   .0596554   .2082417
          3  |   .123031   .0480042   .002579   .1127147   .0605462   .2516237
          4  |  .1190103   .0390721   .002369   .1123544   .0617698   .2095427
          5  |  .0829503   .0185785   .001275   .0807116   .0514752   .1232547
          6  |  .1059315   .0289175   .001708   .1022741   .0584959   .1709483
          7  |  .1235553   .0382661   .002964   .1186648   .0626495   .2067556
          8  |  .1142118   .0408348   .001733   .1062507   .0592389   .2134006
          9  |  .1270767   .0557821   .003939    .113562   .0621876   .2825752
-------------+----------------------------------------------------------------
        mu_a |   .109161   .1218499   .005504   .1126253   -.135329   .3501061
       var_a |   .108864   .0331522   .001053   .1030106   .0604834   .1860996
        mu_b |  .0782094   .1974657   .004367   .0755023  -.3067717   .4638104
       var_b |  .5829738   .1803167   .006263   .5562159   .3260449   1.034225
------------------------------------------------------------------------------

file sim3pls.dta saved

. estimates store est3pls

The estimated posterior means of the \(c_i\)’s range between 0.08 and 0.13. Clearly, the introduction of guessing parameters has an impact on the item discrimination and difficulty parameters. For example, the estimated posterior means of \(\mu_a\) and \(\mu_b\) shift from -0.10 and -0.07, respectively, for the 2PL model to 0.11 and 0.08, respectively, for the 3PL model.

Because the estimated guessing parameters are not that different, one may ask whether item-specific guessing parameters are really necessary. To answer this question, we fit a model with a common guessing parameter, {gues}, and compare it with the previous model.

. set seed 14

. bayesmh y, likelihood(dbernoulli({gues}+(1-{gues})*                       ///
>                                  invlogit({discrim:}*({subj:}-{diff:})))) ///
>         redefine(discrim:i.item) redefine(diff:i.item)                    ///
>         redefine(subj:i.id)                                               ///
>         prior({subj:i.id},      normal(0, 1))                             ///
>         prior({discrim:i.item}, lognormal({mu_a}, {var_a}))               ///
>         prior({diff:i.item},    normal({mu_b}, {var_b}))                  ///
>         prior({gues},           igamma(10, 1))                            ///
>         prior({mu_a} {mu_b},    normal(0, 0.1))                           ///
>         prior({var_a} {var_b},  igamma(10, 1))                            ///
>         block({mu_a mu_b var_a var_b gues}, split)                        ///
>         init({discrim:i.item} 1 {gues} 0.1)                               ///
>         exclude({subj:i.id}) burnin(5000) saving(sim3pl, replace)
  
Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood: 
  y ~ binomial({gues}+(1-{gues})*invlogit(xb_discrim*(xb_subj-xb_diff)),1)

Priors: 
  {discrim:i.item} ~ lognormal({mu_a},{var_a})                             (1)
     {diff:i.item} ~ normal({mu_b},{var_b})                                (2)
       {subj:i.id} ~ normal(0,1)                                           (3)
            {gues} ~ igamma(10,1)

Hyperpriors: 
    {mu_a mu_b} ~ normal(0,0.1)
  {var_a var_b} ~ igamma(10,1)
------------------------------------------------------------------------------
(1) Parameters are elements of the linear form xb_discrim.
(2) Parameters are elements of the linear form xb_diff.
(3) Parameters are elements of the linear form xb_subj.

Bayesian Bernoulli model                         MCMC iterations  =     15,000
Random-walk Metropolis-Hastings sampling         Burn-in          =      5,000
                                                 MCMC sample size =     10,000
                                                 Number of obs    =      7,200
                                                 Acceptance rate  =      .3753
                                                 Efficiency:  min =     .01295
                                                              avg =     .03714
Log marginal likelihood =          .                          max =      .1874
 
------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
discrim      |
        item |
          1  |  1.692894   .2748163   .021944   1.664569   1.232347   2.299125
          2  |  .8313512   .1355267    .00606   .8218212   .5928602   1.125729
          3  |  1.058833   .1611742   .014163   1.054126   .7676045   1.393611
          4  |  1.041808   .1718472   .008782   1.029867   .7398569   1.397073
          5  |  1.534997   .3208687   .023965   1.497019   1.019998   2.266078
          6  |   1.38296   .2581948   .019265   1.355706   .9559487   1.979358
          7  |  .8310222   .1698206   .012896   .8107371   .5736484   1.248736
          8  |  1.442949   .2266268   .017562   1.431204   1.066646   1.930829
          9  |    .77944   .1159669   .007266   .7750891   .5657258   1.014941
-------------+----------------------------------------------------------------
diff         |
        item |
          1  | -.3043161   .0859905   .005373  -.2968324  -.4870583  -.1407109
          2  |  .1814508   .1289251   .006543   .1832146  -.0723988   .4313265
          3  | -1.391216   .1924384   .014986  -1.373093  -1.809343  -1.050919
          4  |  .5928491   .1262631   .006721   .5829347    .356614    .857743
          5  |  1.617348   .1929263   .011604   1.601534   1.293032   2.061096
          6  |   .817635   .1172884   .006125    .812838   .5990503   1.064322
          7  |  2.006949   .2743517    .01785   1.981052   1.556682   2.594236
          8  | -1.576235   .1747855   .013455  -1.559435  -1.952676  -1.272108
          9  | -1.039362   .1840773    .01138   -1.02785  -1.432058  -.7160181
-------------+----------------------------------------------------------------
        gues |  .1027336   .0214544   .001753   .1022211   .0627299   .1466367
        mu_a |  .1009741    .123915   .006567   .0965353  -.1343028   .3510697
       var_a |  .1121003   .0344401   .001154   .1059563   .0628117   .1970842
        mu_b |  .0632173   .1979426   .004572   .0666684  -.3292497   .4482957
       var_b |  .5861236   .1818885   .006991   .5574743   .3239369   1.053172
------------------------------------------------------------------------------

file sim3pl.dta saved

. estimates store est3pl

We can again compare the two 3PL models by using the bayesstats ic command:

. bayesstats ic est3pls est3pl, diconly

Deviance information criterion

------------------------
             |       DIC
-------------+----------
     est3pls |  8049.425
      est3pl |  8049.426
------------------------

Although the estimated DICs of the two 3PL models are essentially the same, we decide for demonstration purposes to proceed with the model with item-specific guessing parameters.

Back to table of contents

4PL model

The four-parameter logistic (4PL) model extends the 3PL model by adding item-specific upper asymptote parameters \(d_i\):
\[
P(Y_{ij}=1) = c_i + (d_i-c_i){\rm InvLogit}\{a_i(\theta_j-b_i)\},\ c_i < d_i < 1
\]

The \(d_i\) parameter can be viewed as an upper limit on the probability of a correct response to the \(i\)th item: the probability of answering correctly for subjects with very high ability can be no greater than \(d_i\). We restrict the \(d_i\)'s to the (0.8,1) range and assign them a \({\rm Uniform}(0.8,1)\) prior. For the other parameters, we use the same priors as in the 3PL model. In the bayesmh specification of the model, the condition \(c_i < d_i\) is incorporated in the likelihood, and the condition \(d_i < 1\) is implied by the specified prior for the \(d_i\)'s. We initialize the \(d_i\)'s to 0.9. We use the notable option to suppress the long table output.

. set seed 14

. bayesmh y, likelihood(dbernoulli(({gues:}+({d:}-{gues:})*                 ///
>                                  invlogit({discrim:}*({subj:}-{diff:})))* ///
>                                  cond({gues:}<{d:},1,.)))                 ///
>         redefine(discrim:i.item) redefine(diff:i.item)                    ///
>         redefine(gues:i.item)    redefine(d:i.item)  redefine(subj:i.id)  ///
>         prior({subj:i.id},      normal(0, 1))                             ///
>         prior({discrim:i.item}, lognormal({mu_a}, {var_a}))               ///
>         prior({diff:i.item},    normal({mu_b}, {var_b}))                  ///
>         prior({gues:i.item},    igamma(10, 1))                            ///
>         prior({d:i.item},       uniform(0.8, 1))                          ///
>         prior({mu_a} {mu_b},    normal(0, 0.1))                           ///
>         prior({var_a} {var_b},  igamma(10, 1))                            ///
>         block({mu_a mu_b var_a var_b}, split)                             ///
>         init({discrim:i.item} 1 {gues:i.item} 0.1 {d:i.item} 0.9)         ///
>         exclude({subj:i.id}) burnin(5000) saving(sim4pls, replace) notable
  
Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood: 
  y ~ binomial(<expr1>,1)

Priors: 
  {discrim:i.item} ~ lognormal({mu_a},{var_a})                             (1)
     {diff:i.item} ~ normal({mu_b},{var_b})                                (2)
     {gues:i.item} ~ igamma(10,1)                                          (3)
        {d:i.item} ~ uniform(0.8,1)                                        (4)
       {subj:i.id} ~ normal(0,1)                                           (5)

Hyperpriors: 
    {mu_a mu_b} ~ normal(0,0.1)
  {var_a var_b} ~ igamma(10,1)

Expression: 
  expr1 : (xb_gues+(xb_d-xb_gues)*invlogit(xb_discrim*(xb_subj-xb_diff)))* con
          d(xb_gues<xb_d,1,.)
------------------------------------------------------------------------------
(1) Parameters are elements of the linear form xb_discrim.
(2) Parameters are elements of the linear form xb_diff.
(3) Parameters are elements of the linear form xb_gues.
(4) Parameters are elements of the linear form xb_d.
(5) Parameters are elements of the linear form xb_subj.

Bayesian Bernoulli model                         MCMC iterations  =     15,000
Random-walk Metropolis-Hastings sampling         Burn-in          =      5,000
                                                 MCMC sample size =     10,000
                                                 Number of obs    =      7,200
                                                 Acceptance rate  =      .3639
                                                 Efficiency:  min =    .004825
                                                              avg =      .0219
Log marginal likelihood =          .                          max =      .1203
Note: There is a high autocorrelation after 500 lags.

file sim4pls.dta saved

. estimates store est4pls
 

We use bayesstats summary to display results of selected model parameters.

. bayesstats summary {d:i.item} {mu_a var_a mu_b var_b}

Posterior summary statistics                      MCMC sample size =    10,000
 
------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
d            |
        item |
          1  |  .9598183   .0255321   .001948   .9621874   .9044441   .9981723
          2  |  .9024564   .0565702   .007407   .9019505   .8066354   .9944216
          3  |  .9525519   .0281878   .002845   .9551054   .8972454   .9971564
          4  |  .8887963   .0561697   .005793   .8859503   .8036236   .9916784
          5  |  .8815547   .0588907   .007215   .8708021   .8031737   .9926549
          6  |  .8891188   .0586482   .006891    .881882   .8024593   .9935512
          7  |   .874271   .0561718   .008087   .8635082   .8018176   .9880433
          8  |  .9663644   .0147606   .001121   .9667563   .9370666   .9950912
          9  |   .889164   .0486038   .005524   .8834207   .8084921   .9857415
-------------+----------------------------------------------------------------
        mu_a |  .3336887   .1436216   .009742    .334092   .0562924   .6164115
       var_a |  .1221547   .0406908   .002376   .1144729   .0642768   .2229326
        mu_b | -.0407488   .1958039   .005645  -.0398847  -.4220523   .3323791
       var_b |  .4991736   .1612246    .00629   .4660071   .2802531   .9023824
------------------------------------------------------------------------------

The bayesmh command issued a note indicating high autocorrelation for some of the model parameters. This may be related to slow MCMC convergence or to more substantial problems with the model specification. It is thus worthwhile to inspect the autocorrelation of the individual parameters, which we can do by using the bayesstats ess command. Parameters with a lower effective sample size (ESS) have higher autocorrelation and vice versa.

. bayesstats ess {d:i.item} {mu_a var_a mu_b var_b}

Efficiency summaries    MCMC sample size =    10,000
 
----------------------------------------------------
             |        ESS   Corr. time    Efficiency
-------------+--------------------------------------
d            |
        item |
          1  |     171.82        58.20        0.0172
          2  |      58.33       171.43        0.0058
          3  |      98.17       101.87        0.0098
          4  |      94.02       106.36        0.0094
          5  |      66.62       150.11        0.0067
          6  |      72.44       138.05        0.0072
          7  |      48.25       207.26        0.0048
          8  |     173.30        57.70        0.0173
          9  |      77.41       129.19        0.0077
-------------+--------------------------------------
        mu_a |     217.35        46.01        0.0217
       var_a |     293.34        34.09        0.0293
        mu_b |    1203.20         8.31        0.1203
       var_b |     656.92        15.22        0.0657
----------------------------------------------------

We observe that the parameters with ESS lower than 200 are all upper asymptote parameters \(d_i\). This may be caused, for example, by overparameterization of the likelihood model and the resulting nonidentifiability, which is not resolved by the specified priors.
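
For example, item 7 has the lowest ESS among the \(d_i\)'s, so its trace and autocorrelation plots deserve a closer look; one way to inspect them is (output not shown)

. bayesgraph diagnostics {d:7.item}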

We can also fit a model with a common upper asymptote parameter, \(d\), and compare it with the model with the item-specific upper asymptote.

. set seed 14

. bayesmh y, likelihood(dbernoulli(({gues:}+({d}-{gues:})*                  ///
>                                  invlogit({discrim:}*({subj:}-{diff:})))* ///
>                                  cond({gues:}<{d},1,.)))                  ///
>         redefine(discrim:i.item) redefine(diff:i.item)                    ///
>         redefine(gues:i.item)    redefine(subj:i.id)                      ///
>         prior({subj:i.id},      normal(0, 1))                             ///
>         prior({discrim:i.item}, lognormal({mu_a}, {var_a}))               ///
>         prior({diff:i.item},    normal({mu_b}, {var_b}))                  ///
>         prior({gues:i.item},    igamma(10, 1))                            ///
>         prior({d},              uniform(0.8, 1))                          ///
>         prior({mu_a} {mu_b},    normal(0, 0.1))                           ///
>         prior({var_a} {var_b},  igamma(10, 1))                            ///
>         block({mu_a mu_b var_a var_b d}, split)                           ///
>         init({discrim:i.item} 1 {gues:i.item} 0.1 {d} 0.9)                ///
>         exclude({subj:i.id}) burnin(5000) saving(sim4pl, replace) notable
  
Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood: 
  y ~ binomial(<expr1>,1)

Priors: 
  {discrim:i.item} ~ lognormal({mu_a},{var_a})                             (1)
     {diff:i.item} ~ normal({mu_b},{var_b})                                (2)
     {gues:i.item} ~ igamma(10,1)                                          (3)
       {subj:i.id} ~ normal(0,1)                                           (4)
               {d} ~ uniform(0.8,1)

Hyperpriors: 
    {mu_a mu_b} ~ normal(0,0.1)
  {var_a var_b} ~ igamma(10,1)

Expression: 
  expr1 : (xb_gues+({d}-xb_gues)*invlogit(xb_discrim*(xb_subj-xb_diff)))* cond
          (xb_gues<{d},1,.)
------------------------------------------------------------------------------
(1) Parameters are elements of the linear form xb_discrim.
(2) Parameters are elements of the linear form xb_diff.
(3) Parameters are elements of the linear form xb_gues.
(4) Parameters are elements of the linear form xb_subj.

Bayesian Bernoulli model                         MCMC iterations  =     15,000
Random-walk Metropolis-Hastings sampling         Burn-in          =      5,000
                                                 MCMC sample size =     10,000
                                                 Number of obs    =      7,200
                                                 Acceptance rate  =      .3877
                                                 Efficiency:  min =      .0107
                                                              avg =     .03047
Log marginal likelihood =          .                          max =      .1626

file sim4pl.dta saved

. estimates store est4pl

. bayesstats summary {d mu_a var_a mu_b var_b}

Posterior summary statistics                      MCMC sample size =    10,000
 
------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
           d |  .9664578   .0144952   .001293   .9668207   .9371181   .9924572
        mu_a |  .2206696   .1387873    .01113   .2208302  -.0483587   .4952625
       var_a |  .1245785   .0391551   .001806   .1188779   .0658243   .2187058
        mu_b |  .0371722   .2020157    .00501   .0331742  -.3481366   .4336587
       var_b |  .5603447   .1761812   .006817   .5279243   .3157048   .9805077
------------------------------------------------------------------------------

We now compare the two 4PL models by using the bayesstats ic command:

. bayesstats ic est4pls est4pl, diconly

Deviance information criterion

------------------------
             |       DIC
-------------+----------
     est4pls |  8050.805
      est4pl |  8037.075
------------------------

The DIC of the more complex 4PL model (8,051) is substantially higher than the DIC of the simpler model (8,037). This, together with the potential nonidentifiability of the more complex est4pls model indicated by the high autocorrelation in its MCMC sample, compels us to proceed with the model with a common upper asymptote, est4pl.

The posterior distribution of \(d\) has an estimated 95% equal-tailed credible interval of (0.94, 0.99) and is concentrated about 0.97. The \({\rm Uniform}(0.8,1)\) prior on \(d\) does not seem to be too restrictive. The estimated DIC of the est4pl model (8,037) is lower than the DIC of the est3pls 3PL model from the previous section (8,049), implying that the introduction of the upper asymptote parameter \(d\) does improve the model fit.

Back to table of contents

5PL model

The five-parameter logistic (5PL) model extends the 4PL model by adding item-specific asymmetry parameters \(e_i\):
\[
P(Y_{ij}=1) = c_i + (d_i-c_i)\big[{\rm InvLogit}\{a_i(\theta_j-b_i)\}\big]^{e_i},\ c_i < d_i < 1, \ 0 < e_i < 1
\]

In the previous section, we found the 4PL model with a common upper asymptote \(d\), est4pl, to be the best one so far. We thus consider here a 5PL model with a common upper asymptote \(d\).

Typically, we expect the \(e_i\) parameters to be close to 1. Similarly to the upper asymptote parameter \(d\), the \(e_i\) parameters are assumed to be in the (0.8,1) range and are assigned a \({\rm Uniform}(0.8,1)\) prior. We initialize the \(e_i\)'s to 0.9. We again use the notable option to suppress the long table output, and we display a subset of results by using bayesstats summary. (We could have used bayesmh's noshow() option instead to achieve the same result.)

. set seed 14

. bayesmh y, likelihood(dbernoulli(({gues:}+({d}-{gues:})*                  ///
>                           (invlogit({discrim:}*({subj:}-{diff:})))^{e:})* ///
>                           cond({gues:}<{d},1,.)))                         ///
>         redefine(discrim:i.item) redefine(diff:i.item)                    ///
>         redefine(gues:i.item)    redefine(e:i.item)  redefine(subj:i.id)  ///
>         prior({subj:i.id},      normal(0, 1))                             ///
>         prior({discrim:i.item}, lognormal({mu_a}, {var_a}))               ///
>         prior({diff:i.item},    normal({mu_b}, {var_b}))                  ///
>         prior({gues:i.item},    igamma(10, 1))                            ///
>         prior({d},              uniform(0.8, 1))                          ///
>         prior({e:i.item},       uniform(0.8, 1))                          ///
>         prior({mu_a} {mu_b},    normal(0, 0.1))                           ///
>         prior({var_a} {var_b},  igamma(10, 1))                            ///
>         block({mu_a mu_b var_a var_b d}, split)                           ///
>         init({discrim:i.item} 1 {gues:i.item} 0.1 {d} {e:i.item} 0.9)     ///
>         exclude({subj:i.id}) burnin(5000) saving(sim5pls, replace) notable
  
Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood: 
  y ~ binomial(<expr1>,1)

Priors: 
  {discrim:i.item} ~ lognormal({mu_a},{var_a})                             (1)
     {diff:i.item} ~ normal({mu_b},{var_b})                                (2)
     {gues:i.item} ~ igamma(10,1)                                          (3)
        {e:i.item} ~ uniform(0.8,1)                                        (4)
       {subj:i.id} ~ normal(0,1)                                           (5)
               {d} ~ uniform(0.8,1)

Hyperpriors: 
    {mu_a mu_b} ~ normal(0,0.1)
  {var_a var_b} ~ igamma(10,1)

Expression: 
  expr1 : (xb_gues+({d}-xb_gues)*(invlogit(xb_discrim*(xb_subj-xb_diff)))^xb_e
          )* cond(xb_gues<{d},1,.)
------------------------------------------------------------------------------
(1) Parameters are elements of the linear form xb_discrim.
(2) Parameters are elements of the linear form xb_diff.
(3) Parameters are elements of the linear form xb_gues.
(4) Parameters are elements of the linear form xb_e.
(5) Parameters are elements of the linear form xb_subj.

Bayesian Bernoulli model                         MCMC iterations  =     15,000
Random-walk Metropolis-Hastings sampling         Burn-in          =      5,000
                                                 MCMC sample size =     10,000
                                                 Number of obs    =      7,200
                                                 Acceptance rate  =      .3708
                                                 Efficiency:  min =    .007341
                                                              avg =     .02526
Log marginal likelihood =          .                          max =      .1517

file sim5pls.dta saved

. estimates store est5pls

. bayesstats summary {e:i.item} {d mu_a var_a mu_b var_b}

Posterior summary statistics                      MCMC sample size =    10,000
 
------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
e            |
        item |
          1  |   .897859   .0578428   .006083   .8939272   .8050315   .9957951
          2  |  .9042669   .0585023   .005822     .90525   .8053789   .9956565
          3  |    .88993   .0562398   .005013    .887011    .803389   .9930454
          4  |  .9010241   .0574186   .006492   .9042044   .8030981   .9925598
          5  |  .9126369   .0545625    .00521   .9178927   .8098596   .9964487
          6  |  .9037269   .0583833   .006814   .9086704   .8054932   .9961268
          7  |  .9136308   .0558911   .005373   .9203899   .8112029    .996217
          8  |   .889775   .0568656   .005119   .8849938    .803912   .9938777
          9  |  .8808435    .056257   .004743   .8727194   .8030522   .9904972
-------------+----------------------------------------------------------------
           d |  .9671374   .0144004   .001165   .9670598   .9382404   .9933374
        mu_a |  .2770211   .1353777    .00832   .2782552   .0141125   .5418087
       var_a |   .122635   .0404159   .002148   .1160322   .0666951   .2208711
        mu_b |  .1211885   .1929743   .004955   .1199136  -.2515431    .503733
       var_b |  .5407642   .1747674   .006353   .5088269   .3016315   .9590086
------------------------------------------------------------------------------

We also want to compare the above model with a simpler one using a common asymmetry parameter \(e\).

. set seed 14

. bayesmh y, likelihood(dbernoulli(({gues:}+({d}-{gues:})*                  ///
>                            (invlogit({discrim:}*({subj:}-{diff:})))^{e})* ///
>                            cond({gues:}<{d},1,.)))                        ///
>         redefine(discrim:i.item) redefine(diff:i.item)                    ///
>         redefine(gues:i.item)    redefine(subj:i.id)                      ///
>         prior({subj:i.id},      normal(0, 1))                             ///
>         prior({discrim:i.item}, lognormal({mu_a}, {var_a}))               ///
>         prior({diff:i.item},    normal({mu_b}, {var_b}))                  ///
>         prior({gues:i.item},    igamma(10, 1))                            ///
>         prior({d} {e},          uniform(0.8, 1))                          ///
>         prior({mu_a} {mu_b},    normal(0, 0.1))                           ///
>         prior({var_a} {var_b},  igamma(10, 1))                            ///
>         block({mu_a mu_b var_a var_b d e}, split)                         ///
>         init({discrim:i.item} 1 {gues:i.item} 0.1 {d e} 0.9)              ///
>         exclude({subj:i.id}) burnin(5000) saving(sim5pl, replace) notable
  
Burn-in ...
Simulation ...

Model summary
------------------------------------------------------------------------------
Likelihood: 
  y ~ binomial(<expr1>,1)

Priors: 
  {discrim:i.item} ~ lognormal({mu_a},{var_a})                             (1)
     {diff:i.item} ~ normal({mu_b},{var_b})                                (2)
     {gues:i.item} ~ igamma(10,1)                                          (3)
       {subj:i.id} ~ normal(0,1)                                           (4)
             {d e} ~ uniform(0.8,1)

Hyperpriors: 
    {mu_a mu_b} ~ normal(0,0.1)
  {var_a var_b} ~ igamma(10,1)

Expression: 
  expr1 : (xb_gues+({d}-xb_gues)*(invlogit(xb_discrim*(xb_subj-xb_diff)))^{e})
          * cond(xb_gues<{d},1,.)
------------------------------------------------------------------------------
(1) Parameters are elements of the linear form xb_discrim.
(2) Parameters are elements of the linear form xb_diff.
(3) Parameters are elements of the linear form xb_gues.
(4) Parameters are elements of the linear form xb_subj.

Bayesian Bernoulli model                         MCMC iterations  =     15,000
Random-walk Metropolis-Hastings sampling         Burn-in          =      5,000
                                                 MCMC sample size =     10,000
                                                 Number of obs    =      7,200
                                                 Acceptance rate  =      .3805
                                                 Efficiency:  min =    .008179
                                                              avg =     .02768
Log marginal likelihood =          .                          max =     .08904

file sim5pl.dta saved

. estimates store est5pl

. bayesstats summary {e d mu_a var_a mu_b var_b}

Posterior summary statistics                      MCMC sample size =    10,000
 
------------------------------------------------------------------------------
             |                                                Equal-tailed
             |      Mean   Std. Dev.     MCSE     Median  [95% Cred. Interval]
-------------+----------------------------------------------------------------
           e |  .9118363   .0558178   .004194   .9175841   .8063153   .9960286
           d |  .9655166   .0147373   .001495   .9659029   .9354708   .9924492
        mu_a |  .2674271   .1368926   .008485    .270597   .0102798   .5443345
       var_a |  .1250759   .0428095   .002635   .1173619   .0654135   .2340525
        mu_b |  .1015121   .2048178   .006864    .103268  -.3052377   .4934158
       var_b |  .5677309   .1824591   .006981   .5331636   .3079868   1.016762
------------------------------------------------------------------------------

We use bayesstats ic to compare the DIC values of the two 5PL models:

. bayesstats ic est5pls est5pl, diconly

Deviance information criterion

------------------------
             |       DIC
-------------+----------
     est5pls |  8030.894
      est5pl |  8034.517
------------------------

The estimated DIC of the more complex est5pls model (8,031) is lower than the DIC of the simpler model (8,035), suggesting a better fit.

Back to table of contents

Conclusion

Finally, we compare all eight fitted models.

. bayesstats ic est1pl est2pl est3pl est3pls est4pl est4pls est5pl est5pls, ///
>         diconly

Deviance information criterion

------------------------
             |       DIC
-------------+----------
      est1pl |  8122.428
      est2pl |  8055.005
      est3pl |  8049.426
     est3pls |  8049.425
      est4pl |  8037.075
     est4pls |  8050.805
      est5pl |  8034.517
     est5pls |  8030.894
------------------------

The est5pls model has the lowest overall DIC. To confirm this result, we run another set of simulations with a larger MCMC sample size of 50,000 by adding the mcmcsize(50000) option to the bayesmh specification of each of the above eight models; a sketch for the 1PL model is shown below. The DIC values based on the larger MCMC sample size are more reliably estimated.
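
For example, a sketch of the rerun 1PL specification, which simply adds mcmcsize(50000) to the original command and overwrites the earlier sim1pl simulation file and est1pl estimates, is

. set seed 14

. bayesmh y = ({discrim}*({subj:}-{diff:})), likelihood(logit) ///
>         redefine(diff:i.item) redefine(subj:i.id)            ///
>         prior({subj:i.id},    normal(0, 1))                  ///
>         prior({discrim},      lognormal(0, 1))               ///
>         prior({diff:i.item},  normal(0, 1))                  ///
>         init({discrim} 1) exclude({subj:i.id})               ///
>         burnin(5000) mcmcsize(50000) saving(sim1pl, replace)

. estimates store est1pl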

. bayesstats ic est1pl est2pl est3pl est3pls est4pl est4pls est5pl est5pls, ///
>         diconly

Deviance information criterion

------------------------
             |       DIC
-------------+----------
      est1pl |  8124.015
      est2pl |  8052.068
      est3pl |  8047.067
     est3pls |  8047.738
      est4pl |  8032.417
     est4pls |  8049.712
      est5pl |  8031.375
     est5pls |  8031.905
------------------------

Again, the 5PL models have the lowest DIC values and seem to provide the best fit. However, the DIC differences between models est4pl, est5pl, and est5pls are minimal and may very well be within the estimation error. Regardless, these three models appear to be better than the simpler 1PL, 2PL, and 3PL models.

Additional model checking may be needed to assess the models' fit, and we should not rely solely on the DIC values to make our final model selection. A practitioner may still prefer the simpler est4pl 4PL model to the 5PL models even though it has a slightly higher DIC. In fact, given that the posterior mean estimate of the upper asymptote parameter \(d\) is 0.96 with a 95% equal-tailed credible interval of (0.94, 0.99), some practitioners may prefer the even simpler est3pl 3PL model.

References

De Boeck, P., and M. Wilson, ed. 2004. Explanatory Item Response Models: A Generalized Linear and Nonlinear Approach. New York: Springer.

Kim, J.-S., and D. M. Bolt. 2007. Estimating item response theory models using Markov chain Monte Carlo methods. Educational Measurement: Issues and Practice 26: 38-51.