
1 Introduction

Cognitive diagnosis models (CDMs) account for the dependence among observations through latent dimensions that are related to the mastery or possession of cognitive skills, or “attributes,” required for a correct response to an item. These models have received considerable attention in educational research because tests based on CDMs promise to provide more diagnostic information about an examinee’s ability than tests based on Item Response Theory (IRT) (Rupp et al., 2010). Specifically, whereas IRT defines ability as a unidimensional continuous construct, CDMs describe ability as a composite of K discrete, binary latent skill variables, called attributes, that define \(2^K\) distinct proficiency classes.

As with other measurement models in assessment, the validity of a CDM depends on whether the latent attributes characterizing each proficiency class entirely determine an examinee’s test performance, so that item responses can be assumed to be independent after controlling for the effect of the attributes. (This property of conditional independence is often called “local independence” in the IRT literature.) As Lord and Novick (1968) pointed out, misspecification of the latent ability space underlying a test usually leads to violations of the conditional independence assumption that, in turn, result in inaccurate estimates of the model parameters and, ultimately, incorrect assessments of examinees’ ability. For cognitive diagnosis, the assumption of conditional independence is equivalent to the assumption that the K attributes span the complete latent space. More to the point, violations of conditional independence are likely to occur if the latent attribute space has been misspecified by including either too few or too many latent attributes in the model.

Within the context of IRT models, various methods have been proposed for examining the dimensionality of the latent ability space underlying a test by checking for possible violations of conditional independence. Stout (1987), for example, developed DIMTEST, a nonparametric procedure for establishing the unidimensionality of test items by testing for conditional independence. Another instance is Rosenbaum’s (1984) use of the Mantel-Haenszel statistic for assessing the unidimensionality of dichotomous items.

Lim and Drasgow (2019) proposed a nonparametric procedure for detecting misspecifications of the latent attribute space in cognitive diagnosis that relies on the Mantel-Haenszel statistic to check for violations of conditional independence when proficiency classes are estimated nonparametrically. The present study extends their work by using the proposed statistic with parametric cognitive diagnosis models for the estimation of proficiency classes.

2 The Mantel-Haenszel Test

Lim and Drasgow (2019) proposed using the Mantel-Haenszel (MH) chi-square statistic to test for the (conditional) independence of two dichotomous variables j and \(j^\prime \) by forming 2-by-2 contingency tables conditional on the levels of a stratification variable C. In their study, the stratification variable C is defined in terms of the latent attribute vector \(\mathbf {\alpha }_c = (\alpha _{c1}, \alpha _{c2}, \ldots , \alpha _{cK})',\) for \(c = 1, 2, \ldots , 2^K\); that is, the different strata of C are formed by the \(2^K\) proficiency classes.

For a fixed item pair, let \(i_{uvc}\) denote the number of examinees in the cth stratum who score u on item j and v on item \(j^\prime \) \((u, v \in \{0, 1\})\), so that the frequencies \(\{i_{uvc}\}\) form a \(2 \times 2 \times C\) contingency table. The marginal frequencies are the row totals \(\{i_{u+c}\}\) and the column totals \(\{i_{+vc}\}\), and \(i_{++c}\) denotes the total sample size in the cth stratum. Then, the MH statistic is defined as

$$\begin{aligned} \text {MH} \chi ^2 = \displaystyle \frac{\left[ \sum _{c} i_{11c} - \sum _{c} E(i_{11c})\right] ^2}{\sum _{c} \text {var}(i_{11c})}, \end{aligned}$$
(1)

where \(E(i_{11c}) = i_{1+c} i_{+1c} / i_{++c}\) and \(\text {var} (i_{11c}) = i_{0+c} i_{1+c} i_{+0c} i_{+1c} / [i_{++c}^2 (i_{++c} - 1)]. \) Only strata with a total sample size \(i_{++c}\) greater than 1 are included, because otherwise the variance term is undefined. Under the null hypothesis of conditional independence of items j and \(j^\prime \), the MH statistic for cognitive diagnosis models has an approximate chi-square distribution with one degree of freedom if examinees’ true latent attribute vectors are used as the levels of the stratification variable C. Assuming that the odds ratio between j and \(j^\prime \) is constant across all strata, the null hypothesis of conditional independence is equivalent to a common odds ratio of one, which can be estimated by

$$\begin{aligned} \text {Odds Ratio}_{\text {MH}j,j^\prime } = \displaystyle \frac{1}{C} \sum _{c = 1}^C \text {or}_{j,j^\prime c}, \end{aligned}$$
(2)

where \(\text {or}_{j,j^\prime c} = (i_{11c} i_{00c}) /(i_{10c} i_{01c}).\)
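To make this computation concrete, below is a minimal R sketch (R being the software used for estimation in this study) that tests one item pair with base R’s mantelhaen.test(), which yields the statistic in Eq. (1) when the continuity correction is switched off. The objects resp (an I-by-J binary response matrix) and class_est (a length-I vector of proficiency-class labels), as well as the helper name mh_item_pair, are hypothetical and not taken from the original study.

```r
## Sketch: MH test of conditional independence for items j and jprime,
## stratified by (estimated) proficiency class.
mh_item_pair <- function(resp, class_est, j, jprime) {
  ## 2 x 2 x C contingency table of responses to the two items per class
  tab <- table(factor(resp[, j], levels = 0:1),
               factor(resp[, jprime], levels = 0:1),
               class_est)
  ## Drop sparse strata; mantelhaen.test() needs a stratum total of at least 2
  tab <- tab[, , apply(tab, 3, sum) > 1, drop = FALSE]
  ## Uncorrected MH chi-square, matching Eq. (1)
  mantelhaen.test(tab, correct = FALSE)
}
```

For example, mh_item_pair(resp, class_est, 1, 2)$p.value would return the p-value for the first two items.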

3 Simulation Studies

The finite test-length and sample-size properties of \(\text {MH} \chi ^2\) were investigated in simulation studies. For each condition, item response data for sample sizes I = 500 or 2000 were drawn from a discretized multivariate normal distribution \(\text {MVN}(\mathbf {0}_K, \varvec{\Sigma }),\) where the covariance matrix \(\varvec{\Sigma }\) has unit variances and a common correlation of \(\rho = 0.3\) or 0.6. The K-dimensional continuous vectors \(\varvec{\theta }_i = (\theta _{i1}, \theta _{i2}, \ldots , \theta _{iK})'\) were dichotomized according to

$${\alpha _{ik}} = {\left\{ \begin{array}{ll} 1, &{} \text{ if } \; {\theta _{ik}} \ge {\varPhi ^{-1}}\!\left( \frac{k}{K+1}\right) ; \\ 0, &{} \text{ otherwise. } \end{array}\right. } $$

Test lengths of J = 20 or 40 were studied with attribute vectors of length K = 3 or 5. The correctly specified Q-matrix for J = 20 is presented in Table 1 (attributes marked with \(\star \) were used for the Q-matrix with K = 3; attributes marked with \(\star \star \) were used for Items 4 and 5). The Q-matrix for J = 40 was obtained by stacking two copies of this matrix.
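As an illustration, the attribute-generation step can be sketched in a few lines of R: MASS::mvrnorm() draws the continuous trait vectors, and the thresholds implement the discretization rule above. The function name sim_attributes is hypothetical.

```r
## Sketch: draw I attribute vectors from a discretized MVN(0, Sigma)
## with unit variances and common correlation rho, as described above.
library(MASS)
sim_attributes <- function(I, K, rho) {
  Sigma <- matrix(rho, K, K)
  diag(Sigma) <- 1
  theta <- mvrnorm(I, mu = rep(0, K), Sigma = Sigma)  # I x K continuous traits
  ## alpha_ik = 1 iff theta_ik >= Phi^{-1}(k / (K + 1))
  thresholds <- qnorm((1:K) / (K + 1))
  sweep(theta, 2, thresholds, ">=") * 1L              # I x K binary attributes
}
alpha <- sim_attributes(I = 500, K = 5, rho = 0.3)
```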

Data were generated from three different models: the DINA model, the additive cognitive diagnosis model (A-CDM), and a saturated model (i.e., the generalized DINA (G-DINA) model). For the DINA model, item parameters were drawn from Uniform(0, 0.3). For the A-CDM and the saturated model, following Chen et al. (2013), the parameters were restricted such that \(P(\alpha ^\star _{ij})_{\min } = 0.10\) and \(P(\alpha ^\star _{ij})_{\max } = 0.90\), where \(\alpha ^\star _{ij}\) is the reduced attribute vector whose components are the attributes required for the jth item (see de la Torre, 2011, for more details). Estimation was carried out in R (e.g., the CDM package; Robitzsch, Kiefer, George, & Uenlue, 2015), with model parameters estimated by maximizing the marginal likelihood.
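A fitting step along these lines might look as follows with the CDM package. This is a sketch under the assumption that the gdina() function’s rule argument ("DINA", "ACDM", or "GDINA") selects among the three models and that IRT.factor.scores() extracts estimated attribute patterns, as in the package’s documentation; the object names dat and Q are placeholders.

```r
## Sketch: marginal maximum likelihood estimation with the CDM package.
## dat (I x J binary responses) and Q (J x K Q-matrix) are assumed given.
library(CDM)
fit <- gdina(data = dat, q.matrix = Q, rule = "DINA")  # "ACDM" or "GDINA" for the other models
## Estimated attribute patterns (assumed API), collapsed to one class label
## per examinee for use as the stratification variable C
patt <- IRT.factor.scores(fit, type = "MLE")
class_est <- apply(patt, 1, paste, collapse = "")
```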

Table 1 Correctly specified Q (K = 5)

For each condition, a set of item response vectors was simulated in each of 100 replications. The proposed MH statistic, the chi-squared statistic \(x_{jj'}\) (Chen and Thissen, 1997), the absolute deviation of observed and predicted correlations \(r_{jj'}\) (Chen et al., 2013), and their corresponding p-values were computed for all \(J (J - 1) / 2\) item pairs in each replication.
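Putting the pieces together, the per-replication computation of the MH p-values over all item pairs reduces to a short loop, reusing the hypothetical mh_item_pair() helper sketched in Section 2.

```r
## Sketch: MH p-values for all J * (J - 1) / 2 item pairs in one replication.
J <- ncol(resp)
pairs <- t(combn(J, 2))                       # one row per item pair
p_mh <- apply(pairs, 1, function(p)
  mh_item_pair(resp, class_est, p[1], p[2])$p.value)
rej_rate <- mean(p_mh < 0.05)                 # proportion flagged at alpha = .05
```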

4 Results

Across the 100 replications of each condition, the proportion of times the p-value of each item pair fell below the significance level of 0.05 was recorded; these rejection rates are summarized in the tables below.

Type I Error Study In this simulation study, the correctly specified Q-matrices (K = 5 or K = 3) were used to fit the data in order to examine type I error rates. Table 2 shows that the type I error rates of the three statistics were mostly around the nominal significance level of 0.05. The chi-squared statistic \(x_{jj^\prime }\) was conservative, with type I error rates below 0.024. The type I error rates of the MH statistic were consistently close to the nominal level under all conditions with J = 40, in line with its asymptotic behavior. In the condition with K = 5, J = 20, and I = 2000, the type I error rates of the MH test slightly exceeded the nominal rate in the A-CDM and the saturated model, presumably because of the difficulty of classifying examinees correctly.

Table 2 Type I error study
Table 3 Power study: 20% misspecified Q.

Power Study: 20% Misspecified Q-matrix For each replication, 20% of the \(q_{jk}\) entries of the correctly specified Q-matrices (K = 5 or K = 3) were randomly misspecified: over-specification occurs when q-entries of 0 are incorrectly coded as 1, and under-specification when q-entries of 1 are incorrectly coded as 0. Table 3 shows that the average rejection rates over all \(J (J-1) / 2 \) item pairs were relatively low for the MH test (i.e., 0.310 or below in the nonparametric model, 0.373 or below in the DINA model, 0.258 or below in the A-CDM, and 0.270 or below in the saturated model). When K = 5 and I = 500, the power rates were low (i.e., 0.074 or below) in the A-CDM and the saturated model; these are relatively complex models, and a small sample size likely increases the difficulty of estimating them accurately.

Power Study: Over-specified Q-matrix For each replication, a data set was generated with the Q-matrix (K = 3) that is embedded as a subset of the Q-matrix (K = 5) in Table 1. The data were then fitted with the Q-matrix (K = 5), thereby over-specifying the correctly specified Q-matrix (K = 3). Either one dimension (9 items in total) or two dimensions (4 items in total) were over-specified. The results were consistent with the findings of Chen et al. (2013).

As Table 4 shows, all statistics were insensitive to over-specified Q-matrices when the true model was the saturated model or the A-CDM. When the true model was the DINA model, the average power rates for item pairs in which both items were over-specified on the same dimension were 0.074 for the nonparametric model, 0.052 for MH, 0.181 for \(x_{jj^\prime }\), and 0.220 for \(r_{jj^\prime }\), and those for item pairs in which either item was over-specified were 0.058 for MH, 0.104 for \(x_{jj^\prime }\), and 0.137 for \(r_{jj^\prime }\). As Rupp et al. (2010) indicated, if more attributes are included in the Q-matrix than required, conditional independence may still be preserved, because the true attribute vector may be embedded as a subcomponent of the modeled vector, resulting in a model that is too complex but preserves conditional independence. This finding implies that, unlike the other statistics, the MH statistic is not appropriate for detecting over-specified Q-matrices when the true model is the DINA model.

Table 4 Power study: over-specified Q (true K = 3, fitted K = 5)

Power Study: Under-specified Q-matrix A data set was generated with the Q-matrix (K = 5) in Table 1, and in each replication the data were fitted with the embedded Q-matrix (K = 3). Either one dimension (9 items in total) or two dimensions (4 items in total) were under-specified. The average power rates for item pairs in which both items were under-specified on the same dimension were 0.572 for MH, 0.669 for \(x_{jj^\prime }\), and 0.735 for \(r_{jj^\prime }\), with power relatively consistent across all conditions, as shown in Table 5. The average rejection rates for item pairs in which either item was under-specified were 0.124 for MH, 0.144 for \(x_{jj^\prime }\), and 0.201 for \(r_{jj^\prime }\). For all statistics, the power rates increased slightly when J = 40, when I = 2000, or when the true model was the A-CDM. Taken together, these findings show that, like the other statistics, the MH test is sensitive to under-specification of the Q-matrix and has high power across conditions.

Table 5 Power study: under-specified Q with true K = 5
Table 6 Power study: model misspecification

Power Study: Model Misspecification In this simulation study, a correctly specified Q-matrix \((K = 3 \ \text {or} \ 5)\) was used, but with a misspecified cognitive diagnosis model. Consistent with Chen et al. (2013), none of the statistics detected the model misspecification in any condition when the fitted model was the saturated model and the true model was the DINA model or the A-CDM (i.e., 0.052 or below for MH, 0.024 or below for \(x_{jj'}\), and 0.059 or below for \(r_{jj'}\)); owing to limited space, this output is not included. The results in Table 6 show that the rejection rates of the MH statistic were low (i.e., 0.186 or below, with few exceptions, when the true model was the DINA model and the fitted model was the A-CDM, and 0.097 or below, with few exceptions, vice versa). When the true model was the A-CDM and the fitted model was the DINA model, the power rates were even lower because the DINA model is simpler than the A-CDM.

5 Discussion

The Mantel-Haenszel (MH) statistic proposed by Lim and Drasgow (2019) was evaluated for detecting misspecifications of the latent attribute space in parametric cognitive diagnosis models; that is, cases in which the Q-matrix contains too many or too few latent attributes. (Recall that a misspecified latent attribute space may result in inaccurate parameter estimates that, in turn, cause incorrect assessments of examinees’ ability.) The proposed MH statistic uses the different proficiency classes as the levels of the stratification variable, with examinees’ individual attribute vectors, which identify proficiency-class membership, estimated from the data. Simulation studies were conducted to investigate the diagnostic sensitivity of the MH statistic in terms of type I error rate and power under a variety of testing conditions. Across different sample sizes, test lengths, numbers of attributes defining the true attribute space, and levels of correlation between the attributes, the MH statistic consistently attained a type I error rate close to the nominal \(\alpha = 0.05\) level when the data were generated using the true Q-matrix based on the correctly specified latent attribute space. When the data were generated using a Q-matrix based on an under-specified latent attribute space, the MH statistic displayed moderate power in detecting the resulting conditional dependence among test items. In summary, the MH statistic appears to be a promising tool for uncovering possible misspecifications of the latent attribute space in cognitive diagnosis. Further research is needed to investigate the specific factors that affect the power of the MH statistic, especially when the latent attribute space has been over-specified (i.e., too many attributes have been included).