Compute the M2 model fit statistic

Computes the M2 statistic (Maydeu-Olivares & Joe, 2006) when all data are dichotomous, the collapsed M2* statistic (which collapses over univariate and bivariate response categories; see Cai and Hansen, 2013), and the hybrid C2 statistic, which collapses only the bivariate moments (Cai and Monro, 2014). The C2 variant is mainly useful when polytomous response models do not have sufficient degrees of freedom to compute M2*. This function also computes associated fit indices that are based on fitting the null model. Single and multiple-group models are supported. If the latent trait density was approximated (e.g., Davidian curves, empirical histograms, etc.), then passing use_dentype_estimate = TRUE will use the internally saved quadrature and density components (where applicable).

Usage

M2(
  obj,
  type = "M2*",
  calcNull = TRUE,
  quadpts = NULL,
  theta_lim = c(-6, 6),
  CI = 0.9,
  residmat = FALSE,
  QMC = FALSE,
  suppress = 1,
  ...
)

Arguments

obj: an estimated model object from the mirt package
type: type of fit statistic to compute. Options are "M2" (currently limited to dichotomous response data only), "M2*" for the version of the M2 statistic collapsed over univariate and bivariate response categories, and "C2" for a hybrid between M2 and M2* where only the bivariate moments are collapsed
calcNull: logical; calculate statistics for the null model as well? Allows for statistics such as the limited information TLI and CFI. Only valid when items all have a suitable null model (e.g., those created via createItem will not)
quadpts: number of quadrature points to use during estimation. If NULL, a suitable value will be chosen based on the rubric found in fscores
theta_lim: lower and upper bounds over which the latent trait integral is evaluated for each dimension
CI: numeric value from 0 to 1 indicating the range of the confidence interval for RMSEA. Default returns the 90% interval
residmat: logical; return the residual matrix used to compute the SRMSR statistic? Only the lower triangle of the residual correlation matrix will be returned (the upper triangle is filled with NAs)
QMC: logical; use quasi-Monte Carlo integration? Useful for higher dimensional models. If quadpts is not specified, 5000 nodes are used by default
suppress: a numeric value indicating which residual dependency combinations to flag as being too high. Standardized residuals whose absolute value is greater than this value will be returned, while all values less than this value will be set to NA. Must be used in conjunction with the argument residmat = TRUE
...: additional arguments to pass
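
As a rough illustration of how these arguments combine, the sketch below fits the one-factor LSAT7 model from the Examples section and varies a few options. It assumes the mirt package is installed; the object names mod and fit_nonull are placeholders chosen for this sketch, and no output is shown because it was not run for this page.

```r
# Sketch only: assumes the mirt package is installed and uses its bundled LSAT7 data.
library(mirt)

dat <- expand.table(LSAT7)
mod <- mirt(dat, 1, verbose = FALSE)   # one-factor model, estimation messages silenced

# Request a 95% rather than the default 90% confidence interval for RMSEA
M2(mod, CI = 0.95)

# Skip fitting the null model; TLI and CFI are then not part of the result
fit_nonull <- M2(mod, calcNull = FALSE)
fit_nonull

# Ask for the uncollapsed M2 explicitly (possible here because LSAT7 is dichotomous)
M2(mod, type = "M2")
```

Skipping the null model with calcNull = FALSE is mainly a time saver when only the M2-type statistic, RMSEA, and SRMSR are of interest.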

Value

Returns a data.frame object with the M2-type statistic, along with the degrees of freedom, p-value, RMSEA (with its confidence interval, 90% by default; see the CI argument), SRMSR for each group, and optionally the TLI and CFI model fit statistics if calcNull = TRUE.
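
To make the relationship between the returned columns more concrete, the following base-R sketch recomputes three of them from numbers printed in the Examples section (the M2(mod1) output and the residual matrix). The formulas are assumptions inferred from that output rather than taken from the package source: an upper-tail chi-square p-value, an RMSEA point estimate scaled by N - 1, and SRMSR as the root mean square of the lower-triangle residual correlations. N = 1000 is the LSAT7 sample size, and all variable names are illustrative.

```r
# Values copied from the M2(mod1) output in the Examples section
M2_stat <- 11.93769
df <- 5
N <- 1000   # number of respondents in LSAT7

# p-value: upper tail of a chi-square distribution with df degrees of freedom
p <- pchisq(M2_stat, df, lower.tail = FALSE)

# RMSEA point estimate; the N - 1 scaling is an assumption inferred from the printed value
rmsea <- sqrt(max(M2_stat - df, 0) / (df * (N - 1)))

# Lower-triangle residual correlations copied from the residmat = TRUE output
resid_lower <- c(-0.02212010, -0.03265942, 0.05155184, 0.05443241,
                 0.03335601, -0.01642572, -0.03867705,
                 -0.012476892, -0.001860395, 7.235425e-05)

# SRMSR: root mean square of the residual correlations
srmsr <- sqrt(mean(resid_lower^2))

round(c(p = p, RMSEA = rmsea, SRMSR = srmsr), 4)
```

Under these assumptions the three values agree with the p, RMSEA, and SRMSR columns printed for mod1 to about four decimal places.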

References

Cai, L. & Hansen, M. (2013). Limited-information goodness-of-fit testing of hierarchical item factor models. British Journal of Mathematical and Statistical Psychology, 66, 245-276.

Cai, L. & Monro, S. (2014). A new statistic for evaluating item response theory models for ordinal data. National Center for Research on Evaluation, Standards, & Student Testing. Technical Report.

Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06

Maydeu-Olivares, A. & Joe, H. (2006). Limited information goodness-of-fit testing in multidimensional contingency tables. Psychometrika, 71, 713-732.

Author

Phil Chalmers rphilip.chalmers@gmail.com

Examples

# \donttest{
library(mirt)
dat <- as.matrix(expand.table(LSAT7))
(mod1 <- mirt(dat, 1))
#> 
#> Call:
#> mirt(data = dat, model = 1)
#> 
#> Full-information item factor analysis with 1 factor(s).
#> Converged within 1e-04 tolerance after 28 EM iterations.
#> mirt version: 1.43 
#> M-step optimizer: BFGS 
#> EM acceleration: Ramsay 
#> Number of rectangular quadrature: 61
#> Latent density type: Gaussian 
#> 
#> Log-likelihood = -2658.805
#> Estimated parameters: 10 
#> AIC = 5337.61
#> BIC = 5386.688; SABIC = 5354.927
#> G2 (21) = 31.7, p = 0.0628
#> RMSEA = 0.023, CFI = NaN, TLI = NaN
M2(mod1)
#>             M2 df          p     RMSEA     RMSEA_5   RMSEA_95      SRMSR
#> stats 11.93769  5 0.03565165 0.0372683 0.008950922 0.06496573 0.03195919
#>             TLI       CFI
#> stats 0.9369332 0.9684666
resids <- M2(mod1, residmat = TRUE) # lower triangle of residual correlation matrix
resids
#>             Item.1      Item.2       Item.3       Item.4 Item.5
#> Item.1          NA          NA           NA           NA     NA
#> Item.2 -0.02212010          NA           NA           NA     NA
#> Item.3 -0.03265942  0.03335601           NA           NA     NA
#> Item.4  0.05155184 -0.01642572 -0.012476892           NA     NA
#> Item.5  0.05443241 -0.03867705 -0.001860395 7.235425e-05     NA
summary(resids[lower.tri(resids)])
#>      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
#> -0.038677 -0.020697 -0.007169  0.001519  0.025035  0.054432 

# M2 with missing data present
dat[sample(1:prod(dim(dat)), 250)] <- NA
mod2 <- mirt(dat, 1)
M2(mod2)
#>             M2 df          p      RMSEA     RMSEA_5   RMSEA_95      SRMSR
#> stats 11.13191  5 0.04882643 0.03503726 0.002358469 0.06304675 0.03247527
#>             TLI       CFI
#> stats 0.9350322 0.9675161

# C2 statistic (useful when polytomous IRT models have too few df)
pmod <- mirt(Science, 1)
# This fails with too few df:
# M2(pmod)
# This, however, works:
M2(pmod, type = 'C2')
#>             M2 df           p     RMSEA    RMSEA_5  RMSEA_95      SRMSR
#> stats 19.17929  2 6.84337e-05 0.1482174 0.09234204 0.2116368 0.07257313
#>             TLI       CFI
#> stats 0.7300952 0.9100317

# }