Computes the M2 (Maydeu-Olivares & Joe, 2006) statistic when all data are dichotomous,
the collapsed M2* statistic (collapsing over univariate and bivariate response categories;
see Cai and Hansen, 2013), and the hybrid C2 statistic, which collapses only the bivariate
moments (Cai and Monro, 2014). The C2 variant is mainly useful when polytomous response models
do not have sufficient degrees of freedom to compute M2*. This function
also computes associated fit indices that are based on
fitting the null model. Supports single and multiple-group models.
If the latent trait density was approximated (e.g., Davidian curves, empirical histograms, etc.),
then passing use_dentype_estimate = TRUE will use the internally saved quadrature and
density components (where applicable).
M2(
  obj,
  type = "M2*",
  calcNull = TRUE,
  na.rm = FALSE,
  quadpts = NULL,
  theta_lim = c(-6, 6),
  CI = 0.9,
  residmat = FALSE,
  QMC = FALSE,
  suppress = 1,
  ...
)
obj
an estimated model object from the mirt package
type
type of fit statistic to compute. Options are "M2" (currently limited to dichotomous response data), "M2*" for the version of the M2 statistic that collapses over the univariate and bivariate response categories, and "C2" for a hybrid between M2 and M2* where only the bivariate moments are collapsed
calcNull
logical; calculate statistics for the null model as well?
Allows for statistics such as the limited information TLI and CFI. Only valid when all items
have a suitable null model (e.g., those created via createItem will not)
na.rm
logical; remove rows with any missing values? The M2 family of statistics requires a complete dataset in order to be well defined
quadpts
number of quadrature points to use during estimation. If NULL, a suitable value will be chosen
based on the rubric found in fscores
theta_lim
lower and upper range to evaluate latent trait integral for each dimension
CI
numeric value from 0 to 1 indicating the range of the confidence interval for RMSEA. Default returns the 90% interval
residmat
logical; return the residual matrix used to compute the SRMSR statistic? Only the lower triangle of the residual correlation matrix will be returned (the upper triangle is filled with NA's)
QMC
logical; use quasi-Monte Carlo integration? Useful for higher dimensional models.
If quadpts is not specified, 5000 nodes are used by default
suppress
a numeric value indicating which residual dependency combinations to flag as being too high.
Standardized residuals whose absolute value is greater than this value will be returned,
while all values less than this value will be set to NA.
Must be used in conjunction with the argument residmat = TRUE
...
additional arguments to pass
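The keep-or-NA rule described for suppress can be pictured with base R alone. The sketch below is an illustration under stated assumptions, not mirt's internal code: the matrix resids is copied by hand (rounded to four decimals) from the residual matrix printed in the examples for this function, the cutoff threshold of 0.05 is a hypothetical value chosen only so that a few entries survive, and treating values exactly equal to the cutoff as NA is likewise an assumption.

```r
# Hand-copied lower triangle of the residual matrix shown in the examples
# (rounded to 4 decimals); illustration data only, no call into mirt.
items <- paste0("Item.", 1:5)
resids <- matrix(NA_real_, 5, 5, dimnames = list(items, items))
resids[lower.tri(resids)] <- c(
  -0.0221, -0.0327,  0.0516,  0.0544,  # column Item.1, rows Item.2 to Item.5
   0.0334, -0.0164, -0.0387,           # column Item.2, rows Item.3 to Item.5
  -0.0125, -0.0019,                    # column Item.3, rows Item.4 to Item.5
   0.0001                              # column Item.4, row Item.5
)

threshold <- 0.05  # hypothetical cutoff, not the function default

# Keep entries whose absolute value exceeds the cutoff; set the rest to NA.
# (Assumption: values equal to the cutoff are also set to NA.)
flagged <- resids
flagged[!is.na(flagged) & abs(flagged) <= threshold] <- NA
flagged
```

With this cutoff only the Item.4/Item.1 and Item.5/Item.1 entries remain, which is the kind of short list the argument is meant to produce.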
Returns a data.frame object with the M2-type statistic, along with the degrees of freedom,
p-value, RMSEA (with a 90% confidence interval by default; see CI), SRMSR for each group (if all items were ordinal),
and optionally the TLI and CFI model fit statistics if calcNull = TRUE.
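As a rough check on how the reported columns relate to one another, the p-value and the RMSEA point estimate can be recomputed from the statistic, its degrees of freedom, and the sample size. The sketch below uses the values printed for the na.rm = TRUE example (M2 = 11.43379, df = 5, N = 776) and assumes the common RMSEA point-estimate formula sqrt(max(M2 - df, 0) / (df * (N - 1))). That formula is an assumption here rather than a statement of mirt's internal implementation, although it agrees with the printed output; the variable names are illustrative only.

```r
# Values copied from the printed output of the na.rm = TRUE example
m2 <- 11.43379  # M2 statistic
df <- 5         # degrees of freedom
n  <- 776       # sample size after row-wise removal

# Upper-tail chi-square probability for the statistic
p_value <- pchisq(m2, df = df, lower.tail = FALSE)

# Assumed RMSEA point estimate; truncated at zero when M2 < df
rmsea <- sqrt(max(m2 - df, 0) / (df * (n - 1)))

round(c(p = p_value, RMSEA = rmsea), 4)
```

The confidence limits (RMSEA_5, RMSEA_95) and the TLI and CFI are not reproduced here, since they need quantities that the printed table does not expose.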
Cai, L. & Hansen, M. (2013). Limited-information goodness-of-fit testing of hierarchical item factor models. British Journal of Mathematical and Statistical Psychology, 66, 245-276.
Cai, L. & Monro, S. (2014). A new statistic for evaluating item response theory models for ordinal data. National Center for Research on Evaluation, Standards, & Student Testing. Technical Report.
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Maydeu-Olivares, A. & Joe, H. (2006). Limited information goodness-of-fit testing in multidimensional contingency tables. Psychometrika, 71, 713-732.
# \donttest{
dat <- as.matrix(expand.table(LSAT7))
(mod1 <- mirt(dat, 1))
#>
#> Call:
#> mirt(data = dat, model = 1)
#>
#> Full-information item factor analysis with 1 factor(s).
#> Converged within 1e-04 tolerance after 28 EM iterations.
#> mirt version: 1.40
#> M-step optimizer: BFGS
#> EM acceleration: Ramsay
#> Number of rectangular quadrature: 61
#> Latent density type: Gaussian
#>
#> Log-likelihood = -2658.805
#> Estimated parameters: 10
#> AIC = 5337.61
#> BIC = 5386.688; SABIC = 5354.927
#> G2 (21) = 31.7, p = 0.0628
#> RMSEA = 0.023, CFI = NaN, TLI = NaN
M2(mod1)
#> M2 df p RMSEA RMSEA_5 RMSEA_95 SRMSR
#> stats 11.93769 5 0.03565165 0.0372683 0.008950922 0.06496573 0.03195919
#> TLI CFI
#> stats 0.9369332 0.9684666
resids <- M2(mod1, residmat = TRUE) # lower triangle of residual correlation matrix
resids
#> Item.1 Item.2 Item.3 Item.4 Item.5
#> Item.1 NA NA NA NA NA
#> Item.2 -0.02212010 NA NA NA NA
#> Item.3 -0.03265942 0.03335601 NA NA NA
#> Item.4 0.05155184 -0.01642572 -0.012476892 NA NA
#> Item.5 0.05443241 -0.03867705 -0.001860395 7.235426e-05 NA
summary(resids[lower.tri(resids)])
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> -0.038677 -0.020697 -0.007169 0.001519 0.025035 0.054432
# M2 with missing data present
dat[sample(1:prod(dim(dat)), 250)] <- NA
mod2 <- mirt(dat, 1)
# Compute stats by removing missing data row-wise
M2(mod2, na.rm = TRUE)
#> Sample size after row-wise response data removal: 776
#> M2 df p RMSEA RMSEA_5 RMSEA_95 SRMSR
#> stats 11.43379 5 0.0434261 0.0407472 0.006534177 0.07239755 0.03644538
#> TLI CFI
#> stats 0.9177292 0.9588646
# C2 statistic (useful when polytomous IRT models have too few df)
pmod <- mirt(Science, 1)
# This fails with too few df:
# M2(pmod)
# This, however, works:
M2(pmod, type = 'C2')
#> M2 df p RMSEA RMSEA_5 RMSEA_95 SRMSR
#> stats 19.17929 2 6.84337e-05 0.1482174 0.09234204 0.2116368 0.07257313
#> TLI CFI
#> stats 0.7300952 0.9100317
# }
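The row-wise removal reported by na.rm = TRUE amounts to keeping only complete response patterns. The following base R illustration shows that idea on a hypothetical toy matrix; toy and complete_rows are made-up names, and the snippet does not claim to mirror how mirt performs the removal internally.

```r
# Hypothetical 4 x 3 response matrix with two incomplete rows
toy <- matrix(c(1,  0, NA,
                0,  1,  1,
                1, NA,  0,
                1,  1,  1),
              ncol = 3, byrow = TRUE,
              dimnames = list(NULL, paste0("Item.", 1:3)))

# Keep only rows without any missing responses
complete_rows <- toy[complete.cases(toy), , drop = FALSE]

nrow(complete_rows)  # number of rows retained
```

Because every incomplete row is dropped, even a modest amount of scattered missingness can shrink the sample considerably, as the drop to 776 rows in the example above shows.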