Computes the M2 statistic (Maydeu-Olivares & Joe, 2006) when all data are dichotomous, the collapsed M2* statistic (which collapses over univariate and bivariate response categories; see Cai and Hansen, 2013), and the hybrid C2 statistic, which collapses only the bivariate moments (Cai and Monro, 2014). The C2 variant is mainly useful when polytomous response models do not have sufficient degrees of freedom to compute M2*. This function also computes associated fit indices that are based on fitting the null model. Single- and multiple-group models are supported. If the latent trait density was approximated (e.g., Davidian curves, empirical histograms, etc.), then passing use_dentype_estimate = TRUE will use the internally saved quadrature and density components (where applicable).

M2(
  obj,
  type = "M2*",
  calcNull = TRUE,
  na.rm = FALSE,
  quadpts = NULL,
  theta_lim = c(-6, 6),
  CI = 0.9,
  residmat = FALSE,
  QMC = FALSE,
  suppress = 1,
  ...
)

Arguments

obj

an estimated model object from the mirt package

type

type of fit statistic to compute. Options are "M2" (currently limited to dichotomous response data only), "M2*" for the univariate and bivariate collapsed version of the M2 statistic, and "C2" for a hybrid between M2 and M2* where only the bivariate moments are collapsed

calcNull

logical; calculate statistics for the null model as well? Allows for statistics such as the limited information TLI and CFI. Only valid when items all have a suitable null model (e.g., those created via createItem will not)

na.rm

logical; remove rows with any missing values? The M2 family of statistics requires a complete dataset in order to be well defined

quadpts

number of quadrature points to use during estimation. If NULL, a suitable value will be chosen based on the rubric found in fscores

theta_lim

lower and upper range to evaluate latent trait integral for each dimension

CI

numeric value between 0 and 1 giving the confidence level of the interval for RMSEA. The default (0.9) returns the 90% interval

residmat

logical; return the residual matrix used to compute the SRMSR statistic? Only the lower triangle of the residual correlation matrix will be returned (the upper triangle is filled with NA's)

QMC

logical; use quasi-Monte Carlo integration? Useful for higher dimensional models. If quadpts is not specified, 5000 nodes are used by default

suppress

a numeric value indicating which parameter residual dependency combinations to flag as being too high. Standardized residuals whose absolute value exceeds this cutoff are returned, while all smaller values are set to NA. Must be used in conjunction with the argument residmat = TRUE

...

additional arguments to pass
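
The interplay between residmat and suppress can be pictured with a small base R sketch. It does not call mirt: it copies the lower-triangle residual correlations printed in the Examples section and applies the rule as documented above (entries whose absolute value falls below the cutoff become NA). The helper name mask_small_residuals and the 0.05 cutoff are illustrative assumptions, not part of the package.

```r
# Hypothetical helper (not part of mirt): mimic the documented `suppress` rule
# by blanking every residual whose absolute value is below the cutoff.
mask_small_residuals <- function(resid, cutoff) {
  out <- resid
  out[!is.na(out) & abs(out) < cutoff] <- NA
  out
}

# Residual correlations copied from the M2(mod1, residmat = TRUE) output.
items <- paste0("Item.", 1:5)
resids <- matrix(NA_real_, 5, 5, dimnames = list(items, items))
resids[lower.tri(resids)] <- c(
  -0.02212010, -0.03265942, 0.05155184, 0.05443241,  # column Item.1
   0.03335601, -0.01642572, -0.03867705,             # column Item.2
  -0.012476892, -0.001860395,                         # column Item.3
   7.235426e-05                                       # column Item.4
)

# Keep only item pairs whose |residual| reaches the (assumed) 0.05 cutoff.
flagged <- mask_small_residuals(resids, cutoff = 0.05)
which(!is.na(flagged), arr.ind = TRUE)
```

In this sketch only the Item.4/Item.1 and Item.5/Item.1 pairs survive the cutoff. With the default suppress = 1, residuals of the size shown above would all be kept, since none comes close to 1 in absolute value.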

Value

Returns a data.frame object with the M2-type statistic, along with the degrees of freedom, p-value, RMSEA (with a confidence interval whose level is set by CI; 90% by default), SRMSR for each group (if all items were ordinal), and optionally the TLI and CFI model fit statistics if calcNull = TRUE.
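
As a rough sanity check on how the returned columns relate to one another, the base R sketch below recomputes the RMSEA point estimate and the SRMSR from numbers printed in the Examples section. The formulas are assumptions chosen because they reproduce the printed output (RMSEA as sqrt(max(stat - df, 0) / (df * (N - 1))), SRMSR as the root mean square of the lower-triangle residual correlations); the helper names are hypothetical and nothing here calls mirt itself.

```r
# Hypothetical helpers (not exported by mirt) for checking printed output.
rmsea_point <- function(stat, df, N) {
  # Point estimate only; the confidence limits are not reproduced here.
  sqrt(max(stat - df, 0) / (df * (N - 1)))
}
srmsr_from_lower <- function(r) {
  # r: vector of lower-triangle residual correlations
  sqrt(mean(r^2))
}

# M2, df and N copied from the M2(mod2, na.rm = TRUE) example
# (sample size after row-wise removal: 776); compare with the printed RMSEA.
rmsea_mod2 <- rmsea_point(11.43379, df = 5, N = 776)
rmsea_mod2

# Lower-triangle residuals copied from the M2(mod1, residmat = TRUE) example;
# compare with the SRMSR printed by M2(mod1).
r_mod1 <- c(-0.02212010, -0.03265942, 0.05155184, 0.05443241,
             0.03335601, -0.01642572, -0.03867705,
            -0.012476892, -0.001860395, 7.235426e-05)
srmsr_mod1 <- srmsr_from_lower(r_mod1)
srmsr_mod1
```

Both values agree with the printed RMSEA (0.0407472) and SRMSR (0.03195919) to the displayed precision, which supports the assumed formulas without confirming that they are the package's internal code.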

References

Cai, L. & Hansen, M. (2013). Limited-information goodness-of-fit testing of hierarchical item factor models. British Journal of Mathematical and Statistical Psychology, 66, 245-276.

Cai, L. & Monro, S. (2014). A new statistic for evaluating item response theory models for ordinal data. National Center for Research on Evaluation, Standards, & Student Testing. Technical Report.

Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06

Maydeu-Olivares, A. & Joe, H. (2006). Limited information goodness-of-fit testing in multidimensional contingency tables. Psychometrika, 71, 713-732.

Author

Phil Chalmers rphilip.chalmers@gmail.com

Examples

# \donttest{
dat <- as.matrix(expand.table(LSAT7))
(mod1 <- mirt(dat, 1))
#> 
#> Call:
#> mirt(data = dat, model = 1)
#> 
#> Full-information item factor analysis with 1 factor(s).
#> Converged within 1e-04 tolerance after 28 EM iterations.
#> mirt version: 1.40 
#> M-step optimizer: BFGS 
#> EM acceleration: Ramsay 
#> Number of rectangular quadrature: 61
#> Latent density type: Gaussian 
#> 
#> Log-likelihood = -2658.805
#> Estimated parameters: 10 
#> AIC = 5337.61
#> BIC = 5386.688; SABIC = 5354.927
#> G2 (21) = 31.7, p = 0.0628
#> RMSEA = 0.023, CFI = NaN, TLI = NaN
M2(mod1)
#>             M2 df          p     RMSEA     RMSEA_5   RMSEA_95      SRMSR
#> stats 11.93769  5 0.03565165 0.0372683 0.008950922 0.06496573 0.03195919
#>             TLI       CFI
#> stats 0.9369332 0.9684666
resids <- M2(mod1, residmat = TRUE) # lower triangle of residual correlation matrix
resids
#>             Item.1      Item.2       Item.3       Item.4 Item.5
#> Item.1          NA          NA           NA           NA     NA
#> Item.2 -0.02212010          NA           NA           NA     NA
#> Item.3 -0.03265942  0.03335601           NA           NA     NA
#> Item.4  0.05155184 -0.01642572 -0.012476892           NA     NA
#> Item.5  0.05443241 -0.03867705 -0.001860395 7.235426e-05     NA
summary(resids[lower.tri(resids)])
#>      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
#> -0.038677 -0.020697 -0.007169  0.001519  0.025035  0.054432 

# M2 with missing data present
dat[sample(1:prod(dim(dat)), 250)] <- NA
mod2 <- mirt(dat, 1)
# Compute stats by removing missing data row-wise
M2(mod2, na.rm = TRUE)
#> Sample size after row-wise response data removal: 776
#>             M2 df         p     RMSEA     RMSEA_5   RMSEA_95      SRMSR
#> stats 11.43379  5 0.0434261 0.0407472 0.006534177 0.07239755 0.03644538
#>             TLI       CFI
#> stats 0.9177292 0.9588646

# C2 statistic (useful when polytomous IRT models have too few df)
pmod <- mirt(Science, 1)
# This fails with too few df:
# M2(pmod)
# This, however, works:
M2(pmod, type = 'C2')
#>             M2 df           p     RMSEA    RMSEA_5  RMSEA_95      SRMSR
#> stats 19.17929  2 6.84337e-05 0.1482174 0.09234204 0.2116368 0.07257313
#>             TLI       CFI
#> stats 0.7300952 0.9100317

# }