mirt {mirt} | R Documentation |
mirt fits a maximum likelihood (or maximum a posteriori) factor analysis model to any mixture of dichotomous and polytomous data under the item response theory paradigm using either Cai's (2010) Metropolis-Hastings Robbins-Monro (MHRM) algorithm, an EM algorithm approach outlined by Bock and Aitkin (1981) using rectangular or quasi-Monte Carlo integration grids, or the stochastic EM (i.e., the first two stages of the MH-RM algorithm). Models containing 'explanatory' person or item level predictors can only be included by using the mixedmirt function, though latent regression models can be fit using the formula input in this function. Tests that form a two-tier or bi-factor structure should be estimated with the bfactor function, which uses a dimension reduction EM algorithm for modeling item parcels. Multiple group analyses (useful for DIF and DTF testing) are also available using the multipleGroup function.
mirt(
data,
model = 1,
itemtype = NULL,
guess = 0,
upper = 1,
SE = FALSE,
covdata = NULL,
formula = NULL,
SE.type = "Oakes",
method = "EM",
optimizer = NULL,
dentype = "Gaussian",
pars = NULL,
constrain = NULL,
calcNull = FALSE,
draws = 5000,
survey.weights = NULL,
quadpts = NULL,
TOL = NULL,
gpcm_mats = list(),
grsm.block = NULL,
rsm.block = NULL,
monopoly.k = 1L,
key = NULL,
large = FALSE,
GenRandomPars = FALSE,
accelerate = "Ramsay",
verbose = TRUE,
solnp_args = list(),
nloptr_args = list(),
spline_args = list(),
control = list(),
technical = list(),
...
)
data |
a |
model |
a string to be passed (or an object returned from) |
itemtype |
type of items to be modeled, declared as a vector for each item or a single value
which will be recycled for each item. The
Additionally, user defined item classes can also be defined using the |
guess |
fixed pseudo-guessing parameters. Can be entered as a single value to assign a global guessing parameter or may be entered as a numeric vector corresponding to each item |
upper |
fixed upper bound parameters for 4-PL model. Can be entered as a single value to assign a global upper bound parameter or may be entered as a numeric vector corresponding to each item |
SE |
logical; estimate the standard errors by computing the parameter information matrix?
See |
covdata |
a data.frame of data used for latent regression models |
formula |
an R formula (or list of formulas) indicating how the latent traits
can be regressed using external covariates in |
SE.type |
type of estimation method to use for calculating the parameter information matrix
for computing standard errors and
Note that both the |
method |
a character object specifying the estimation algorithm to be used. The default is
The |
optimizer |
a character indicating which numerical optimizer to use. By default, the EM
algorithm will use the Other options include the Newton-Raphson ( Additionally, estimation subroutines from the |
dentype |
type of density form to use for the latent trait parameters. Current options include
Note that when |
pars |
a data.frame with the structure of how the starting values, parameter numbers,
estimation logical values, etc, are defined. The user may observe how the model defines the
values by using |
constrain |
a list of user declared equality constraints. To see how to define the
parameters correctly use |
calcNull |
logical; calculate the Null model for additional fit statistics (e.g., TLI)? Only applicable if the data contains no NA's and the data is not overly sparse |
draws |
the number of Monte Carlo draws to estimate the log-likelihood for the MH-RM algorithm. Default is 5000 |
survey.weights |
an optional numeric vector of survey weights to apply for each case in the
data (EM estimation only). If not specified, all cases are weighted equally (the standard IRT
approach). The sum of the |
quadpts |
number of quadrature points per dimension (must be larger than 2).
By default the number of quadrature points uses the following scheme:
|
TOL |
convergence threshold for EM or MH-RM; defaults are .0001 and .001, respectively. If
|
gpcm_mats |
a list of matrices specifying how the scoring coefficients in the (generalized)
partial credit model should be constructed. If omitted, the standard gpcm format will be used
(i.e., |
grsm.block |
an optional numeric vector indicating where the blocking should occur when
using the grsm, NA represents items that do not belong to the grsm block (other items that may
be estimated in the test data). For example, to specify two blocks of 3 with a 2PL item for
the last item: |
rsm.block |
same as |
monopoly.k |
a vector of values (or a single value to be repeated for each item) which indicate
the degree of the monotone polynomial fitted, where the monotone polynomial
corresponds to |
key |
a numeric vector of the response scoring key. Required when using nested logit item
types, and must be the same length as the number of items used. Items that are not nested logit
will ignore this vector, so use |
large |
a Alternatively, if the collapsed table of frequencies is desired for the purpose of saving computations
(i.e., only computing the collapsed frequencies for the data one time) then a character vector can
be passed with the argument
|
GenRandomPars |
logical; generate random starting values prior to optimization instead of using the fixed internal starting values? |
accelerate |
a character vector indicating the type of acceleration to use. Default
is |
verbose |
logical; print observed- (EM) or complete-data (MHRM) log-likelihood after each iteration cycle? Default is TRUE |
solnp_args |
a list of arguments to be passed to the |
nloptr_args |
a list of arguments to be passed to the |
spline_args |
a named list of lists containing information to be passed to the
This code input changes the |
control |
a list passed to the respective optimizers (i.e., |
technical |
a list containing lower level technical parameters for estimation. May be:
|
... |
additional arguments to be passed |
function returns an object of class SingleGroupClass
(SingleGroupClass-class)
Specification of the confirmatory item factor analysis model follows many of
the rules in the structural equation modeling framework for confirmatory factor analysis. The
variances of the latent factors are automatically fixed to 1 to help
facilitate model identification. All parameters may be fixed to constant
values or set equal to other parameters using the appropriate declarations.
Confirmatory models may also contain 'explanatory' person or item level predictors, though
including predictors is currently limited to the mixedmirt
function.
When specifying a single number greater than 1 as the model input to mirt an exploratory IRT model will be estimated. Rotation and target matrix options are available if they are passed to generic functions such as summary-method and fscores. Factor means and variances are fixed to ensure proper identification.
If the model is an exploratory item factor analysis, estimation will begin by computing a matrix of quasi-polychoric correlations. A factor analysis with nfact factors is then extracted and item parameters are estimated by a_{ij} = f_{ij}/u_j, where f_{ij} is the factor loading for the jth item on the ith factor, and u_j is the square root of the factor uniqueness, \sqrt{1 - h_j^2}. The initial intercept parameters are determined by calculating the inverse normal of the item facility (i.e., item easiness), q_j, to obtain d_j = q_j / u_j. A similar implementation is also used for obtaining initial values for polytomous items.
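The starting-value formulas above can be sketched in a few lines of base R. This is only an illustration of the arithmetic, not the package's internal routine; the loadings and item facilities below are hypothetical values chosen for the example.

```r
# Hypothetical one-factor loadings and item facilities (illustrative values only)
f <- c(0.6, 0.8)      # factor loadings f_j
p <- c(0.5, 0.8)      # proportion endorsing each item (item facility)

h2 <- f^2             # communalities for a one-factor solution
u  <- sqrt(1 - h2)    # square root of the uniqueness, u_j = sqrt(1 - h_j^2)
a  <- f / u           # starting slopes, a_j = f_j / u_j
q  <- qnorm(p)        # inverse normal of the item facility, q_j
d  <- q / u           # starting intercepts, d_j = q_j / u_j

round(cbind(a = a, d = d), 3)
```

Items with larger loadings have smaller uniqueness, so both the slope and the intercept are scaled up for those items.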
Internally the g and u parameters are transformed using a logit transformation (log(x/(1-x))), and can be reversed by using 1 / (1 + exp(-x)) following convergence. This also applies when computing confidence intervals for these parameters, and is done automatically when coef(mod, rawug = FALSE) is used. As such, when applying prior distributions to these parameters it is recommended to use a prior that ranges from negative infinity to positive infinity, such as the normally distributed prior via the 'norm' input (see mirt.model).
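As a small illustration of the logit transformation and its inverse, the following base R sketch converts a guessing value to the logit scale and back. The variable names are hypothetical; qlogis() and plogis() are the base R equivalents of the two formulas given above.

```r
g <- 0.182                            # guessing value on the probability scale

g_logit <- log(g / (1 - g))           # logit transform; same as qlogis(g)
g_back  <- 1 / (1 + exp(-g_logit))    # inverse transform; same as plogis(g_logit)

# A normal prior on the logit scale is centred by its mean on that scale:
prior_mean <- -1.5
plogis(prior_mean)                    # corresponding value on the probability scale
```

This is why a normal prior mean of -1.5 on the logit scale corresponds to a guessing value of roughly .18.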
Unrestricted full-information factor analysis is known to have problems with convergence, and some items may need to be constrained or removed entirely to allow for an acceptable solution. As a general rule dichotomous items with means greater than .95, or items that are only .05 greater than the guessing parameter, should be considered for removal from the analysis or treated with prior parameter distributions. The same type of reasoning is applicable when including upper bound parameters as well. For polytomous items, if categories are rarely endorsed then this will cause similar issues. Also, increasing the number of quadrature points per dimension, or using the quasi-Monte Carlo integration method, may help to stabilize the estimation process in higher dimensions. Finally, solutions that are not well defined also will have difficulty converging, and can indicate that the model has been misspecified (e.g., extracting too many dimensions).
For the MH-RM algorithm, when the number of iterations grows very high (e.g., greater than 1500) or when Max Change = .2500 values are repeatedly printed to the console (indicating that the parameters were being constrained since they are naturally moving in steps greater than 0.25) then the model may either be ill defined or have a very flat likelihood surface, and genuine maximum-likelihood parameter estimates may be difficult to find. Standard errors are computed following the model convergence by passing SE = TRUE, to perform an additional MH-RM stage but treating the maximum-likelihood estimates as fixed points.
Additional functions are available in the package which can be useful pre- and post-estimation. These are:
mirt.model
Define the IRT model specification using special syntax. Useful for defining between and within group parameter constraints, prior parameter distributions, and specifying the slope coefficients for each factor
coef-method
Extract raw coefficients from the model, along with their standard errors and confidence intervals
summary-method
Extract standardized loadings from model. Accepts a rotate argument for exploratory item response models
anova-method
Compare nested models using likelihood ratio statistics as well as information criteria such as the AIC and BIC
residuals-method
Compute pairwise residuals between each item using methods such as the LD statistic (Chen & Thissen, 1997), as well as response pattern residuals
plot-method
Plot various types of test level plots including the test score and information functions and more
itemplot
Plot various types of item level plots, including the score, standard error, and information functions, and more
createItem
Create a customized itemtype
that does not currently exist in the package
imputeMissing
Impute missing data given some computed Theta matrix
fscores
Find predicted scores for the latent traits using estimation methods such as EAP, MAP, ML, WLE, and EAPsum
wald
Compute Wald statistics following the convergence of a model with a suitable information matrix
M2
Limited information goodness of fit test statistic to determine how well the model fits the data
itemfit
and personfit
Goodness of fit statistics at the item and person levels, such as the S-X2, infit, outfit, and more
boot.mirt
Compute estimated parameter confidence intervals via bootstrap methods
mirtCluster
Define a cluster for the package functions to use for capitalizing on multi-core architecture to utilize available CPUs when possible. Will help to decrease estimation times for tasks that can be run in parallel
The parameter labels use the following convention, here using two factors and K as the total number of categories (using k for specific category instances).
Only one intercept is estimated, and the latent variance of \theta is freely estimated. If the data have more than two categories then a partial credit model is used instead (see 'gpcm' below).
P(x = 1|\theta, d) = \frac{1}{1 + exp(-(\theta + d))}
Depending on the model u may be equal to 1 and g may be equal to 0.
P(x = 1|\theta, \psi) = g + \frac{(u - g)}{
1 + exp(-(a_1 * \theta_1 + a_2 * \theta_2 + d))}
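The trace line above can be evaluated directly in base R. The function below is a minimal sketch of the formula only (the name trace_4pl and the parameter values are hypothetical, not part of the package); setting g = 0 and u = 1 gives the 2PL, freeing g gives the 3PL, and freeing both gives the 4PL.

```r
# P(x = 1 | theta, psi) = g + (u - g) / (1 + exp(-(a_1*theta_1 + a_2*theta_2 + d)))
trace_4pl <- function(theta, a, d, g = 0, u = 1) {
  # theta and a are numeric vectors with one element per factor
  z <- sum(a * theta) + d
  g + (u - g) / (1 + exp(-z))
}

trace_4pl(theta = c(0, 0), a = c(1.2, 0.5), d = 0.3)            # 2PL
trace_4pl(theta = c(0, 0), a = c(1.2, 0.5), d = 0.3, g = 0.2)   # 3PL
```

As theta becomes very negative the probability approaches g, and as it becomes very positive the probability approaches u.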
Currently restricted to unidimensional models
P(x = 1|\theta, \psi) = g + \frac{(u - g)}{
1 + exp(-(a_1 * \theta_1 + d))^S}
where S allows for asymmetry in the response function and is constrained via a transformation to be greater than 0 (i.e., log(S) is estimated rather than S)
Complementary log-log model (see Shim, Bonifay, and Wiedermann, 2022)
P(x = 1|\theta, b) = 1 - exp(-exp(\theta - b))
Currently restricted to unidimensional dichotomous data.
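For illustration, the complementary log-log trace line can be written as a one-line base R function. The name cll_prob is hypothetical and the sketch only evaluates the formula shown above.

```r
# P(x = 1 | theta, b) = 1 - exp(-exp(theta - b))
cll_prob <- function(theta, b) {
  1 - exp(-exp(theta - b))
}

cll_prob(theta = c(-2, 0, 2), b = 0)
```

Unlike the logistic curve, this function is asymmetric: at theta = b the probability is 1 - exp(-1) rather than 0.5.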
The graded model consists of sequential 2PL models,
P(x = k | \theta, \psi) = P(x \ge k | \theta, \psi) - P(x \ge k + 1 | \theta, \psi)
Note that P(x \ge 1 | \theta, \psi) = 1 while P(x \ge K + 1 | \theta, \psi) = 0
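The differencing of cumulative probabilities can be sketched in base R as follows. This assumes a unidimensional slope-intercept form, P(x \ge k) = 1 / (1 + exp(-(a * \theta + d))) with one intercept per boundary; the function name graded_probs and the parameter values are hypothetical.

```r
graded_probs <- function(theta, a, d) {
  # d: intercepts for the K - 1 category boundaries, assumed to be decreasing
  # Cumulative probabilities P(x >= 1), ..., P(x >= K + 1), with the
  # boundary conditions P(x >= 1) = 1 and P(x >= K + 1) = 0
  p_star <- c(1, plogis(a * theta + d), 0)
  # P(x = k) = P(x >= k) - P(x >= k + 1)
  -diff(p_star)
}

p <- graded_probs(theta = 0.5, a = 1.2, d = c(1.5, 0, -1.5))
round(p, 3)
sum(p)
```

Because the first and last cumulative probabilities are fixed at 1 and 0, the category probabilities always sum to 1.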
The unipolar log-logistic model (ULL; Lucke, 2015) is defined the same as the graded response model, however
P(x \le k | \theta, \psi) = \frac{\lambda_k\theta^\eta}{1 + \lambda_k\theta^\eta}.
Internally the \lambda parameters are exponentiated to keep them positive, and therefore the reported estimates should be interpreted in log units
A more constrained version of the graded model where graded spacing is equal across item
blocks and only adjusted by a single 'difficulty' parameter (c) while the latent variance
of \theta
is freely estimated (see Muraki, 1990 for this exact form).
This is restricted to unidimensional models only.
For the gpcm the d values are treated as fixed and ordered values from 0:(K-1) (in the nominal model d_0 is also set to 0). Additionally, for identification in the nominal model ak_0 = 0, ak_{(K-1)} = (K - 1).
P(x = k | \theta, \psi) =
\frac{exp(ak_{k-1} * (a_1 * \theta_1 + a_2 * \theta_2) + d_{k-1})}
{\sum_{k=1}^K exp(ak_{k-1} * (a_1 * \theta_1 + a_2 * \theta_2) + d_{k-1})}
For the partial credit model (when itemtype = 'Rasch'; unidimensional only) the above model is further constrained so that ak = (0,1,\ldots, K-1), a_1 = 1, and the latent variance of \theta_1 is freely estimated. Alternatively, the partial credit model can be obtained by constraining all the slope parameters in the gpcms to be equal. A more specific scoring function may be included by passing a suitable list of matrices to the gpcm_mats input argument.
In the nominal model this parametrization helps to identify the empirical ordering of the categories by inspecting the ak values. Larger values indicate that the item category is more positively related to the latent trait(s) being measured. For instance, if an item was truly ordinal (such as a Likert scale), and had 4 response categories, we would expect to see ak_0 < ak_1 < ak_2 < ak_3 following estimation. If on the other hand ak_0 > ak_1 then it would appear that the second category is less related to the trait than the first, and therefore the second category should be understood as the 'lowest score'.
NOTE: The nominal model can become numerically unstable if poor choices for the high and low values are chosen, resulting in ak values whose absolute values exceed 10. It is recommended to choose high and low anchors that cause the estimated parameters to fall between 0 and K - 1 either by theoretical means or by re-estimating the model with better values following convergence.
The gpcmIRT model is the classical generalized partial credit model for unidimensional response data. It will obtain the same fit as the gpcm presented above, however the parameterization allows for the Rasch/generalized rating scale model as a special case.
E.g., for a K = 4 category response model,
P(x = 0 | \theta, \psi) = exp(0) / G
P(x = 1 | \theta, \psi) = exp(a(\theta - b1) + c) / G
P(x = 2 | \theta, \psi) = exp(a(2\theta - b1 - b2) + 2c) / G
P(x = 3 | \theta, \psi) = exp(a(3\theta - b1 - b2 - b3) + 3c) / G
where
G = exp(0) + exp(a(\theta - b1) + c) + exp(a(2\theta - b1 - b2) + 2c) +
exp(a(3\theta - b1 - b2 - b3) + 3c)
Here a is the slope parameter, the b parameters are the threshold values for each adjacent category, and c is the so-called difficulty parameter when a rating scale model is fitted (otherwise, c = 0 and it drops out of the computations).
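The K = 4 pattern above generalizes to any number of categories: the exponent for category k is a(k\theta - b_1 - \cdots - b_k) + kc. A minimal base R sketch of that pattern follows; the function name gpcmIRT_probs and the parameter values are hypothetical and only illustrate the formulas.

```r
gpcmIRT_probs <- function(theta, a, b, c = 0) {
  k <- seq_along(b)                       # category scores 1, ..., K - 1
  # Exponents: 0 for the lowest category, then a*(k*theta - sum(b[1:k])) + k*c
  z <- c(0, a * (k * theta - cumsum(b)) + k * c)
  exp(z) / sum(exp(z))                    # divide by G, the sum of all numerators
}

# Probabilities for categories 0, 1, 2, 3 at theta = 0.5
gpcmIRT_probs(theta = 0.5, a = 1.3, b = c(-1, 0, 1))
```

With c left at 0 the function corresponds to the generalized partial credit form; a nonzero c shifts all thresholds together, as in a rating scale model.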
The gpcmIRT can be constrained to the partial credit IRT model by either constraining all the slopes to be equal, or setting the slopes to 1 and freeing the latent variance parameter.
Finally, the rsm is a more constrained version of the (generalized) partial credit model where the spacing is equal across item blocks and only adjusted by a single 'difficulty' parameter (c). Note that this is analogous to the relationship between the graded model and the grsm (with an additional constraint regarding the fixed discrimination parameters).
The multidimensional sequential response model has the form
P(x = k | \theta, \psi) = \prod (1 - F(a_1 \theta_1 + a_2 \theta_2 + d_{sk}))
F(a_1 \theta_1 + a_2 \theta_2 + d_{jk})
where F(\cdot) is the cumulative logistic function. The Tutz variant of this model (Tutz, 1990) (via itemtype = 'Tutz') assumes that the slope terms are all equal to 1 and the latent variance terms are estimated (i.e., is a Rasch variant).
The ideal point model has the form, with the upper bound constraint on d set to 0:
P(x = 1 | \theta, \psi) = exp(-0.5 * (a_1 * \theta_1 + a_2 * \theta_2 + d)^2)
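A short base R sketch of the ideal point trace line is given below. The function name ideal_prob and the parameter values are hypothetical; the code only evaluates the formula shown above.

```r
# P(x = 1 | theta, psi) = exp(-0.5 * (a_1*theta_1 + a_2*theta_2 + d)^2)
ideal_prob <- function(theta, a, d) {
  z <- sum(a * theta) + d
  exp(-0.5 * z^2)
}

ideal_prob(theta = c(0, 0), a = c(1, 1), d = -1)
ideal_prob(theta = c(0.5, 0.5), a = c(1, 1), d = -1)   # linear predictor is 0 here
```

The probability peaks at 1 where the linear predictor equals 0 and falls off symmetrically on either side, which produces the single-peaked curve typical of ideal point models.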
Partially compensatory models consist of the product of 2PL probability curves.
P(x = 1 | \theta, \psi) = g + (1 - g) (\frac{1}{1 + exp(-(a_1 * \theta_1 + d_1))} *
\frac{1}{1 + exp(-(a_2 * \theta_2 + d_2))})
Note that constraining the slopes to be equal across items will reduce the model to Embretson's (a.k.a. Whitely's) multicomponent model (1980).
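To make the product structure concrete, the following base R sketch evaluates the partially compensatory trace line for two dimensions. The function name partcomp_prob and the parameter values are hypothetical; plogis() is the base R logistic function.

```r
# P(x = 1) = g + (1 - g) * prod_d [ 1 / (1 + exp(-(a_d * theta_d + d_d))) ]
partcomp_prob <- function(theta, a, d, g = 0) {
  # theta, a, and d each hold one element per dimension
  g + (1 - g) * prod(plogis(a * theta + d))
}

# A high standing on one trait cannot fully offset a low standing on the other
partcomp_prob(theta = c(3, -3), a = c(1, 1), d = c(0, 0))
partcomp_prob(theta = c(0, 0),  a = c(1, 1), d = c(0, 0))
```

Because the component probabilities are multiplied, the overall probability can never exceed the smallest component (when g = 0).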
Nested logistic curves for modeling distractor items. Requires a scoring key. The model is broken into two components for the probability of endorsement. For successful endorsement the probability trace is the 1-4PL model, while for unsuccessful endorsement:
P(x = 0 | \theta, \psi) =
(1 - P_{1-4PL}(x = 1 | \theta, \psi)) * P_{nominal}(x = k | \theta, \psi)
which is the product of the complement of the dichotomous trace line with the nominal
response model. In the nominal model, the slope parameters defined above are constrained
to be 1's, while the last value of the ak
is freely estimated.
The (multidimensional) generalized graded unfolding model is a class of ideal point models useful for ordinal response data. The form is
P(z=k|\theta,\psi)=\frac{exp\left[\left(z\sqrt{\sum_{d=1}^{D}
a_{id}^{2}(\theta_{jd}-b_{id})^{2}}\right)+\sum_{k=0}^{z}\psi_{ik}\right]+
exp\left[\left((M-z)\sqrt{\sum_{d=1}^{D}a_{id}^{2}(\theta_{jd}-b_{id})^{2}}\right)+
\sum_{k=0}^{z}\psi_{ik}\right]}{\sum_{w=0}^{C}\left(exp\left[\left(w
\sqrt{\sum_{d=1}^{D}a_{id}^{2}(\theta_{jd}-b_{id})^{2}}\right)+
\sum_{k=0}^{z}\psi_{ik}\right]+exp\left[\left((M-w)
\sqrt{\sum_{d=1}^{D}a_{id}^{2}(\theta_{jd}-b_{id})^{2}}\right)+
\sum_{k=0}^{z}\psi_{ik}\right]\right)}
where \theta_{jd} is the location of the jth individual on the dth dimension, b_{id} is the difficulty location of the ith item on the dth dimension, a_{id} is the discrimination of the ith item on the dth dimension (where the discrimination values are constrained to be positive), \psi_{ik} is the kth subjective response category threshold for the ith item, assumed to be symmetric about the item and constant across dimensions, where \psi_{ik} = \sum_{d=1}^D a_{id} t_{ik}, z = 1,2,\ldots, C (where C is the number of categories minus 1), and M = 2C + 1.
Spline response models attempt to model the response curves using non-linear and potentially non-monotonic patterns. The form is
P(x = 1|\theta, \eta) = \frac{1}{1 + exp(-(\eta_1 * X_1 + \eta_2 * X_2 + \cdots + \eta_n * X_n))}
where the X_n are from the spline design matrix X organized from the grid of \theta values. B-splines with a natural or polynomial basis are supported, and the intercept input is set to TRUE by default.
Monotone polynomial model for polytomous response data of the form
P(x = k | \theta, \psi) =
\frac{exp(\sum_1^k (m^*(\psi) + \xi_{c-1}))}
{\sum_1^C exp(\sum_1^K (m^*(\psi) + \xi_{c-1}))}
where m^*(\psi) is the monotone polynomial function without the intercept.
To access examples, vignettes, and exercise files that have been generated with knitr please visit https://github.com/philchalmers/mirt/wiki.
Phil Chalmers rphilip.chalmers@gmail.com
Andrich, D. (1978). A rating scale formulation for ordered response categories. Psychometrika, 43, 561-573.
Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4), 443-459.
Bock, R. D., Gibbons, R., & Muraki, E. (1988). Full-Information Item Factor Analysis. Applied Psychological Measurement, 12(3), 261-280.
Bock, R. D. & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35, 179-197.
Cai, L. (2010a). High-Dimensional exploratory item factor analysis by a Metropolis-Hastings Robbins-Monro algorithm. Psychometrika, 75, 33-57.
Cai, L. (2010b). Metropolis-Hastings Robbins-Monro algorithm for confirmatory item factor analysis. Journal of Educational and Behavioral Statistics, 35, 307-335.
Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
Chalmers, R. P. (2015). Extended Mixed-Effects Item Response Models with the MH-RM Algorithm. Journal of Educational Measurement, 52, 200-222. doi:10.1111/jedm.12072
Chalmers, R. P. (2018). Numerical Approximation of the Observed Information Matrix with Oakes' Identity. British Journal of Mathematical and Statistical Psychology. doi:10.1111/bmsp.12127
Chalmers, R. P. & Flora, D. (2014). Maximum-likelihood Estimation of Noncompensatory IRT Models with the MH-RM Algorithm. Applied Psychological Measurement, 38, 339-358. doi:10.1177/0146621614520958
Chen, W. H. & Thissen, D. (1997). Local dependence indices for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22, 265-289.
Falk, C. F. & Cai, L. (2016). Maximum Marginal Likelihood Estimation of a Monotonic Polynomial Generalized Partial Credit Model with Applications to Multiple Group Analysis. Psychometrika, 81, 434-460.
Lord, F. M. & Novick, M. R. (1968). Statistical theory of mental test scores. Addison-Wesley.
Lucke, J. F. (2015). Unipolar item response models. In S. P. Reise & D. A. Revicki (Eds.), Handbook of item response theory modeling: Applications to typical performance assessment (pp. 272-284). New York, NY: Routledge/Taylor & Francis Group.
Ramsay, J. O. (1975). Solving implicit equations in psychometric data analysis. Psychometrika, 40, 337-360.
Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Danish Institute for Educational Research.
Roberts, J. S., Donoghue, J. R., & Laughlin, J. E. (2000). A General Item Response Theory Model for Unfolding Unidimensional Polytomous Responses. Applied Psychological Measurement, 24, 3-32.
Shim, H., Bonifay, W., & Wiedermann, W. (2022). Parsimonious asymmetric item response theory modeling with the complementary log-log link. Behavior Research Methods, 55, 200-219.
Maydeu-Olivares, A., Hernandez, A. & McDonald, R. P. (2006). A Multidimensional Ideal Point Item Response Theory Model for Binary Data. Multivariate Behavioral Research, 41, 445-471.
Muraki, E. (1990). Fitting a polytomous item response model to Likert-type data. Applied Psychological Measurement, 14, 59-71.
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159-176.
Muraki, E. & Carlson, E. B. (1995). Full-information factor analysis for polytomous item responses. Applied Psychological Measurement, 19, 73-90.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monographs, 34.
Suh, Y. & Bolt, D. (2010). Nested logit models for multiple-choice item response data. Psychometrika, 75, 454-473.
Sympson, J. B. (1977). A model for testing with multidimensional items. Proceedings of the 1977 Computerized Adaptive Testing Conference.
Thissen, D. (1982). Marginal maximum likelihood estimation for the one-parameter logistic model. Psychometrika, 47, 175-186.
Tutz, G. (1990). Sequential item response models with ordered response. British Journal of Mathematical and Statistical Psychology, 43, 39-55.
Varadhan, R. & Roland, C. (2008). Simple and Globally Convergent Methods for Accelerating the Convergence of Any EM Algorithm. Scandinavian Journal of Statistics, 35, 335-353.
Whitely, S. E. (1980). Multicomponent latent trait models for ability tests. Psychometrika, 45(4), 470-494.
Wood, R., Wilson, D. T., Gibbons, R. D., Schilling, S. G., Muraki, E., & Bock, R. D. (2003). TESTFACT 4 for Windows: Test Scoring, Item Statistics, and Full-information Item Factor Analysis [Computer software]. Lincolnwood, IL: Scientific Software International.
Woods, C. M. & Lin, N. (2009). Item Response Theory With Estimation of the Latent Density Using Davidian Curves. Applied Psychological Measurement, 33(2), 102-117.
bfactor, multipleGroup, mixedmirt, expand.table, key2binary, mod2values, extract.item, iteminfo, testinfo, probtrace, simdata, averageMI, fixef, extract.mirt, itemstats
# load LSAT section 7 data and compute 1 and 2 factor models
data <- expand.table(LSAT7)
itemstats(data)
## $overall
## N mean_total.score sd_total.score ave.r sd.r alpha
## 1000 3.707 1.199 0.143 0.052 0.453
##
## $itemstats
## N mean sd total.r total.r_if_rm alpha_if_rm
## Item.1 1000 0.828 0.378 0.530 0.246 0.396
## Item.2 1000 0.658 0.475 0.600 0.247 0.394
## Item.3 1000 0.772 0.420 0.611 0.313 0.345
## Item.4 1000 0.606 0.489 0.592 0.223 0.415
## Item.5 1000 0.843 0.364 0.461 0.175 0.438
##
## $proportions
## 0 1
## Item.1 0.172 0.828
## Item.2 0.342 0.658
## Item.3 0.228 0.772
## Item.4 0.394 0.606
## Item.5 0.157 0.843
(mod1 <- mirt(data, 1))
##
## Call:
## mirt(data = data, model = 1)
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 28 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -2658.805
## Estimated parameters: 10
## AIC = 5337.61
## BIC = 5386.688; SABIC = 5354.927
## G2 (21) = 31.7, p = 0.0628
## RMSEA = 0.023, CFI = NaN, TLI = NaN
coef(mod1)
## $Item.1
## a1 d g u
## par 0.988 1.856 0 1
##
## $Item.2
## a1 d g u
## par 1.081 0.808 0 1
##
## $Item.3
## a1 d g u
## par 1.706 1.804 0 1
##
## $Item.4
## a1 d g u
## par 0.765 0.486 0 1
##
## $Item.5
## a1 d g u
## par 0.736 1.855 0 1
##
## $GroupPars
## MEAN_1 COV_11
## par 0 1
summary(mod1)
## F1 h2
## Item.1 0.502 0.252
## Item.2 0.536 0.287
## Item.3 0.708 0.501
## Item.4 0.410 0.168
## Item.5 0.397 0.157
##
## SS loadings: 1.366
## Proportion Var: 0.273
##
## Factor correlations:
##
## F1
## F1 1
plot(mod1)
plot(mod1, type = 'trace')
(mod2 <- mirt(data, 1, SE = TRUE)) #standard errors via the Oakes method
##
## Call:
## mirt(data = data, model = 1, SE = TRUE)
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 28 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Information matrix estimated with method: Oakes
## Second-order test: model is a possible local maximum
## Condition number of information matrix = 30.23088
##
## Log-likelihood = -2658.805
## Estimated parameters: 10
## AIC = 5337.61
## BIC = 5386.688; SABIC = 5354.927
## G2 (21) = 31.7, p = 0.0628
## RMSEA = 0.023, CFI = NaN, TLI = NaN
(mod2 <- mirt(data, 1, SE = TRUE, SE.type = 'SEM')) #standard errors with SEM method
##
## Call:
## mirt(data = data, model = 1, SE = TRUE, SE.type = "SEM")
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-05 tolerance after 74 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: none
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Information matrix estimated with method: SEM
## Second-order test: model is a possible local maximum
## Condition number of information matrix = 30.13481
##
## Log-likelihood = -2658.805
## Estimated parameters: 10
## AIC = 5337.61
## BIC = 5386.688; SABIC = 5354.927
## G2 (21) = 31.7, p = 0.0628
## RMSEA = 0.023, CFI = NaN, TLI = NaN
coef(mod2)
## $Item.1
## a1 d g u
## par 0.988 1.856 0 1
## CI_2.5 0.639 1.599 NA NA
## CI_97.5 1.336 2.112 NA NA
##
## $Item.2
## a1 d g u
## par 1.081 0.808 0 1
## CI_2.5 0.755 0.629 NA NA
## CI_97.5 1.407 0.987 NA NA
##
## $Item.3
## a1 d g u
## par 1.707 1.805 0 1
## CI_2.5 1.086 1.395 NA NA
## CI_97.5 2.329 2.215 NA NA
##
## $Item.4
## a1 d g u
## par 0.765 0.486 0 1
## CI_2.5 0.500 0.339 NA NA
## CI_97.5 1.030 0.633 NA NA
##
## $Item.5
## a1 d g u
## par 0.736 1.854 0 1
## CI_2.5 0.437 1.630 NA NA
## CI_97.5 1.034 2.079 NA NA
##
## $GroupPars
## MEAN_1 COV_11
## par 0 1
## CI_2.5 NA NA
## CI_97.5 NA NA
(mod3 <- mirt(data, 1, SE = TRUE, SE.type = 'Richardson')) #with numerical Richardson method
##
## Call:
## mirt(data = data, model = 1, SE = TRUE, SE.type = "Richardson")
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 28 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Information matrix estimated with method: Richardson
## Second-order test: model is a possible local maximum
## Condition number of information matrix = 30.23102
##
## Log-likelihood = -2658.805
## Estimated parameters: 10
## AIC = 5337.61
## BIC = 5386.688; SABIC = 5354.927
## G2 (21) = 31.7, p = 0.0628
## RMSEA = 0.023, CFI = NaN, TLI = NaN
residuals(mod1)
## LD matrix (lower triangle) and standardized values.
##
## Upper triangle summary:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.037 -0.020 -0.007 0.001 0.024 0.051
##
## Item.1 Item.2 Item.3 Item.4 Item.5
## Item.1 NA -0.021 -0.029 0.051 0.049
## Item.2 0.453 NA 0.033 -0.016 -0.037
## Item.3 0.854 1.060 NA -0.012 -0.002
## Item.4 2.572 0.267 0.153 NA 0.000
## Item.5 2.389 1.384 0.003 0.000 NA
plot(mod1) #test score function
plot(mod1, type = 'trace') #trace lines
plot(mod2, type = 'info') #test information
plot(mod2, MI=200) #expected total score with 95% confidence intervals
# estimated 3PL model for item 5 only
(mod1.3PL <- mirt(data, 1, itemtype = c('2PL', '2PL', '2PL', '2PL', '3PL')))
##
## Call:
## mirt(data = data, model = 1, itemtype = c("2PL", "2PL", "2PL",
## "2PL", "3PL"))
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 43 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -2658.794
## Estimated parameters: 11
## AIC = 5339.587
## BIC = 5393.573; SABIC = 5358.636
## G2 (20) = 31.68, p = 0.0469
## RMSEA = 0.024, CFI = NaN, TLI = NaN
coef(mod1.3PL)
## $Item.1
## a1 d g u
## par 0.987 1.855 0 1
##
## $Item.2
## a1 d g u
## par 1.082 0.808 0 1
##
## $Item.3
## a1 d g u
## par 1.706 1.805 0 1
##
## $Item.4
## a1 d g u
## par 0.764 0.486 0 1
##
## $Item.5
## a1 d g u
## par 0.778 1.643 0.161 1
##
## $GroupPars
## MEAN_1 COV_11
## par 0 1
# internally g and u pars are stored as logits, so usually a good idea to include normal prior
# to help stabilize the parameters. For a value around .182 use a mean
# of -1.5 (since 1 / (1 + exp(-(-1.5))) == .182)
model <- 'F = 1-5
PRIOR = (5, g, norm, -1.5, 3)'
mod1.3PL.norm <- mirt(data, model, itemtype = c('2PL', '2PL', '2PL', '2PL', '3PL'))
coef(mod1.3PL.norm)
## $Item.1
## a1 d g u
## par 0.987 1.855 0 1
##
## $Item.2
## a1 d g u
## par 1.083 0.808 0 1
##
## $Item.3
## a1 d g u
## par 1.706 1.804 0 1
##
## $Item.4
## a1 d g u
## par 0.764 0.486 0 1
##
## $Item.5
## a1 d g u
## par 0.788 1.6 0.19 1
##
## $GroupPars
## MEAN_1 COV_11
## par 0 1
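# Sketch (not part of the original example): moving between the logit scale used
# for the g prior and the probability scale, using base R only. Treating the
# second prior value (3) as a standard deviation is an assumption here, and the
# object names are illustrative.
prior_mean <- -1.5
prior_sd <- 3
plogis(prior_mean)                         # prior centre as a probability, about .182
plogis(prior_mean + c(-1, 1) * prior_sd)   # one prior SD either side, as probabilities
qlogis(0.19)                               # estimated g for Item.5, back on the logit scale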
#limited information fit statistics
M2(mod1.3PL.norm)
## M2 df p RMSEA RMSEA_5 RMSEA_95 SRMSR TLI
## stats 8.800082 4 0.06629543 0.03465864 0 0.06610847 0.03207363 0.9454563
## CFI
## stats 0.9781825
# unidimensional ideal point model
idealpt <- mirt(data, 1, itemtype = 'ideal')
plot(idealpt, type = 'trace', facet_items = TRUE)
plot(idealpt, type = 'trace', facet_items = FALSE)
# two factors (exploratory)
mod2 <- mirt(data, 2)
coef(mod2)
## $Item.1
## a1 a2 d g u
## par -2.007 0.87 2.648 0 1
##
## $Item.2
## a1 a2 d g u
## par -0.849 -0.522 0.788 0 1
##
## $Item.3
## a1 a2 d g u
## par -2.153 -1.836 2.483 0 1
##
## $Item.4
## a1 a2 d g u
## par -0.756 -0.028 0.485 0 1
##
## $Item.5
## a1 a2 d g u
## par -0.757 0 1.864 0 1
##
## $GroupPars
## MEAN_1 MEAN_2 COV_11 COV_21 COV_22
## par 0 0 1 0 1
summary(mod2, rotate = 'oblimin') #oblimin rotation
##
## Rotation: oblimin
##
## Rotated factor loadings:
##
## F1 F2 h2
## Item.1 0.7943 -0.0111 0.623
## Item.2 0.0804 0.4630 0.255
## Item.3 -0.0129 0.8628 0.734
## Item.4 0.2794 0.1925 0.165
## Item.5 0.2930 0.1772 0.165
##
## Rotated SS loadings: 0.801 1.027
##
## Factor correlations:
##
## F1 F2
## F1 1.000
## F2 0.463 1
residuals(mod2)
## LD matrix (lower triangle) and standardized values.
##
## Upper triangle summary:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.018 -0.001 0.000 0.000 0.002 0.011
##
## Item.1 Item.2 Item.3 Item.4 Item.5
## Item.1 NA -0.001 0.001 0.002 0.003
## Item.2 0.001 NA 0.000 0.011 -0.018
## Item.3 0.001 0.000 NA -0.002 0.006
## Item.4 0.002 0.111 0.004 NA -0.001
## Item.5 0.008 0.325 0.041 0.001 NA
plot(mod2)
plot(mod2, rotate = 'oblimin')
anova(mod1, mod2) #compare the two models
## AIC SABIC HQ BIC logLik X2 df p
## mod1 5337.610 5354.927 5356.263 5386.688 -2658.805
## mod2 5335.039 5359.283 5361.153 5403.748 -2653.520 10.571 4 0.032
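# Sketch (not part of the original example): the X2 and p columns above are
# consistent with a likelihood-ratio test computed from the two log-likelihoods.
# The values are copied from the printed table; object names are illustrative.
ll_1f <- -2658.805   # logLik of mod1
ll_2f <- -2653.520   # logLik of mod2
lr_X2 <- 2 * (ll_2f - ll_1f)
lr_df <- 4           # df as reported by anova()
lr_p <- pchisq(lr_X2, df = lr_df, lower.tail = FALSE)
round(c(X2 = lr_X2, p = lr_p), 3)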
scoresfull <- fscores(mod2) #factor scores for each case (row) in the data
head(scoresfull)
## F1 F2
## [1,] -1.700489 -1.711744
## [2,] -1.700489 -1.711744
## [3,] -1.700489 -1.711744
## [4,] -1.700489 -1.711744
## [5,] -1.700489 -1.711744
## [6,] -1.700489 -1.711744
scorestable <- fscores(mod2, full.scores = FALSE) #save factor score table
##
## Method: EAP
## Rotate: oblimin
##
## Empirical Reliability:
##
## F1 F2
## 0.2717 0.3565
head(scorestable)
## Item.1 Item.2 Item.3 Item.4 Item.5 F1 F2 SE_F1
## [1,] 0 0 0 0 0 -1.700489 -1.7117442 0.8233675
## [2,] 0 0 0 0 1 -1.442162 -1.5314978 0.8291769
## [3,] 0 0 0 1 0 -1.448947 -1.5246145 0.8289843
## [4,] 0 0 0 1 1 -1.186209 -1.3432197 0.8376304
## [5,] 0 0 1 0 0 -1.369423 -0.7081262 0.8344842
## [6,] 0 0 1 0 1 -1.099281 -0.5103192 0.8455456
## SE_F2
## [1,] 0.7705973
## [2,] 0.7691720
## [3,] 0.7691340
## [4,] 0.7711533
## [5,] 0.7963088
## [6,] 0.8101448
# confirmatory (as an example, model is not identified since you need 3 items per factor)
# Two ways to define a confirmatory model: with mirt.model, or with a string
# these model definitions are equivalent
cmodel <- mirt.model('
F1 = 1,4,5
F2 = 2,3')
cmodel2 <- 'F1 = 1,4,5
F2 = 2,3'
cmod <- mirt(data, cmodel)
# cmod <- mirt(data, cmodel2) # same as above
coef(cmod)
## $Item.1
## a1 a2 d g u
## par 1.792 0 2.358 0 1
##
## $Item.2
## a1 a2 d g u
## par 0 1.427 0.9 0 1
##
## $Item.3
## a1 a2 d g u
## par 0 1.559 1.725 0 1
##
## $Item.4
## a1 a2 d g u
## par 0.743 0 0.483 0 1
##
## $Item.5
## a1 a2 d g u
## par 0.763 0 1.867 0 1
##
## $GroupPars
## MEAN_1 MEAN_2 COV_11 COV_21 COV_22
## par 0 0 1 0 1
anova(cmod, mod2)
## AIC SABIC HQ BIC logLik X2 df p
## cmod 5392.596 5409.913 5411.249 5441.674 -2686.298
## mod2 5335.039 5359.283 5361.153 5403.748 -2653.520 65.557 4 0
# check if identified by computing information matrix
(cmod <- mirt(data, cmodel, SE = TRUE))
## Warning: Could not invert information matrix; model may not be empirically
## identified.
##
## Call:
## mirt(data = data, model = cmodel, SE = TRUE)
##
## Full-information item factor analysis with 2 factor(s).
## Converged within 1e-04 tolerance after 125 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 31
## Latent density type: Gaussian
##
## Information matrix estimated with method: Oakes
## Second-order test: model is not a maximum or the information matrix is too inaccurate
##
## Log-likelihood = -2686.298
## Estimated parameters: 10
## AIC = 5392.596
## BIC = 5441.674; SABIC = 5409.913
## G2 (21) = 86.69, p = 0
## RMSEA = 0.056, CFI = NaN, TLI = NaN
###########
# data from the 'ltm' package in numeric format
itemstats(Science)
## $overall
## N mean_total.score sd_total.score ave.r sd.r alpha
## 392 11.668 2.003 0.275 0.098 0.598
##
## $itemstats
## N mean sd total.r total.r_if_rm alpha_if_rm
## Comfort 392 3.120 0.588 0.596 0.352 0.552
## Work 392 2.722 0.807 0.666 0.332 0.567
## Future 392 2.990 0.757 0.748 0.488 0.437
## Benefit 392 2.837 0.802 0.684 0.363 0.541
##
## $proportions
## 1 2 3 4
## Comfort 0.013 0.082 0.679 0.227
## Work 0.084 0.250 0.526 0.140
## Future 0.036 0.184 0.536 0.245
## Benefit 0.054 0.255 0.492 0.199
pmod1 <- mirt(Science, 1)
plot(pmod1)
plot(pmod1, type = 'trace')
plot(pmod1, type = 'itemscore')
summary(pmod1)
## F1 h2
## Comfort 0.522 0.273
## Work 0.584 0.342
## Future 0.803 0.645
## Benefit 0.541 0.293
##
## SS loadings: 1.552
## Proportion Var: 0.388
##
## Factor correlations:
##
## F1
## F1 1
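# Sketch (not part of the original example): for this one-factor solution the
# h2, SS loadings, and Proportion Var values above follow from the F1 loadings.
# Loadings are copied from the printed summary, so results agree only up to
# rounding; object names are illustrative.
loadings_F1 <- c(Comfort = 0.522, Work = 0.584, Future = 0.803, Benefit = 0.541)
h2_check <- loadings_F1^2                         # communalities
ss_loadings <- sum(h2_check)                      # sum of squared loadings
prop_var <- ss_loadings / length(loadings_F1)     # proportion of variance
round(h2_check, 3)
round(c(SS = ss_loadings, PropVar = prop_var), 3)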
# Constrain all slopes to be equal with the constrain = list() input or mirt.model() syntax
# first obtain parameter index
values <- mirt(Science,1, pars = 'values')
values #note that slopes are numbered 1,5,9,13, or index with values$parnum[values$name == 'a1']
## group item class name parnum value lbound ubound est
## 1 all Comfort graded a1 1 0.8510000 -Inf Inf TRUE
## 2 all Comfort graded d1 2 4.3896709 -Inf Inf TRUE
## 3 all Comfort graded d2 3 2.5828175 -Inf Inf TRUE
## 4 all Comfort graded d3 4 -1.4712783 -Inf Inf TRUE
## 5 all Work graded a1 5 0.8510000 -Inf Inf TRUE
## 6 all Work graded d1 6 2.7071399 -Inf Inf TRUE
## 7 all Work graded d2 7 0.8419146 -Inf Inf TRUE
## 8 all Work graded d3 8 -2.1204510 -Inf Inf TRUE
## 9 all Future graded a1 9 0.8510000 -Inf Inf TRUE
## 10 all Future graded d1 10 3.5429316 -Inf Inf TRUE
## 11 all Future graded d2 11 1.5216586 -Inf Inf TRUE
## 12 all Future graded d3 12 -1.3573021 -Inf Inf TRUE
## 13 all Benefit graded a1 13 0.8510000 -Inf Inf TRUE
## 14 all Benefit graded d1 14 3.1664313 -Inf Inf TRUE
## 15 all Benefit graded d2 15 0.9818914 -Inf Inf TRUE
## 16 all Benefit graded d3 16 -1.6612126 -Inf Inf TRUE
## 17 all GROUP GroupPars MEAN_1 17 0.0000000 -Inf Inf FALSE
## 18 all GROUP GroupPars COV_11 18 1.0000000 1e-04 Inf FALSE
## prior.type prior_1 prior_2
## 1 none NaN NaN
## 2 none NaN NaN
## 3 none NaN NaN
## 4 none NaN NaN
## 5 none NaN NaN
## 6 none NaN NaN
## 7 none NaN NaN
## 8 none NaN NaN
## 9 none NaN NaN
## 10 none NaN NaN
## 11 none NaN NaN
## 12 none NaN NaN
## 13 none NaN NaN
## 14 none NaN NaN
## 15 none NaN NaN
## 16 none NaN NaN
## 17 none NaN NaN
## 18 none NaN NaN
(pmod1_equalslopes <- mirt(Science, 1, constrain = list(c(1,5,9,13))))
##
## Call:
## mirt(data = Science, model = 1, constrain = list(c(1, 5, 9, 13)))
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 15 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -1613.899
## Estimated parameters: 16
## AIC = 3253.798
## BIC = 3305.425; SABIC = 3264.176
## G2 (242) = 223.62, p = 0.7959
## RMSEA = 0, CFI = NaN, TLI = NaN
coef(pmod1_equalslopes)
## $Comfort
## a1 d1 d2 d3
## par 1.321 5.165 2.844 -1.587
##
## $Work
## a1 d1 d2 d3
## par 1.321 2.992 0.934 -2.319
##
## $Future
## a1 d1 d2 d3
## par 1.321 4.067 1.662 -1.488
##
## $Benefit
## a1 d1 d2 d3
## par 1.321 3.55 1.057 -1.806
##
## $GroupPars
## MEAN_1 COV_11
## par 0 1
# using mirt.model syntax, constrain all item slopes to be equal
model <- 'F = 1-4
CONSTRAIN = (1-4, a1)'
(pmod1_equalslopes <- mirt(Science, model))
##
## Call:
## mirt(data = Science, model = model)
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 15 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -1613.899
## Estimated parameters: 16
## AIC = 3253.798
## BIC = 3305.425; SABIC = 3264.176
## G2 (242) = 223.62, p = 0.7959
## RMSEA = 0, CFI = NaN, TLI = NaN
coef(pmod1_equalslopes)
## $Comfort
## a1 d1 d2 d3
## par 1.321 5.165 2.844 -1.587
##
## $Work
## a1 d1 d2 d3
## par 1.321 2.992 0.934 -2.319
##
## $Future
## a1 d1 d2 d3
## par 1.321 4.067 1.662 -1.488
##
## $Benefit
## a1 d1 d2 d3
## par 1.321 3.55 1.057 -1.806
##
## $GroupPars
## MEAN_1 COV_11
## par 0 1
anova(pmod1_equalslopes, pmod1) #constrained model fits significantly worse (p = .018); AIC and SABIC agree, HQ and BIC do not
## AIC SABIC HQ BIC logLik X2 df p
## pmod1_equalslopes 3253.798 3264.176 3274.259 3305.425 -1613.899
## pmod1 3249.739 3262.512 3274.922 3313.279 -1608.870 10.059 3 0.018
pmod2 <- mirt(Science, 2)
summary(pmod2)
##
## Rotation: oblimin
##
## Rotated factor loadings:
##
## F1 F2 h2
## Comfort 0.6016 0.0312 0.382
## Work -0.0573 0.7971 0.592
## Future 0.3302 0.5153 0.548
## Benefit 0.7231 -0.0239 0.506
##
## Rotated SS loadings: 0.997 0.902
##
## Factor correlations:
##
## F1 F2
## F1 1.000
## F2 0.511 1
plot(pmod2, rotate = 'oblimin')
itemplot(pmod2, 1, rotate = 'oblimin')
anova(pmod1, pmod2)
## AIC SABIC HQ BIC logLik X2 df p
## pmod1 3249.739 3262.512 3274.922 3313.279 -1608.870
## pmod2 3241.938 3257.106 3271.843 3317.392 -1601.969 13.801 3 0.003
# unidimensional fit with a generalized partial credit and nominal model
(gpcmod <- mirt(Science, 1, 'gpcm'))
##
## Call:
## mirt(data = Science, model = 1, itemtype = "gpcm")
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 50 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -1612.683
## Estimated parameters: 16
## AIC = 3257.366
## BIC = 3320.906; SABIC = 3270.139
## G2 (239) = 221.19, p = 0.7896
## RMSEA = 0, CFI = NaN, TLI = NaN
coef(gpcmod)
## $Comfort
## a1 ak0 ak1 ak2 ak3 d0 d1 d2 d3
## par 0.865 0 1 2 3 0 2.831 5.324 3.998
##
## $Work
## a1 ak0 ak1 ak2 ak3 d0 d1 d2 d3
## par 0.841 0 1 2 3 0 1.711 2.578 0.848
##
## $Future
## a1 ak0 ak1 ak2 ak3 d0 d1 d2 d3
## par 2.204 0 1 2 3 0 4.601 6.759 4.918
##
## $Benefit
## a1 ak0 ak1 ak2 ak3 d0 d1 d2 d3
## par 0.724 0 1 2 3 0 2.099 2.899 1.721
##
## $GroupPars
## MEAN_1 COV_11
## par 0 1
# for the nominal model the lowest and highest categories are assumed to be the
# theoretically lowest and highest categories that relate to the latent trait(s)
(nomod <- mirt(Science, 1, 'nominal'))
##
## Call:
## mirt(data = Science, model = 1, itemtype = "nominal")
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 71 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -1608.455
## Estimated parameters: 24
## AIC = 3264.91
## BIC = 3360.22; SABIC = 3284.069
## G2 (231) = 212.73, p = 0.8002
## RMSEA = 0, CFI = NaN, TLI = NaN
coef(nomod) #ordering of ak values suggests that the items are indeed ordinal
## $Comfort
## a1 ak0 ak1 ak2 ak3 d0 d1 d2 d3
## par 1.008 0 1.541 1.999 3 0 3.639 5.905 4.533
##
## $Work
## a1 ak0 ak1 ak2 ak3 d0 d1 d2 d3
## par 0.841 0 0.689 1.5 3 0 1.464 2.326 0.325
##
## $Future
## a1 ak0 ak1 ak2 ak3 d0 d1 d2 d3
## par 2.041 0 0.762 1.861 3 0 3.668 5.867 3.949
##
## $Benefit
## a1 ak0 ak1 ak2 ak3 d0 d1 d2 d3
## par 0.779 0 1.036 1.742 3 0 2.144 2.911 1.621
##
## $GroupPars
## MEAN_1 COV_11
## par 0 1
anova(gpcmod, nomod)
## AIC SABIC HQ BIC logLik X2 df p
## gpcmod 3257.366 3270.139 3282.549 3320.906 -1612.683
## nomod 3264.910 3284.069 3302.684 3360.220 -1608.455 8.456 8 0.39
itemplot(nomod, 3)
# generalized graded unfolding model
(ggum <- mirt(Science, 1, 'ggum'))
## EM cycles terminated after 500 iterations.
##
## Call:
## mirt(data = Science, model = 1, itemtype = "ggum")
##
## Full-information item factor analysis with 1 factor(s).
## FAILED TO CONVERGE within 1e-04 tolerance after 500 EM iterations.
## mirt version: 1.40
## M-step optimizer: nlminb
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -1624.054
## Estimated parameters: 20
## AIC = 3288.107
## BIC = 3367.533; SABIC = 3304.073
## G2 (235) = 243.93, p = 0.3309
## RMSEA = 0.01, CFI = NaN, TLI = NaN
coef(ggum, simplify=TRUE)
## $items
## a1 b1 t1 t2 t3
## Comfort 1.489 -0.484 3.190 2.634 -0.167
## Work 1.190 0.042 2.171 1.427 -0.720
## Future 4.164 -0.041 2.167 1.346 0.261
## Benefit 1.227 -0.475 2.775 1.497 -0.274
##
## $means
## F1
## 0
##
## $cov
## F1
## F1 1
plot(ggum)
plot(ggum, type = 'trace')
plot(ggum, type = 'itemscore')
# monotonic polynomial models
(monopoly <- mirt(Science, 1, 'monopoly'))
##
## Call:
## mirt(data = Science, model = 1, itemtype = "monopoly")
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 47 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -1601.175
## Estimated parameters: 24
## AIC = 3250.349
## BIC = 3345.66; SABIC = 3269.509
## G2 (231) = 198.17, p = 0.9424
## RMSEA = 0, CFI = NaN, TLI = NaN
coef(monopoly, simplify=TRUE)
## $items
## omega xi1 xi2 xi3 alpha1 tau2
## Comfort -1.437 2.916 2.218 -1.469 -0.937 0.739
## Work -0.411 1.378 0.698 -2.152 -0.498 -1.155
## Future 0.832 4.975 2.256 -1.911 0.017 -8.475
## Benefit -1.718 1.885 0.618 -1.388 -1.425 0.727
##
## $means
## F1
## 0
##
## $cov
## F1
## F1 1
plot(monopoly)
plot(monopoly, type = 'trace')
plot(monopoly, type = 'itemscore')
# unipolar IRT model
unimod <- mirt(Science, itemtype = 'ULL')
coef(unimod, simplify=TRUE)
## $items
## eta1 log_lambda1 log_lambda2 log_lambda3
## Comfort 1.175 4.780 2.299 -1.709
## Work 1.618 2.534 0.554 -2.736
## Future 2.803 4.034 1.526 -2.595
## Benefit 1.319 3.021 0.682 -1.995
##
## $GroupPars
## meanlog sdlog
## par 0 1
plot(unimod)
plot(unimod, type = 'trace')
itemplot(unimod, 1)
# the following use the correct log-normal density for the latent trait
itemfit(unimod)
## item S_X2 df.S_X2 RMSEA.S_X2 p.S_X2
## 1 Comfort 5.659 6 0.000 0.462
## 2 Work 10.147 8 0.026 0.255
## 3 Future 19.490 8 0.061 0.012
## 4 Benefit 12.110 11 0.016 0.355
M2(unimod, type = 'C2')
## EM cycles terminated after 500 iterations.
## M2 df p RMSEA RMSEA_5 RMSEA_95 SRMSR
## stats 18.70974 2 8.654271e-05 0.1461778 0.09032262 0.2096717 0.07859892
## TLI CFI
## stats 0.7380161 0.912672
fs <- fscores(unimod)
hist(fs, 20)
fscores(unimod, method = 'EAPsum', full.scores = FALSE)
## df X2 p.X2 rxx_F1
## stats 9 5.665525 0.7728707 0.5258804
## Sum.Scores F1 SE_F1 observed expected std.res
## 4 4 0.017 0.065 2 0.166 4.502
## 5 5 0.220 0.290 1 0.129 2.422
## 6 6 0.601 0.130 2 0.510 2.085
## 7 7 0.627 0.110 1 3.194 1.228
## 8 8 0.645 0.163 11 12.722 0.483
## 9 9 0.689 0.250 32 32.721 0.126
## 10 10 0.796 0.392 58 57.465 0.071
## 11 11 1.027 0.592 70 74.316 0.501
## 12 12 1.454 0.853 91 77.803 1.496
## 13 13 2.159 1.285 56 60.419 0.568
## 14 14 3.299 2.001 36 40.122 0.651
## 15 15 5.109 3.236 20 23.097 0.644
## 16 16 8.224 5.305 12 9.337 0.872
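# Sketch (not part of the original example): the std.res column above appears to
# match |observed - expected| / sqrt(expected), up to rounding. This is inferred
# from the printed numbers only, not from the package source; three rows
# (sum scores 4, 7, and 12) are checked here with illustrative object names.
obs_counts <- c(2, 1, 91)
exp_counts <- c(0.166, 3.194, 77.803)
std_res_check <- abs(obs_counts - exp_counts) / sqrt(exp_counts)
round(std_res_check, 3)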
## example applying survey weights.
# weight the first half of the cases to be more representative of population
survey.weights <- c(rep(2, nrow(Science)/2), rep(1, nrow(Science)/2))
survey.weights <- survey.weights/sum(survey.weights) * nrow(Science)
unweighted <- mirt(Science, 1)
weighted <- mirt(Science, 1, survey.weights=survey.weights)
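# Sketch (not part of the original example): what the weight rescaling above
# does, shown with base R only. n_cases = 392 is the N reported by
# itemstats(Science); object names are illustrative.
n_cases <- 392
w_demo <- c(rep(2, n_cases / 2), rep(1, n_cases / 2))
w_demo <- w_demo / sum(w_demo) * n_cases
sum(w_demo)       # rescaled weights sum to the number of cases
unique(w_demo)    # first half still carries twice the weight of the second half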
###########
# empirical dimensionality testing that includes 'guessing'
data(SAT12)
data <- key2binary(SAT12,
key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
itemstats(data)
## $overall
## N mean_total.score sd_total.score ave.r sd.r alpha
## 600 18.202 5.054 0.108 0.075 0.798
##
## $itemstats
## N mean sd total.r total.r_if_rm alpha_if_rm
## Item.1 600 0.283 0.451 0.380 0.300 0.793
## Item.2 600 0.568 0.496 0.539 0.464 0.785
## Item.3 600 0.280 0.449 0.446 0.371 0.789
## Item.4 600 0.378 0.485 0.325 0.235 0.796
## Item.5 600 0.620 0.486 0.424 0.340 0.791
## Item.6 600 0.160 0.367 0.414 0.351 0.791
## Item.7 600 0.760 0.427 0.366 0.289 0.793
## Item.8 600 0.202 0.402 0.307 0.233 0.795
## Item.9 600 0.885 0.319 0.189 0.127 0.798
## Item.10 600 0.422 0.494 0.465 0.383 0.789
## Item.11 600 0.983 0.128 0.181 0.156 0.797
## Item.12 600 0.415 0.493 0.173 0.076 0.803
## Item.13 600 0.662 0.474 0.438 0.358 0.790
## Item.14 600 0.723 0.448 0.411 0.333 0.791
## Item.15 600 0.817 0.387 0.393 0.325 0.792
## Item.16 600 0.413 0.493 0.367 0.278 0.794
## Item.17 600 0.963 0.188 0.238 0.202 0.796
## Item.18 600 0.352 0.478 0.576 0.508 0.783
## Item.19 600 0.548 0.498 0.401 0.314 0.792
## Item.20 600 0.873 0.333 0.376 0.318 0.792
## Item.21 600 0.915 0.279 0.190 0.136 0.798
## Item.22 600 0.935 0.247 0.284 0.238 0.795
## Item.23 600 0.313 0.464 0.338 0.253 0.795
## Item.24 600 0.728 0.445 0.422 0.346 0.791
## Item.25 600 0.375 0.485 0.383 0.297 0.793
## Item.26 600 0.460 0.499 0.562 0.489 0.783
## Item.27 600 0.862 0.346 0.425 0.367 0.791
## Item.28 600 0.530 0.500 0.465 0.383 0.789
## Item.29 600 0.340 0.474 0.407 0.324 0.791
## Item.30 600 0.440 0.497 0.255 0.159 0.799
## Item.31 600 0.833 0.373 0.479 0.419 0.788
## Item.32 600 0.162 0.368 0.110 0.037 0.802
##
## $proportions
## 0 1
## Item.1 0.717 0.283
## Item.2 0.432 0.568
## Item.3 0.720 0.280
## Item.4 0.622 0.378
## Item.5 0.380 0.620
## Item.6 0.840 0.160
## Item.7 0.240 0.760
## Item.8 0.798 0.202
## Item.9 0.115 0.885
## Item.10 0.578 0.422
## Item.11 0.017 0.983
## Item.12 0.585 0.415
## Item.13 0.338 0.662
## Item.14 0.277 0.723
## Item.15 0.183 0.817
## Item.16 0.587 0.413
## Item.17 0.037 0.963
## Item.18 0.648 0.352
## Item.19 0.452 0.548
## Item.20 0.127 0.873
## Item.21 0.085 0.915
## Item.22 0.065 0.935
## Item.23 0.687 0.313
## Item.24 0.272 0.728
## Item.25 0.625 0.375
## Item.26 0.540 0.460
## Item.27 0.138 0.862
## Item.28 0.470 0.530
## Item.29 0.660 0.340
## Item.30 0.560 0.440
## Item.31 0.167 0.833
## Item.32 0.838 0.162
mod1 <- mirt(data, 1)
extract.mirt(mod1, 'time') #time elapsed for each estimation component
## TOTAL: Data Estep Mstep SE Post
## 0.362 0.052 0.077 0.222 0.000 0.000
# optionally use Newton-Raphson for (generally) faster convergence in the M-steps
mod1 <- mirt(data, 1, optimizer = 'NR')
extract.mirt(mod1, 'time')
## TOTAL: Data Estep Mstep SE Post
## 0.182 0.049 0.050 0.070 0.000 0.000
mod2 <- mirt(data, 2, optimizer = 'NR')
## EM cycles terminated after 500 iterations.
# difficulty converging with reduced quadpts, so relax TOL to .001
mod3 <- mirt(data, 3, TOL = .001, optimizer = 'NR')
anova(mod1,mod2)
## AIC SABIC HQ BIC logLik X2 df p
## mod1 19105.91 19184.13 19215.46 19387.31 -9488.955
## mod2 19073.92 19190.03 19236.53 19491.63 -9441.963 93.985 31 0
anova(mod2, mod3) #AIC increases from mod2 to mod3, 2 factors probably best
## AIC SABIC HQ BIC logLik X2 df p
## mod2 19073.92 19190.03 19236.53 19491.63 -9441.963
## mod3 19080.18 19232.96 19294.13 19629.80 -9415.090 53.744 30 0.005
# same as above, but using the QMCEM method for generally better accuracy in mod3
mod3 <- mirt(data, 3, method = 'QMCEM', TOL = .001, optimizer = 'NR')
anova(mod2, mod3)
## AIC SABIC HQ BIC logLik X2 df p
## mod2 19073.92 19190.03 19236.53 19491.63 -9441.963
## mod3 19081.58 19234.36 19295.54 19631.20 -9415.792 52.342 30 0.007
# with fixed guessing parameters
mod1g <- mirt(data, 1, guess = .1)
coef(mod1g)
## $Item.1
## a1 d g u
## par 1.211 -1.737 0.1 1
##
## $Item.2
## a1 d g u
## par 1.78 0.147 0.1 1
##
## $Item.3
## a1 d g u
## par 1.91 -2.2 0.1 1
##
## $Item.4
## a1 d g u
## par 0.833 -0.944 0.1 1
##
## $Item.5
## a1 d g u
## par 1.089 0.399 0.1 1
##
## $Item.6
## a1 d g u
## par 3.265 -5.212 0.1 1
##
## $Item.7
## a1 d g u
## par 1.02 1.224 0.1 1
##
## $Item.8
## a1 d g u
## par 1.639 -2.977 0.1 1
##
## $Item.9
## a1 d g u
## par 0.49 2.007 0.1 1
##
## $Item.10
## a1 d g u
## par 1.257 -0.756 0.1 1
##
## $Item.11
## a1 d g u
## par 1.68 5.18 0.1 1
##
## $Item.12
## a1 d g u
## par 0.191 -0.625 0.1 1
##
## $Item.13
## a1 d g u
## par 1.147 0.654 0.1 1
##
## $Item.14
## a1 d g u
## par 1.099 1.008 0.1 1
##
## $Item.15
## a1 d g u
## par 1.337 1.79 0.1 1
##
## $Item.16
## a1 d g u
## par 0.923 -0.744 0.1 1
##
## $Item.17
## a1 d g u
## par 1.519 4.077 0.1 1
##
## $Item.18
## a1 d g u
## par 2.585 -1.749 0.1 1
##
## $Item.19
## a1 d g u
## par 0.91 -0.002 0.1 1
##
## $Item.20
## a1 d g u
## par 1.485 2.438 0.1 1
##
## $Item.21
## a1 d g u
## par 0.616 2.407 0.1 1
##
## $Item.22
## a1 d g u
## par 1.429 3.291 0.1 1
##
## $Item.23
## a1 d g u
## par 0.96 -1.393 0.1 1
##
## $Item.24
## a1 d g u
## par 1.282 1.099 0.1 1
##
## $Item.25
## a1 d g u
## par 1.028 -1 0.1 1
##
## $Item.26
## a1 d g u
## par 2.059 -0.658 0.1 1
##
## $Item.27
## a1 d g u
## par 1.839 2.564 0.1 1
##
## $Item.28
## a1 d g u
## par 1.222 -0.095 0.1 1
##
## $Item.29
## a1 d g u
## par 1.281 -1.357 0.1 1
##
## $Item.30
## a1 d g u
## par 0.444 -0.521 0.1 1
##
## $Item.31
## a1 d g u
## par 2.476 2.697 0.1 1
##
## $Item.32
## a1 d g u
## par 0.461 -2.742 0.1 1
##
## $GroupPars
## MEAN_1 COV_11
## par 0 1
###########
# graded rating scale example
# make some data
set.seed(1234)
a <- matrix(rep(1, 10))
d <- matrix(c(1,0.5,-.5,-1), 10, 4, byrow = TRUE)
c <- seq(-1, 1, length.out=10)
data <- simdata(a, d + c, 2000, itemtype = rep('graded',10))
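# Sketch (not part of the original example): how the `d + c` term above combines
# a 10 x 4 matrix with a length-10 vector. R recycles the vector down the
# columns, so row i of the matrix is shifted by the i-th element. Separate
# object names are used so the simulated objects are left untouched.
d_demo <- matrix(c(1, 0.5, -.5, -1), 10, 4, byrow = TRUE)
c_demo <- seq(-1, 1, length.out = 10)
shifted_demo <- d_demo + c_demo
all.equal(shifted_demo[, 1], 1 + c_demo)                    # first column: 1 plus each item's shift
all.equal(shifted_demo, sweep(d_demo, 1, c_demo, "+"))      # same as an explicit row-wise sweep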
itemstats(data)
## $overall
## N mean_total.score sd_total.score ave.r sd.r alpha
## 2000 20.196 8.33 0.203 0.027 0.719
##
## $itemstats
## N mean sd total.r total.r_if_rm alpha_if_rm
## Item_1 2000 1.284 1.510 0.512 0.359 0.700
## Item_2 2000 1.427 1.544 0.529 0.375 0.697
## Item_3 2000 1.592 1.584 0.545 0.389 0.695
## Item_4 2000 1.774 1.586 0.538 0.381 0.696
## Item_5 2000 1.910 1.607 0.539 0.380 0.696
## Item_6 2000 2.124 1.606 0.533 0.373 0.697
## Item_7 2000 2.284 1.598 0.520 0.359 0.700
## Item_8 2000 2.420 1.583 0.578 0.430 0.688
## Item_9 2000 2.606 1.543 0.530 0.377 0.697
## Item_10 2000 2.776 1.491 0.495 0.342 0.702
##
## $proportions
## 0 1 2 3 4
## Item_1 0.500 0.096 0.182 0.065 0.158
## Item_2 0.450 0.108 0.197 0.059 0.187
## Item_3 0.407 0.108 0.182 0.092 0.212
## Item_4 0.346 0.111 0.212 0.085 0.246
## Item_5 0.319 0.102 0.211 0.086 0.281
## Item_6 0.269 0.097 0.205 0.099 0.330
## Item_7 0.244 0.073 0.211 0.101 0.372
## Item_8 0.216 0.074 0.195 0.106 0.410
## Item_9 0.175 0.072 0.196 0.083 0.473
## Item_10 0.150 0.059 0.174 0.102 0.516
mod1 <- mirt(data, 1)
mod2 <- mirt(data, 1, itemtype = 'grsm')
coef(mod2)
## $Item_1
## a1 b1 b2 b3 b4 c
## par 0.959 0.001 -0.507 -1.541 -2.032 0
##
## $Item_2
## a1 b1 b2 b3 b4 c
## par 0.987 0.001 -0.507 -1.541 -2.032 0.235
##
## $Item_3
## a1 b1 b2 b3 b4 c
## par 0.994 0.001 -0.507 -1.541 -2.032 0.457
##
## $Item_4
## a1 b1 b2 b3 b4 c
## par 1.027 0.001 -0.507 -1.541 -2.032 0.728
##
## $Item_5
## a1 b1 b2 b3 b4 c
## par 0.995 0.001 -0.507 -1.541 -2.032 0.895
##
## $Item_6
## a1 b1 b2 b3 b4 c
## par 0.987 0.001 -0.507 -1.541 -2.032 1.179
##
## $Item_7
## a1 b1 b2 b3 b4 c
## par 0.957 0.001 -0.507 -1.541 -2.032 1.404
##
## $Item_8
## a1 b1 b2 b3 b4 c
## par 1.04 0.001 -0.507 -1.541 -2.032 1.578
##
## $Item_9
## a1 b1 b2 b3 b4 c
## par 0.964 0.001 -0.507 -1.541 -2.032 1.878
##
## $Item_10
## a1 b1 b2 b3 b4 c
## par 0.947 0.001 -0.507 -1.541 -2.032 2.136
##
## $GroupPars
## MEAN_1 COV_11
## par 0 1
anova(mod2, mod1) #LR test p = .035, but every information criterion favors mod2, which should be preferred
## AIC SABIC HQ BIC logLik X2 df p
## mod2 55239.72 55295.47 55287.03 55368.55 -27596.86
## mod1 55252.05 55373.25 55354.88 55532.10 -27576.03 41.671 27 0.035
itemplot(mod2, 1)
itemplot(mod2, 5)
itemplot(mod2, 10)
###########
# 2PL nominal response model example (Suh and Bolt, 2010)
data(SAT12)
SAT12[SAT12 == 8] <- NA #set 8 as a missing value
head(SAT12)
## Item.1 Item.2 Item.3 Item.4 Item.5 Item.6 Item.7 Item.8 Item.9 Item.10
## 1 1 4 5 2 3 1 2 1 3 1
## 2 3 4 2 NA 3 3 2 NA 3 1
## 3 1 4 5 4 3 2 2 3 3 2
## 4 2 4 4 2 3 3 2 4 3 2
## 5 2 4 5 2 3 2 2 1 1 2
## 6 1 4 3 1 3 2 2 3 3 1
## Item.11 Item.12 Item.13 Item.14 Item.15 Item.16 Item.17 Item.18 Item.19
## 1 2 4 2 1 5 3 4 4 1
## 2 2 NA 2 1 5 2 4 1 1
## 3 2 1 3 1 5 5 4 1 3
## 4 2 4 2 1 5 2 4 1 3
## 5 2 4 2 1 5 4 4 5 1
## 6 2 3 2 1 5 5 4 4 1
## Item.20 Item.21 Item.22 Item.23 Item.24 Item.25 Item.26 Item.27 Item.28
## 1 4 3 3 4 1 3 5 1 3
## 2 4 3 3 NA 1 NA 4 1 4
## 3 4 3 3 1 1 3 4 1 3
## 4 4 3 1 5 2 5 4 1 3
## 5 4 3 3 3 1 1 5 1 3
## 6 4 3 3 4 1 1 4 1 4
## Item.29 Item.30 Item.31 Item.32
## 1 1 5 4 5
## 2 5 NA 4 NA
## 3 4 4 4 1
## 4 4 2 4 2
## 5 1 2 4 1
## 6 2 3 4 3
# correct answer key
key <- c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5)
scoredSAT12 <- key2binary(SAT12, key)
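# Sketch (not part of the original example): a base-R analogue of scoring
# responses against an answer key, using a toy 2 x 3 response matrix. This is
# not key2binary()'s implementation; missing responses simply stay NA under
# R's default comparison rules. All object names are hypothetical.
toy_resp <- matrix(c(1, 4, 2,
                     3, 4, NA), nrow = 2, byrow = TRUE)
toy_key <- c(1, 4, 5)
toy_scored <- sweep(toy_resp, 2, toy_key, "==") * 1L   # 1 = matches key, 0 = does not
toy_scored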
mod0 <- mirt(scoredSAT12, 1)
# for first 5 items use 2PLNRM and nominal
scoredSAT12[,1:5] <- as.matrix(SAT12[,1:5])
mod1 <- mirt(scoredSAT12, 1, c(rep('nominal',5),rep('2PL', 27)))
mod2 <- mirt(scoredSAT12, 1, c(rep('2PLNRM',5),rep('2PL', 27)), key=key)
coef(mod0)$Item.1
## a1 d g u
## par 0.8107167 -1.042366 0 1
coef(mod1)$Item.1
## a1 ak0 ak1 ak2 ak3 ak4 d0 d1 d2
## par -0.8773 0 0.5285937 1.116549 1.129355 4 0 -0.1909842 0.01877757
## d3 d4
## par -0.1258587 -5.652548
coef(mod2)$Item.1
## a1 d g u ak0 ak1 ak2 ak3 d0 d1
## par 0.8102548 -1.04233 0 1 0 -0.5653287 -0.5712706 -3.025613 0 0.2117761
## d2 d3
## par 0.06919723 -5.309272
itemplot(mod0, 1)
itemplot(mod1, 1)
itemplot(mod2, 1)
# compare added information from distractors
Theta <- matrix(seq(-4,4,.01))
par(mfrow = c(2,3))
for(i in 1:5){
info <- iteminfo(extract.item(mod0,i), Theta)
info2 <- iteminfo(extract.item(mod2,i), Theta)
plot(Theta, info2, type = 'l', main = paste('Information for item', i), ylab = 'Information')
lines(Theta, info, col = 'red')
}
par(mfrow = c(1,1))
# test information
plot(Theta, testinfo(mod2, Theta), type = 'l', main = 'Test information', ylab = 'Information')
lines(Theta, testinfo(mod0, Theta), col = 'red')
###########
# using the MH-RM algorithm
data(LSAT7)
fulldata <- expand.table(LSAT7)
(mod1 <- mirt(fulldata, 1, method = 'MHRM'))
##
## Call:
## mirt(data = fulldata, model = 1, method = "MHRM")
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 0.001 tolerance after 73 MHRM iterations.
## mirt version: 1.40
## M-step optimizer: NR1
## Latent density type: Gaussian
## Average MH acceptance ratio(s): 0.4
##
## Log-likelihood = -2659.08, SE = 0.018
## Estimated parameters: 10
## AIC = 5338.16
## BIC = 5387.237; SABIC = 5355.477
## G2 (21) = 32.1, p = 0.0572
## RMSEA = 0.023, CFI = NaN, TLI = NaN
# Confirmatory models
# simulate data
a <- matrix(c(
1.5,NA,
0.5,NA,
1.0,NA,
1.0,0.5,
NA,1.5,
NA,0.5,
NA,1.0,
NA,1.0),ncol=2,byrow=TRUE)
d <- matrix(c(
-1.0,NA,NA,
-1.5,NA,NA,
1.5,NA,NA,
0.0,NA,NA,
3.0,2.0,-0.5,
2.5,1.0,-1,
2.0,0.0,NA,
1.0,NA,NA),ncol=3,byrow=TRUE)
sigma <- diag(2)
sigma[1,2] <- sigma[2,1] <- .4
items <- c(rep('2PL',4), rep('graded',3), '2PL')
dataset <- simdata(a,d,2000,items,sigma)
# analyses
# CIFA for 2 factor crossed structure
model.1 <- '
F1 = 1-4
F2 = 4-8
COV = F1*F2'
# compute model, and use parallel computation of the log-likelihood
if(interactive()) mirtCluster()
mod1 <- mirt(dataset, model.1, method = 'MHRM')
coef(mod1)
## $Item_1
## a1 a2 d g u
## par 1.254 0 -0.968 0 1
##
## $Item_2
## a1 a2 d g u
## par 0.432 0 -1.563 0 1
##
## $Item_3
## a1 a2 d g u
## par 1.151 0 1.462 0 1
##
## $Item_4
## a1 a2 d g u
## par 0.968 0.556 0.096 0 1
##
## $Item_5
## a1 a2 d1 d2 d3
## par 0 1.82 3.193 2.104 -0.552
##
## $Item_6
## a1 a2 d1 d2 d3
## par 0 0.375 2.358 0.923 -0.967
##
## $Item_7
## a1 a2 d1 d2
## par 0 0.946 1.995 0.044
##
## $Item_8
## a1 a2 d g u
## par 0 0.836 0.853 0 1
##
## $GroupPars
## MEAN_1 MEAN_2 COV_11 COV_21 COV_22
## par 0 0 1 0.492 1
summary(mod1)
## F1 F2 h2
## Item_1 0.593 0.000 0.3519
## Item_2 0.246 0.000 0.0606
## Item_3 0.560 0.000 0.3137
## Item_4 0.476 0.273 0.3008
## Item_5 0.000 0.730 0.5333
## Item_6 0.000 0.215 0.0464
## Item_7 0.000 0.486 0.2358
## Item_8 0.000 0.441 0.1944
##
## SS loadings: 0.952 1.085
## Proportion Var: 0.119 0.136
##
## Factor correlations:
##
## F1 F2
## F1 1.000
## F2 0.492 1
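# Sketch (not part of the original example): for items loading on a single
# factor, the standardized loadings in summary(mod1) are consistent with
# a / sqrt(a^2 + 1.702^2) applied to the slopes from coef(mod1). The 1.702
# scaling constant is an assumption checked only against the printed output,
# and cross-loading items such as Item_4 are not covered by this shortcut.
a_single <- c(Item_1 = 1.254, Item_2 = 0.432, Item_5 = 1.82)
D_const <- 1.702
std_loading <- a_single / sqrt(a_single^2 + D_const^2)
round(std_loading, 3)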
residuals(mod1)
## LD matrix (lower triangle) and standardized values.
##
## Upper triangle summary:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.053 -0.027 0.000 -0.002 0.023 0.055
##
## Item_1 Item_2 Item_3 Item_4 Item_5 Item_6 Item_7 Item_8
## Item_1 NA 0.005 -0.002 0.002 -0.027 0.041 -0.027 0.017
## Item_2 0.044 NA 0.004 -0.017 0.022 -0.053 0.037 0.029
## Item_3 0.009 0.030 NA -0.002 0.055 -0.046 -0.029 0.013
## Item_4 0.007 0.605 0.007 NA -0.011 0.005 -0.038 -0.020
## Item_5 1.463 0.928 6.054 0.238 NA -0.028 -0.040 -0.037
## Item_6 3.302 5.623 4.150 0.050 4.780 NA 0.043 0.029
## Item_7 1.472 2.775 1.651 2.959 6.416 7.307 NA 0.029
## Item_8 0.596 1.716 0.314 0.840 2.809 1.695 1.648 NA
#####
# bifactor
model.3 <- '
G = 1-8
F1 = 1-4
F2 = 5-8'
mod3 <- mirt(dataset,model.3, method = 'MHRM')
coef(mod3)
## $Item_1
## a1 a2 a3 d g u
## par 0.784 1.341 0 -1.06 0 1
##
## $Item_2
## a1 a2 a3 d g u
## par 0.334 0.237 0 -1.556 0 1
##
## $Item_3
## a1 a2 a3 d g u
## par 0.669 0.754 0 1.404 0 1
##
## $Item_4
## a1 a2 a3 d g u
## par 1.171 0.647 0 0.1 0 1
##
## $Item_5
## a1 a2 a3 d1 d2 d3
## par 1.533 0 0.757 3.094 2.038 -0.529
##
## $Item_6
## a1 a2 a3 d1 d2 d3
## par 0.249 0 0.326 2.369 0.929 -0.972
##
## $Item_7
## a1 a2 a3 d1 d2
## par 0.783 0 0.753 2.076 0.049
##
## $Item_8
## a1 a2 a3 d g u
## par 0.703 0 0.58 0.873 0 1
##
## $GroupPars
## MEAN_1 MEAN_2 MEAN_3 COV_11 COV_21 COV_31 COV_22 COV_32 COV_33
## par 0 0 0 1 0 0 1 0 1
summary(mod3)
## G F1 F2 h2
## Item_1 0.340 0.582 0.000 0.4546
## Item_2 0.191 0.135 0.000 0.0547
## Item_3 0.338 0.381 0.000 0.2597
## Item_4 0.541 0.299 0.000 0.3819
## Item_5 0.636 0.000 0.314 0.5024
## Item_6 0.142 0.000 0.186 0.0548
## Item_7 0.388 0.000 0.373 0.2895
## Item_8 0.364 0.000 0.300 0.2227
##
## SS loadings: 1.266 0.592 0.362
## Proportion Var: 0.158 0.074 0.045
##
## Factor correlations:
##
## G F1 F2
## G 1
## F1 0 1
## F2 0 0 1
residuals(mod3)
## LD matrix (lower triangle) and standardized values.
##
## Upper triangle summary:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.054 -0.027 0.001 -0.001 0.021 0.053
##
## Item_1 Item_2 Item_3 Item_4 Item_5 Item_6 Item_7 Item_8
## Item_1 NA 0.007 -0.004 -0.010 -0.028 0.045 -0.021 0.021
## Item_2 0.085 NA 0.014 -0.019 -0.016 -0.054 0.031 0.021
## Item_3 0.033 0.365 NA 0.015 0.053 -0.038 -0.027 0.012
## Item_4 0.188 0.700 0.422 NA 0.012 0.022 0.035 -0.012
## Item_5 1.526 0.539 5.692 0.292 NA -0.028 -0.040 -0.038
## Item_6 4.005 5.742 2.931 1.011 4.548 NA 0.040 -0.028
## Item_7 0.890 1.897 1.434 2.390 6.321 6.251 NA 0.008
## Item_8 0.857 0.920 0.281 0.267 2.890 1.558 0.114 NA
anova(mod1,mod3)
## AIC SABIC HQ BIC logLik X2 df p
## mod1 24973.84 25029.58 25021.14 25102.66 -12463.92
## mod3 24979.27 25049.56 25038.91 25141.69 -12460.64 6.566 6 0.363
#####
# polynomial/combinations
data(SAT12)
data <- key2binary(SAT12,
key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
model.quad <- '
F1 = 1-32
(F1*F1) = 1-32'
model.combo <- '
F1 = 1-16
F2 = 17-32
(F1*F2) = 1-8'
(mod.quad <- mirt(data, model.quad))
## EM cycles terminated after 500 iterations.
##
## Call:
## mirt(data = data, model = model.quad)
##
## Full-information item factor analysis with 1 factor(s).
## FAILED TO CONVERGE within 1e-04 tolerance after 500 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -9464.029
## Estimated parameters: 96
## AIC = 19120.06
## BIC = 19542.16; SABIC = 19237.39
## G2 (4294967199) = 11258.34, p = 1
## RMSEA = 0, CFI = NaN, TLI = NaN
summary(mod.quad)
## F1 (F1*F1) h2
## Item.1 0.1909 0.3417 0.1532
## Item.2 0.5154 0.5475 0.5654
## Item.3 0.3784 0.3811 0.2884
## Item.4 0.1330 0.3300 0.1266
## Item.5 0.3933 0.3950 0.3107
## Item.6 0.3102 0.3921 0.2500
## Item.7 0.7031 0.3334 0.6055
## Item.8 0.1213 0.3248 0.1202
## Item.9 0.3072 0.1632 0.1210
## Item.10 0.4055 0.3388 0.2792
## Item.11 0.7867 0.5868 0.9632
## Item.12 0.0160 0.1053 0.0113
## Item.13 0.5065 0.3648 0.3896
## Item.14 0.3389 0.5612 0.4298
## Item.15 0.8126 0.4134 0.8312
## Item.16 0.1443 0.3799 0.1651
## Item.17 0.8581 0.4511 0.9398
## Item.18 0.5088 0.5139 0.5230
## Item.19 0.3068 0.3475 0.2149
## Item.20 0.5999 0.6378 0.7667
## Item.21 0.4551 0.2015 0.2477
## Item.22 0.7444 0.5370 0.8426
## Item.23 0.1293 0.3150 0.1159
## Item.24 0.6664 0.4234 0.6234
## Item.25 0.0474 0.4498 0.2046
## Item.26 0.3901 0.6232 0.5405
## Item.27 0.7282 0.5660 0.8507
## Item.28 0.3625 0.4055 0.2959
## Item.29 0.1756 0.4144 0.2026
## Item.30 0.2559 0.0881 0.0733
## Item.31 0.7106 0.6578 0.9377
## Item.32 0.1359 0.0643 0.0226
##
## SS loadings: 7.261 5.752
## Proportion Var: 0.227 0.18
##
## Factor correlations:
##
## F1
## F1 1
(mod.combo <- mirt(data, model.combo))
##
## Call:
## mirt(data = data, model = model.combo)
##
## Full-information item factor analysis with 2 factor(s).
## Converged within 1e-04 tolerance after 20 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 31
## Latent density type: Gaussian
##
## Log-likelihood = -9655.028
## Estimated parameters: 72
## AIC = 19454.06
## BIC = 19770.63; SABIC = 19542.05
## G2 (4294967223) = 11640.33, p = 1
## RMSEA = 0, CFI = NaN, TLI = NaN
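# The information criteria printed above follow the usual definitions, sketched
# here in base R. The log-likelihood and parameter count are copied by hand from
# the output; n_obs = 600 is an assumption about the number of rows in the SAT12
# data (check with nrow(data)). Object names are illustrative only.
ll_combo <- -9655.028
k_combo <- 72
n_obs <- 600
aic_combo <- -2 * ll_combo + 2 * k_combo
bic_combo <- -2 * ll_combo + log(n_obs) * k_combo
sabic_combo <- -2 * ll_combo + log((n_obs + 2) / 24) * k_combo
round(c(AIC = aic_combo, BIC = bic_combo, SABIC = sabic_combo), 2)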
anova(mod.combo, mod.quad)
##                AIC    SABIC       HQ      BIC    logLik      X2 df p
## mod.combo 19454.06 19542.05 19577.29 19770.63 -9655.028
## mod.quad  19120.06 19237.39 19284.38 19542.16 -9464.029 381.996 24 0
# non-linear item and test plots
plot(mod.quad)
plot(mod.combo, type = 'SE')
itemplot(mod.quad, 1, type = 'score')
itemplot(mod.combo, 2, type = 'score')
itemplot(mod.combo, 2, type = 'infocontour')
## empirical histogram examples (normal, skew and bimodality)
# make some data
set.seed(1234)
a <- matrix(rlnorm(50, .2, .2))
d <- matrix(rnorm(50))
ThetaNormal <- matrix(rnorm(2000))
ThetaBimodal <- scale(matrix(c(rnorm(1000, -2), rnorm(1000,2)))) #bimodal
ThetaSkew <- scale(matrix(rchisq(2000, 3))) #positive skew
datNormal <- simdata(a, d, 2000, itemtype = '2PL', Theta=ThetaNormal)
datBimodal <- simdata(a, d, 2000, itemtype = '2PL', Theta=ThetaBimodal)
datSkew <- simdata(a, d, 2000, itemtype = '2PL', Theta=ThetaSkew)
normal <- mirt(datNormal, 1, dentype = "empiricalhist")
plot(normal, type = 'empiricalhist')
histogram(ThetaNormal, breaks=30)
bimodal <- mirt(datBimodal, 1, dentype = "empiricalhist")
plot(bimodal, type = 'empiricalhist')
histogram(ThetaBimodal, breaks=30)
skew <- mirt(datSkew, 1, dentype = "empiricalhist")
plot(skew, type = 'empiricalhist')
histogram(ThetaSkew, breaks=30)
#####
# non-linear parameter constraints with Rsolnp package (nloptr supported as well):
# Fit a Rasch model subject to the constraint that the intercepts sum to 0
dat <- expand.table(LSAT6)
itemstats(dat)
## $overall
## N mean_total.score sd_total.score ave.r sd.r alpha
## 1000 3.819 1.035 0.077 0.03 0.295
##
## $itemstats
## N mean sd total.r total.r_if_rm alpha_if_rm
## Item_1 1000 0.924 0.265 0.362 0.113 0.275
## Item_2 1000 0.709 0.454 0.567 0.153 0.238
## Item_3 1000 0.553 0.497 0.618 0.173 0.217
## Item_4 1000 0.763 0.425 0.534 0.144 0.246
## Item_5 1000 0.870 0.336 0.435 0.122 0.266
##
## $proportions
## 0 1
## Item_1 0.076 0.924
## Item_2 0.291 0.709
## Item_3 0.447 0.553
## Item_4 0.237 0.763
## Item_5 0.130 0.870
# free latent mean and variance terms
model <- 'Theta = 1-5
MEAN = Theta
COV = Theta*Theta'
# view how vector of parameters is organized internally
sv <- mirt(dat, model, itemtype = 'Rasch', pars = 'values')
sv[sv$est, ]
##    group   item     class   name parnum     value lbound ubound  est prior.type
## 2    all Item_1      dich      d      2 2.8152981   -Inf    Inf TRUE       none
## 6    all Item_2      dich      d      6 1.0818304   -Inf    Inf TRUE       none
## 10   all Item_3      dich      d     10 0.2618655   -Inf    Inf TRUE       none
## 14   all Item_4      dich      d     14 1.4071275   -Inf    Inf TRUE       none
## 18   all Item_5      dich      d     18 2.2136968   -Inf    Inf TRUE       none
## 21   all  GROUP GroupPars MEAN_1     21 0.0000000   -Inf    Inf TRUE       none
## 22   all  GROUP GroupPars COV_11     22 1.0000000  1e-04    Inf TRUE       none
##    prior_1 prior_2
## 2      NaN     NaN
## 6      NaN     NaN
## 10     NaN     NaN
## 14     NaN     NaN
## 18     NaN     NaN
## 21     NaN     NaN
## 22     NaN     NaN
# constraint: create a function for solnp that computes the constraint, and declare its target value in eqB
eqfun <- function(p, optim_args) sum(p[1:5]) #could use browser() here, if it helps
LB <- c(rep(-15, 6), 1e-4) # more reasonable lower bound for variance term
mod <- mirt(dat, model, sv=sv, itemtype = 'Rasch', optimizer = 'solnp',
solnp_args=list(eqfun=eqfun, eqB=0, LB=LB))
print(mod)
##
## Call:
## mirt(data = dat, model = model, itemtype = "Rasch", optimizer = "solnp",
## solnp_args = list(eqfun = eqfun, eqB = 0, LB = LB), sv = sv)
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 34 EM iterations.
## mirt version: 1.40
## M-step optimizer: solnp
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -2466.943
## Estimated parameters: 7
## AIC = 4947.887
## BIC = 4982.241; SABIC = 4960.009
## G2 (25) = 21.81, p = 0.6467
## RMSEA = 0, CFI = NaN, TLI = NaN
coef(mod)
## $Item_1
## a1 d g u
## par 1 1.253 0 1
##
## $Item_2
## a1 d g u
## par 1 -0.475 0 1
##
## $Item_3
## a1 d g u
## par 1 -1.233 0 1
##
## $Item_4
## a1 d g u
## par 1 -0.168 0 1
##
## $Item_5
## a1 d g u
## par 1 0.623 0 1
##
## $GroupPars
## MEAN_1 COV_11
## par 1.472 0.559
(ds <- sapply(coef(mod)[1:5], function(x) x[,'d']))
## Item_1 Item_2 Item_3 Item_4 Item_5
## 1.2529600 -0.4754484 -1.2327360 -0.1681705 0.6233949
sum(ds)
## [1] 4.551914e-15
# same likelihood location as: mirt(dat, 1, itemtype = 'Rasch')
#######
# latent regression Rasch model
# simulate data
set.seed(1234)
N <- 1000
# covariates
X1 <- rnorm(N); X2 <- rnorm(N)
covdata <- data.frame(X1, X2)
Theta <- matrix(0.5 * X1 + -1 * X2 + rnorm(N, sd = 0.5))
# items and response data
a <- matrix(1, 20); d <- matrix(rnorm(20))
dat <- simdata(a, d, 1000, itemtype = '2PL', Theta=Theta)
# unconditional Rasch model
mod0 <- mirt(dat, 1, 'Rasch')
# conditional model using X1 and X2 as predictors of Theta
mod1 <- mirt(dat, 1, 'Rasch', covdata=covdata, formula = ~ X1 + X2)
coef(mod1, simplify=TRUE)
## $items
## a1 d g u
## Item_1 1 -0.409 0 1
## Item_2 1 0.491 0 1
## Item_3 1 0.313 0 1
## Item_4 1 1.965 0 1
## Item_5 1 1.753 0 1
## Item_6 1 -0.246 0 1
## Item_7 1 -1.077 0 1
## Item_8 1 0.533 0 1
## Item_9 1 -1.232 0 1
## Item_10 1 0.603 0 1
## Item_11 1 -0.404 0 1
## Item_12 1 1.238 0 1
## Item_13 1 1.033 0 1
## Item_14 1 1.524 0 1
## Item_15 1 -0.548 0 1
## Item_16 1 2.075 0 1
## Item_17 1 -0.695 0 1
## Item_18 1 -1.200 0 1
## Item_19 1 0.121 0 1
## Item_20 1 0.523 0 1
##
## $means
## F1
## 0
##
## $cov
## F1
## F1 0.215
##
## $lr.betas
## F1
## (Intercept) 0.000
## X1 0.527
## X2 -1.036
anova(mod0, mod1)
##           AIC    SABIC       HQ      BIC    logLik       X2 df p
## mod0 22246.88 22283.25 22286.06 22349.95 -11102.44
## mod1 21028.06 21067.89 21070.96 21140.94 -10491.03 1222.824  2 0
# bootstrapped confidence intervals (R = 5 only keeps the example fast; use far more replicates in practice)
boot.mirt(mod1, R=5)
##
## ORDINARY NONPARAMETRIC BOOTSTRAP
##
##
## Call:
## boot.mirt(x = mod1, R = 5)
##
##
## Bootstrap Statistics :
## original bias std. error
## t1* -0.4088935 0.0170682073 0.10757209
## t2* 0.4909630 0.0217407330 0.02500522
## t3* 0.3126422 0.0236496011 0.05634312
## t4* 1.9648322 0.0193659421 0.13606204
## t5* 1.7526211 -0.0455520674 0.11423101
## t6* -0.2460967 -0.0219137511 0.07868434
## t7* -1.0765218 0.0477122607 0.07798492
## t8* 0.5334115 0.0693270024 0.06221565
## t9* -1.2316515 0.0228112980 0.10239703
## t10* 0.6028956 0.0264900614 0.04916922
## t11* -0.4035988 -0.0239383377 0.07740224
## t12* 1.2376800 0.0166537013 0.09645754
## t13* 1.0329335 0.0157872957 0.14718882
## t14* 1.5237334 0.0387684021 0.08281241
## t15* -0.5478457 0.0005616874 0.03783416
## t16* 2.0750930 0.0770826881 0.09570574
## t17* -0.6953709 -0.0016625127 0.09420901
## t18* -1.2000275 -0.0246110382 0.06462390
## t19* 0.1210549 -0.0051398610 0.09276760
## t20* 0.5227785 0.0075559929 0.12634653
## t21* 0.2154905 -0.0178079178 0.02886265
## t22* 0.5265560 -0.0115324844 0.01713000
## t23* -1.0358089 0.0026048128 0.03126855
# draw plausible values for secondary analyses
pv <- fscores(mod1, plausible.draws = 10)
pvmods <- lapply(pv, function(x, covdata) lm(x ~ covdata$X1 + covdata$X2),
covdata=covdata)
# population characteristics are recovered well, and can be averaged over the draws
so <- lapply(pvmods, summary)
so
## [[1]]
##
## Call:
## lm(formula = x ~ covdata$X1 + covdata$X2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.65218 -0.29417 -0.01089 0.29842 1.68527
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.01902 0.01492 1.275 0.203
## covdata$X1 0.51111 0.01498 34.118 <2e-16 ***
## covdata$X2 -1.01823 0.01523 -66.869 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4715 on 997 degrees of freedom
## Multiple R-squared: 0.844, Adjusted R-squared: 0.8437
## F-statistic: 2697 on 2 and 997 DF, p-value: < 2.2e-16
##
##
## [[2]]
##
## Call:
## lm(formula = x ~ covdata$X1 + covdata$X2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.61627 -0.27923 0.00911 0.31364 1.64226
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.002276 0.014734 -0.154 0.877
## covdata$X1 0.534097 0.014798 36.094 <2e-16 ***
## covdata$X2 -1.023914 0.015041 -68.075 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4657 on 997 degrees of freedom
## Multiple R-squared: 0.8506, Adjusted R-squared: 0.8503
## F-statistic: 2838 on 2 and 997 DF, p-value: < 2.2e-16
##
##
## [[3]]
##
## Call:
## lm(formula = x ~ covdata$X1 + covdata$X2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.4279 -0.3262 -0.0127 0.3295 1.3146
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.005429 0.014503 0.374 0.708
## covdata$X1 0.520848 0.014565 35.760 <2e-16 ***
## covdata$X2 -1.032110 0.014805 -69.715 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4584 on 997 degrees of freedom
## Multiple R-squared: 0.8549, Adjusted R-squared: 0.8546
## F-statistic: 2938 on 2 and 997 DF, p-value: < 2.2e-16
##
##
## [[4]]
##
## Call:
## lm(formula = x ~ covdata$X1 + covdata$X2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.26106 -0.32445 0.01331 0.30410 1.85158
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.01529 0.01421 1.076 0.282
## covdata$X1 0.50907 0.01427 35.665 <2e-16 ***
## covdata$X2 -1.04927 0.01451 -72.320 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4492 on 997 degrees of freedom
## Multiple R-squared: 0.862, Adjusted R-squared: 0.8618
## F-statistic: 3115 on 2 and 997 DF, p-value: < 2.2e-16
##
##
## [[5]]
##
## Call:
## lm(formula = x ~ covdata$X1 + covdata$X2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.58631 -0.33425 -0.00987 0.29985 1.81571
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.01249 0.01513 0.825 0.409
## covdata$X1 0.52510 0.01520 34.551 <2e-16 ***
## covdata$X2 -1.03446 0.01545 -66.964 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4783 on 997 degrees of freedom
## Multiple R-squared: 0.845, Adjusted R-squared: 0.8446
## F-statistic: 2717 on 2 and 997 DF, p-value: < 2.2e-16
##
##
## [[6]]
##
## Call:
## lm(formula = x ~ covdata$X1 + covdata$X2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.5027 -0.3141 -0.0173 0.3090 1.5935
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.02112 0.01465 1.442 0.15
## covdata$X1 0.51405 0.01471 34.939 <2e-16 ***
## covdata$X2 -1.03108 0.01495 -68.946 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.463 on 997 degrees of freedom
## Multiple R-squared: 0.8516, Adjusted R-squared: 0.8513
## F-statistic: 2860 on 2 and 997 DF, p-value: < 2.2e-16
##
##
## [[7]]
##
## Call:
## lm(formula = x ~ covdata$X1 + covdata$X2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.3220 -0.3103 -0.0136 0.2995 1.3248
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.02001 0.01453 1.377 0.169
## covdata$X1 0.53710 0.01459 36.806 <2e-16 ***
## covdata$X2 -1.03760 0.01483 -69.953 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4593 on 997 degrees of freedom
## Multiple R-squared: 0.857, Adjusted R-squared: 0.8567
## F-statistic: 2988 on 2 and 997 DF, p-value: < 2.2e-16
##
##
## [[8]]
##
## Call:
## lm(formula = x ~ covdata$X1 + covdata$X2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.25005 -0.30923 -0.02161 0.30256 1.46965
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.01326 0.01433 -0.926 0.355
## covdata$X1 0.53552 0.01439 37.218 <2e-16 ***
## covdata$X2 -1.01942 0.01463 -69.701 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4528 on 997 degrees of freedom
## Multiple R-squared: 0.8569, Adjusted R-squared: 0.8566
## F-statistic: 2984 on 2 and 997 DF, p-value: < 2.2e-16
##
##
## [[9]]
##
## Call:
## lm(formula = x ~ covdata$X1 + covdata$X2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.47664 -0.30984 0.00685 0.30286 1.42925
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.006244 0.014698 -0.425 0.671
## covdata$X1 0.509733 0.014761 34.533 <2e-16 ***
## covdata$X2 -1.040088 0.015003 -69.323 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4645 on 997 degrees of freedom
## Multiple R-squared: 0.8521, Adjusted R-squared: 0.8518
## F-statistic: 2873 on 2 and 997 DF, p-value: < 2.2e-16
##
##
## [[10]]
##
## Call:
## lm(formula = x ~ covdata$X1 + covdata$X2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.44314 -0.32972 -0.01162 0.30826 1.61043
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.01103 0.01497 -0.737 0.461
## covdata$X1 0.53126 0.01504 35.329 <2e-16 ***
## covdata$X2 -1.01088 0.01529 -66.135 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.4733 on 997 degrees of freedom
## Multiple R-squared: 0.8435, Adjusted R-squared: 0.8432
## F-statistic: 2687 on 2 and 997 DF, p-value: < 2.2e-16
# compute Rubin's multiple imputation average
par <- lapply(so, function(x) x$coefficients[, 'Estimate'])
SEpar <- lapply(so, function(x) x$coefficients[, 'Std. Error'])
averageMI(par, SEpar)
## par SEpar t df p
## (Intercept) 0.006 0.020 0.299 39.635 0.192
## covdata$X1 0.523 0.019 27.657 58.425 0
## covdata$X2 -1.030 0.019 -53.389 57.078 0
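# The pooling above can be sketched by hand with Rubin's rules in base R. This
# example uses the covdata$X1 slope only, with estimates and standard errors
# copied by hand from the ten lm() summaries printed above. Object names are
# illustrative, and this is a sketch of the standard formulas rather than a
# copy of the averageMI() source.
est_X1 <- c(0.51111, 0.534097, 0.520848, 0.50907, 0.52510,
            0.51405, 0.53710, 0.53552, 0.509733, 0.53126)
se_X1 <- c(0.01498, 0.014798, 0.014565, 0.01427, 0.01520,
           0.01471, 0.01459, 0.01439, 0.014761, 0.01504)
m_draws <- length(est_X1)
pooled_X1 <- mean(est_X1)                        # pooled point estimate
W_var <- mean(se_X1^2)                           # within-imputation variance
B_var <- var(est_X1)                             # between-imputation variance
T_var <- W_var + (1 + 1 / m_draws) * B_var       # total variance
se_pooled_X1 <- sqrt(T_var)
df_pooled_X1 <- (m_draws - 1) * (1 + W_var / ((1 + 1 / m_draws) * B_var))^2
round(c(par = pooled_X1, SEpar = se_pooled_X1,
        t = pooled_X1 / se_pooled_X1, df = df_pooled_X1), 3)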
############
# Example using Gauss-Hermite quadrature with custom input functions
library(fastGHQuad)
## Loading required package: Rcpp
data(SAT12)
data <- key2binary(SAT12,
key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
GH <- gaussHermiteData(50)
Theta <- matrix(GH$x)
# This prior works for uni- and multi-dimensional models
prior <- function(Theta, Etable){
P <- grid <- GH$w / sqrt(pi)
if(ncol(Theta) > 1)
for(i in 2:ncol(Theta))
P <- expand.grid(P, grid)
if(!is.vector(P)) P <- apply(P, 1, prod)
P
}
GHmod1 <- mirt(data, 1, optimizer = 'NR',
technical = list(customTheta = Theta, customPriorFun = prior))
coef(GHmod1, simplify=TRUE)
## $items
## a1 d g u
## Item.1 1.133 -1.045 0 1
## Item.2 2.124 0.438 0 1
## Item.3 1.518 -1.141 0 1
## Item.4 0.826 -0.530 0 1
## Item.5 1.399 0.606 0 1
## Item.6 1.623 -2.049 0 1
## Item.7 1.419 1.383 0 1
## Item.8 0.979 -1.508 0 1
## Item.9 0.750 2.143 0 1
## Item.10 1.424 -0.360 0 1
## Item.11 2.453 5.252 0 1
## Item.12 0.229 -0.345 0 1
## Item.13 1.559 0.851 0 1
## Item.14 1.465 1.174 0 1
## Item.15 1.829 1.925 0 1
## Item.16 1.027 -0.382 0 1
## Item.17 2.193 4.165 0 1
## Item.18 2.404 -0.851 0 1
## Item.19 1.186 0.237 0 1
## Item.20 2.173 2.610 0 1
## Item.21 0.857 2.518 0 1
## Item.22 2.178 3.479 0 1
## Item.23 0.901 -0.850 0 1
## Item.24 1.705 1.270 0 1
## Item.25 1.091 -0.567 0 1
## Item.26 2.169 -0.171 0 1
## Item.27 2.709 2.770 0 1
## Item.28 1.512 0.174 0 1
## Item.29 1.181 -0.750 0 1
## Item.30 0.546 -0.248 0 1
## Item.31 3.304 2.785 0 1
## Item.32 0.183 -1.652 0 1
##
## $means
## F1
## 0
##
## $cov
## F1
## F1 1
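# A base-R side check on the grid used above, offered as a sketch rather than a
# statement about mirt internals. Gauss-Hermite nodes and weights for the weight
# function exp(-x^2) are rebuilt here with the Golub-Welsch eigenvalue method so
# that fastGHQuad is not needed; all object names are illustrative. With nodes x
# and prior masses w / sqrt(pi), the discrete prior has mean 0 and variance 0.5,
# so the slopes in GHmod1 may not be on the same scale as a fit that uses the
# default N(0, 1) grid. Using sqrt(2) * x as the nodes would give unit variance.
n_gh <- 50
off_diag <- sqrt(seq_len(n_gh - 1) / 2)          # Jacobi matrix off-diagonals
J_gh <- matrix(0, n_gh, n_gh)
J_gh[cbind(1:(n_gh - 1), 2:n_gh)] <- off_diag
J_gh[cbind(2:n_gh, 1:(n_gh - 1))] <- off_diag
eig_gh <- eigen(J_gh, symmetric = TRUE)
x_gh <- eig_gh$values                            # quadrature nodes
w_gh <- sqrt(pi) * eig_gh$vectors[1, ]^2         # quadrature weights
p_gh <- w_gh / sqrt(pi)                          # prior masses, as in prior()
c(total = sum(p_gh), mean = sum(p_gh * x_gh), var = sum(p_gh * x_gh^2))
c(var_rescaled = sum(p_gh * (sqrt(2) * x_gh)^2))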
Theta2 <- as.matrix(expand.grid(Theta, Theta))
GHmod2 <- mirt(data, 2, optimizer = 'NR', TOL = .0002,
technical = list(customTheta = Theta2, customPriorFun = prior))
summary(GHmod2, suppress=.2)
##
## Rotation: oblimin
##
## Rotated factor loadings:
##
## F1 F2 h2
## Item.1 0.545 0.3355
## Item.2 0.349 0.526 0.6130
## Item.3 0.386 0.365 0.4464
## Item.4 0.592 0.2801
## Item.5 0.259 0.454 0.4094
## Item.6 0.588 0.4867
## Item.7 0.868 0.5965
## Item.8 0.384 0.2475
## Item.9 0.639 -0.216 0.2942
## Item.10 0.547 0.4467
## Item.11 0.724 0.6794
## Item.12 -0.233 0.367 0.0894
## Item.13 0.622 0.4972
## Item.14 0.703 0.5177
## Item.15 0.803 0.6245
## Item.16 0.544 0.3088
## Item.17 0.635 0.230 0.6267
## Item.18 0.489 0.429 0.6676
## Item.19 0.259 0.382 0.3279
## Item.20 0.381 0.506 0.6267
## Item.21 0.669 -0.207 0.3287
## Item.22 0.601 0.265 0.6164
## Item.23 -0.210 0.698 0.3604
## Item.24 0.637 0.5324
## Item.25 0.715 0.4106
## Item.26 0.675 0.6471
## Item.27 0.680 0.251 0.7243
## Item.28 0.318 0.428 0.4427
## Item.29 0.601 0.3728
## Item.30 0.282 0.1058
## Item.31 0.428 0.572 0.7958
## Item.32 0.0124
##
## Rotated SS loadings: 6.485 6.008
##
## Factor correlations:
##
## F1 F2
## F1 1.000
## F2 0.583 1
############
# Davidian curve example
dat <- key2binary(SAT12,
key = c(1,4,5,2,3,1,2,1,3,1,2,4,2,1,5,3,4,4,1,4,3,3,4,1,3,5,1,3,1,5,4,5))
dav <- mirt(dat, 1, dentype = 'Davidian-4') # use four smoothing parameters
plot(dav, type = 'Davidian') # shape of latent trait distribution
coef(dav, simplify=TRUE)
## $items
## a1 d g u
## Item.1 0.773 -1.049 0 1
## Item.2 1.698 0.496 0 1
## Item.3 1.056 -1.131 0 1
## Item.4 0.591 -0.541 0 1
## Item.5 1.054 0.614 0 1
## Item.6 1.041 -2.023 0 1
## Item.7 1.103 1.396 0 1
## Item.8 0.652 -1.517 0 1
## Item.9 0.545 2.130 0 1
## Item.10 1.016 -0.353 0 1
## Item.11 2.106 5.430 0 1
## Item.12 0.165 -0.351 0 1
## Item.13 1.203 0.869 0 1
## Item.14 1.182 1.205 0 1
## Item.15 1.428 1.940 0 1
## Item.16 0.736 -0.388 0 1
## Item.17 1.850 4.266 0 1
## Item.18 1.766 -0.786 0 1
## Item.19 0.880 0.238 0 1
## Item.20 1.859 2.723 0 1
## Item.21 0.653 2.514 0 1
## Item.22 1.867 3.596 0 1
## Item.23 0.592 -0.855 0 1
## Item.24 1.361 1.304 0 1
## Item.25 0.742 -0.569 0 1
## Item.26 1.666 -0.121 0 1
## Item.27 2.331 2.925 0 1
## Item.28 1.091 0.181 0 1
## Item.29 0.811 -0.751 0 1
## Item.30 0.358 -0.257 0 1
## Item.31 2.947 3.057 0 1
## Item.32 0.178 -1.664 0 1
##
## $means
## F1
## 0
##
## $cov
## F1
## F1 1
##
## $Davidian_phis
## [1] 1.292 0.084 -0.448 1.243
fs <- fscores(dav) # assume normal prior
fs2 <- fscores(dav, use_dentype_estimate=TRUE) # use Davidian estimated prior shape
head(cbind(fs, fs2))
## F1 F1
## [1,] 2.669533776 3.597900305
## [2,] -0.006610903 -0.061281001
## [3,] 0.069728060 0.006894763
## [4,] -0.412043885 -0.423012837
## [5,] 0.669953895 0.560980681
## [6,] 0.448040211 0.349319732
itemfit(dav) # assume normal prior
## item S_X2 df.S_X2 RMSEA.S_X2 p.S_X2
## 1 Item.1 12.497 15 0.000 0.641
## 2 Item.2 13.571 11 0.020 0.258
## 3 Item.3 13.530 13 0.008 0.408
## 4 Item.4 28.858 16 0.037 0.025
## 5 Item.5 15.832 14 0.015 0.324
## 6 Item.6 15.468 12 0.022 0.217
## 7 Item.7 11.836 13 0.000 0.541
## 8 Item.8 23.748 14 0.034 0.049
## 9 Item.9 16.351 14 0.017 0.292
## 10 Item.10 15.347 14 0.013 0.355
## 11 Item.11 NaN 0 NaN NaN
## 12 Item.12 20.253 17 0.018 0.262
## 13 Item.13 22.881 12 0.039 0.029
## 14 Item.14 25.390 13 0.040 0.021
## 15 Item.15 19.336 12 0.032 0.081
## 16 Item.16 30.806 15 0.042 0.009
## 17 Item.17 14.287 7 0.042 0.046
## 18 Item.18 16.175 11 0.028 0.135
## 19 Item.19 20.637 14 0.028 0.111
## 20 Item.20 24.701 9 0.054 0.003
## 21 Item.21 25.583 13 0.040 0.019
## 22 Item.22 25.827 7 0.067 0.001
## 23 Item.23 15.944 15 0.010 0.386
## 24 Item.24 12.546 11 0.015 0.324
## 25 Item.25 42.759 15 0.056 0.000
## 26 Item.26 15.351 12 0.022 0.223
## 27 Item.27 7.887 8 0.000 0.445
## 28 Item.28 20.811 14 0.028 0.107
## 29 Item.29 18.062 15 0.018 0.259
## 30 Item.30 29.596 17 0.035 0.029
## 31 Item.31 13.868 7 0.040 0.054
## 32 Item.32 15.633 16 0.000 0.479
itemfit(dav, use_dentype_estimate=TRUE) # use Davidian estimated prior shape
## item S_X2 df.S_X2 RMSEA.S_X2 p.S_X2
## 1 Item.1 12.597 15 0.000 0.633
## 2 Item.2 12.672 11 0.016 0.315
## 3 Item.3 11.641 12 0.000 0.475
## 4 Item.4 30.536 16 0.039 0.015
## 5 Item.5 16.870 14 0.018 0.263
## 6 Item.6 14.090 12 0.017 0.295
## 7 Item.7 12.041 13 0.000 0.524
## 8 Item.8 24.325 14 0.035 0.042
## 9 Item.9 16.274 14 0.016 0.297
## 10 Item.10 14.593 13 0.014 0.333
## 11 Item.11 NaN 0 NaN NaN
## 12 Item.12 18.270 17 0.011 0.372
## 13 Item.13 22.897 12 0.039 0.029
## 14 Item.14 25.147 13 0.039 0.022
## 15 Item.15 19.873 12 0.033 0.070
## 16 Item.16 30.994 14 0.045 0.006
## 17 Item.17 14.838 7 0.043 0.038
## 18 Item.18 15.618 11 0.026 0.156
## 19 Item.19 20.596 14 0.028 0.112
## 20 Item.20 24.158 9 0.053 0.004
## 21 Item.21 15.163 11 0.025 0.175
## 22 Item.22 28.286 8 0.065 0.000
## 23 Item.23 16.309 15 0.012 0.362
## 24 Item.24 12.499 11 0.015 0.327
## 25 Item.25 44.846 15 0.058 0.000
## 26 Item.26 14.583 11 0.023 0.202
## 27 Item.27 7.679 8 0.000 0.465
## 28 Item.28 22.298 14 0.031 0.073
## 29 Item.29 18.642 14 0.024 0.179
## 30 Item.30 27.672 17 0.032 0.049
## 31 Item.31 13.611 8 0.034 0.092
## 32 Item.32 17.803 16 0.014 0.336
###########
# 5PL and restricted 5PL example
dat <- expand.table(LSAT7)
mod2PL <- mirt(dat)
mod2PL
##
## Call:
## mirt(data = dat)
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 28 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -2658.805
## Estimated parameters: 10
## AIC = 5337.61
## BIC = 5386.688; SABIC = 5354.927
## G2 (21) = 31.7, p = 0.0628
## RMSEA = 0.023, CFI = NaN, TLI = NaN
# The following does not converge unless strong priors are included
# mod5PL <- mirt(dat, itemtype = '5PL')
# mod5PL
# restricted version of 5PL (asymmetric 2PL)
model <- 'Theta = 1-5
FIXED = (1-5, g), (1-5, u)'
mod2PL_asym <- mirt(dat, model=model, itemtype = '5PL')
mod2PL_asym
##
## Call:
## mirt(data = dat, model = model, itemtype = "5PL")
##
## Full-information item factor analysis with 1 factor(s).
## Converged within 1e-04 tolerance after 232 EM iterations.
## mirt version: 1.40
## M-step optimizer: BFGS
## EM acceleration: Ramsay
## Number of rectangular quadrature: 61
## Latent density type: Gaussian
##
## Log-likelihood = -2657.872
## Estimated parameters: 15
## AIC = 5345.743
## BIC = 5419.36; SABIC = 5371.719
## G2 (16) = 29.83, p = 0.0189
## RMSEA = 0.029, CFI = NaN, TLI = NaN
coef(mod2PL_asym, simplify=TRUE)
## $items
## a1 d g u logS
## Item.1 0.923 2.975 0 1 1.052
## Item.2 2.290 -1.769 0 1 -1.547
## Item.3 1.596 2.022 0 1 0.224
## Item.4 0.608 2.345 0 1 1.633
## Item.5 0.742 2.039 0 1 0.163
##
## $means
## Theta
## 0
##
## $cov
## Theta
## Theta 1
coef(mod2PL_asym, simplify=TRUE, IRTpars=TRUE)
## $items
## a b g u S
## Item.1 0.923 -3.225 0 1 2.863
## Item.2 2.290 0.772 0 1 0.213
## Item.3 1.596 -1.267 0 1 1.251
## Item.4 0.608 -3.855 0 1 5.120
## Item.5 0.742 -2.747 0 1 1.177
##
## $means
## Theta
## 0
##
## $cov
## Theta
## Theta 1
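# The IRTpars = TRUE output above can be related to the default slope-intercept
# output with two transformations, sketched here in base R. The values are copied
# by hand from coef(mod2PL_asym, simplify=TRUE). That b = -d/a and S = exp(logS)
# is inferred from comparing the two printed tables, which it matches up to
# rounding. Object names are illustrative only.
a_asym <- c(0.923, 2.290, 1.596, 0.608, 0.742)
d_asym <- c(2.975, -1.769, 2.022, 2.345, 2.039)
logS_asym <- c(1.052, -1.547, 0.224, 1.633, 0.163)
irt_asym <- cbind(a = a_asym, b = -d_asym / a_asym, S = exp(logS_asym))
rownames(irt_asym) <- paste0('Item.', 1:5)
round(irt_asym, 3)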
# no big difference statistically or visually
anova(mod2PL, mod2PL_asym)
##                  AIC    SABIC       HQ      BIC    logLik    X2 df     p
## mod2PL      5337.610 5354.927 5356.263 5386.688 -2658.805
## mod2PL_asym 5345.743 5371.719 5373.723 5419.360 -2657.872 1.867  5 0.867
plot(mod2PL, type = 'trace')
plot(mod2PL_asym, type = 'trace')
## End(No test)