We introduce a shiny web application to facilitate the construction of Item Factor Analysis (a.k.a. Item Response Theory) models using the *OpenMx* package. The web application assists with importing data, recoding outcomes, and specifying the model. However, the app does not conduct any analysis itself; rather, it generates an analysis script. The generated Rmarkdown output serves dual purposes: to analyze a data set and to demonstrate good programming practices. The app can be used as a teaching tool or as a starting point for custom analysis scripts.

*OpenMx*, a modular package
originally designed for structural equation modeling (Neale et al. in press),
recently gained the ability to handle Item Factor Analysis (a.k.a. Item
Response Theory, Modern Test Theory) models (Pritikin et al. 2015). Although a
goal of *OpenMx* is to cater to the statistical power user and
facilitate analyses that are difficult to conduct in other software, the
development team is always on the lookout for ways to ease the learning
curve for novice users as well. Here we introduce a new
*shiny* (RStudio, Inc. 2014) web
application to generate *OpenMx* code in
*Rmarkdown* format
(Allaire et al. 2014). We believe this code generator substantially lowers the
barrier to entry for novice users of Item Factor Analysis (IFA) and
encourages a culture of literate programming (Knuth 1984) and
reproducible science (Peng 2011; Nosek et al. 2015). The generated code can be
customized at many levels. This flexibility enables the production of
custom analyses and reports as users grow more sophisticated in their
modeling expectations.

Item analysis is concerned with items that are scored correct/incorrect
or on an ordinal scale. Many psychological surveys use an ordinal scale.
For example, participants may be asked to respond to an item like, “I
feel I am in charge of the situation in which I live.” on a 5-point
Likert scale from *agree* to *disagree*. Whether dichotomous or ordinal,
the conditional likelihood of response \(x_{ij}\) to item \(j\) from person
\(i\) with item parameters \(\xi_j\) and latent ability (a.k.a. latent
trait) \(\theta_i\) is
\[\begin{aligned}
\label{eq:response}
L(x_i|\xi,\theta_i) = \prod_j \mathrm{Pr}(\mathrm{pick}=x_{ij} | \xi_j,\theta_i).
\end{aligned} \tag{1}\]
One implication of Equation (1) is that items are assumed
to be conditionally independent given the latent ability \(\theta_i\).
That is, the outcome of one item does not have any influence on another
item after controlling for \(\xi\) and \(\theta_i\). The unconditional
likelihood is obtained by integrating over the latent distribution
\(\theta_i\),
\[\begin{aligned}
\label{eq:integral}
L(x_i|\xi) = \int L(x_i|\xi, \theta_i) L(\theta_i) \mathrm{d}\theta_i.
\end{aligned} \tag{2}\]
With an assumption that examinees are independently and identically
distributed, we can sum the individual log likelihoods,
\[\begin{aligned}
\label{eq:ifa-model}
\mathcal{L}=\sum_i \log L(x_i | \xi).
\end{aligned} \tag{3}\]
Optimization consists of finding the \(\xi\) that maximizes this function.
*OpenMx* presently offers only one choice for optimization, an
Expectation-Maximization algorithm using equal interval quadrature to
evaluate the integral in Equation (2) (Bock and Aitkin 1981). In the
future, we plan to add complementary algorithms such as
Metropolis-Hastings Robbins-Monro, which is more efficient at optimizing
certain problems (Cai 2010b).
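The estimation strategy of Equations (1)-(3) can be sketched in a few lines of R. The function below evaluates the marginal log-likelihood of dichotomous data under a hypothetical 2PL model (slope `a`, intercept `b`) using equal-interval quadrature over a standard normal latent distribution; it illustrates the math and is not OpenMx's internal implementation.

```
## Marginal log-likelihood (Equations 1-3) for a bank of 2PL items,
## evaluated by equal-interval quadrature. Illustrative sketch only.
marginalLogLik <- function(X, a, b, qpoints = 49, qwidth = 6) {
  theta <- seq(-qwidth, qwidth, length.out = qpoints)
  w <- dnorm(theta); w <- w / sum(w)          # latent distribution weights
  ## qpoints x items matrix of Pr(pick = 1 | theta, item)
  P <- plogis(sweep(outer(theta, a), 2, b, "+"))
  sum(apply(X, 1, function(x) {
    ## Equation 1: conditional independence given theta
    cond <- apply(P ^ rep(x, each = qpoints) *
                  (1 - P) ^ rep(1 - x, each = qpoints), 1, prod)
    log(sum(w * cond))                        # Equation 2, summed as in Eq. 3
  }))
}
```

Optimization then amounts to searching for the `a` and `b` that maximize this function.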

Several models are readily available to plug in as the response probability function \(\mathrm{Pr}(\mathrm{pick}=x_{ij} | \xi_j,\theta_i)\) in Equation (1). All of these response probability functions are built from the logistic function, \[\text{logistic}(l) \equiv \text{logit}^{-1}(l) \equiv \frac{1}{1+\exp(-l)}.\] Details of the parameterizations are given here; a more intuitive discussion of these item models follows in the next section.

The dichotomous response probability function can model items when there are exactly two possible outcomes. It is defined as, \[\begin{aligned} \label{eq:dichotomous} \mathrm{Pr}(\text{pick}=0| a,b,g,u,\tau) &= 1- \mathrm{Pr}(\text{pick}=1| a,b,g,u,\tau) \\ \mathrm{Pr}(\text{pick}=1| a,b,g,u,\tau) &= \text{logit}^{-1}(g)+(\text{logit}^{-1}(u)-\text{logit}^{-1}(g))\frac{1}{1+\exp(-( a\tau + b))} \end{aligned} \tag{4}\]

where \(a\) is the slope, \(b\) is the intercept, \(g\) is the pseudo-guessing
lower asymptote expressed in logit units, \(u\) is the upper asymptote
expressed in logit units, and \(\tau\) is the latent ability of the
examinee (Birnbaum 1968; Loken and Rulison 2010). A *#PL* naming shorthand has been
developed to refer to versions of the dichotomous model with different
numbers of free parameters. Model \(n\)PL refers to the model obtained by
freeing the first \(n\) of parameters \(b\), \(a\), \(g\), and \(u\).
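As a concrete rendering of Equation (4), the sketch below evaluates the dichotomous response probability in R, with `plogis()` serving as \(\text{logit}^{-1}\); the parameter values in the usage line are hypothetical.

```
## Dichotomous response probability (Equation 4).
## g and u are in logit units; g = -Inf, u = Inf reduces to the 2PL.
dichotomousProb <- function(tau, a, b, g = -Inf, u = Inf) {
  p1 <- plogis(g) + (plogis(u) - plogis(g)) * plogis(a * tau + b)
  c(incorrect = 1 - p1, correct = p1)
}
dichotomousProb(tau = 0, a = 1, b = 0)   # 0.5, 0.5
```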

The graded response model is a response probability function for 2 or more outcomes (Samejima 1969; Cai 2010b). For outcomes \(k\) in \(0\) to \(K\), slope vector \(a\), intercept vector \(b\), and latent ability vector \(\tau\), it is defined as, \[\begin{aligned} \mathrm{Pr}(\text{pick}=K| a, b,\tau) &= \frac{1}{1+\exp(-( a\tau + b_K))}\\ \mathrm{Pr}(\text{pick}=k| a, b,\tau) &= \frac{1}{1+\exp(-( a\tau + b_k))} - \frac{1}{1+\exp(-( a\tau + b_{k+1}))} \\ \mathrm{Pr}(\text{pick}=0| a, b,\tau) &= 1- \mathrm{Pr}(\text{pick}=1| a,b_1,\tau). \end{aligned}\]
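These equations translate directly into R. The sketch below assumes, as the model requires, that the \(K\) intercepts in `b` are ordered so that every category probability is positive.

```
## Graded response model: probabilities for outcomes 0..K given
## slope a and a length-K intercept vector b (decreasing in k).
grmProb <- function(tau, a, b) {
  cum <- c(1, plogis(a * tau + b), 0)   # Pr(pick >= k) for k = 0 .. K+1
  -diff(cum)                            # Pr(pick =  k) for k = 0 .. K
}
grmProb(tau = 0, a = 1, b = c(1, -1))   # three category probabilities
```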

The nominal model is a response probability function for items with 3 or more outcomes (e.g., Thissen et al. 2010). It can be defined as, \[\begin{aligned} \label{eq:nominalT} a &= T_a \alpha \\ c &= T_c \gamma \\ \mathrm{Pr}(\text{pick}=k| s,a_k,c_k,\tau) &= C\ \frac{1}{1+\exp(-( s \tau a_k + c_k))} \end{aligned} \tag{5}\] where \(a_k\) and \(c_k\) are the result of multiplying two vectors of free parameters \(\alpha\) and \(\gamma\) by fixed matrices \(T_a\) and \(T_c\), respectively; \(a_0\) and \(c_0\) are fixed to 0 for identification; \(s\) is the per-item slope; and \(C\) is a normalizing constant to ensure that \(\sum_k \mathrm{Pr}(\text{pick}=k) = 1\).
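A literal R transcription of Equation (5) follows; the identity matrices used here for \(T_a\) and \(T_c\) are hypothetical stand-ins for the trend basis discussed later.

```
## Nominal model (Equation 5) for outcomes 0..K.
nominalProb <- function(tau, s, alpha, gamma, Ta, Tc) {
  ak <- c(0, Ta %*% alpha)     # a_0 fixed to 0 for identification
  ck <- c(0, Tc %*% gamma)     # c_0 fixed to 0 for identification
  p <- plogis(s * tau * ak + ck)
  p / sum(p)                   # C normalizes so probabilities sum to 1
}
nominalProb(tau = 0, s = 1, alpha = c(0, 0), gamma = c(0, 0),
            Ta = diag(2), Tc = diag(2))   # uniform: 1/3 each
```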

Modern test theory employs item models, \(\mathrm{Pr}(\mathrm{pick}=x_{ij} | \xi_j,\theta_i)\) (from Equation (1)). To better appreciate how modern test theory works, it is helpful to develop an intuitive understanding of item models. The essential idea is the conversion of ordinal (or dichotomous) data into continuous data conditional on examinee skill. In Figure 1, the black dots represent the dichotomous data. Here we assume that examinee skill is known so that we can plot the black dots at the appropriate place on the \(x\) axis. The next step is to partition the \(x\) axis into equal interval bins. The proportion of examinees who responded correctly is displayed in blue in the middle of each bin. These blue numbers are our new continuous data, conditional on examinee skill. While we assumed that examinee skill was known, this assumption is actually unnecessary. The optimization algorithm can make a rough estimate of examinee skill, proceed to improve the model, and repeat this process until change is less than some epsilon.
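The binning construction described above is easy to reproduce: simulate responses to a single hypothetical item (here with slope 1 and intercept 0) from known skill, cut the skill axis into equal-interval bins, and take the proportion correct in each bin (the blue numbers in Figure 1).

```
set.seed(1)
theta <- rnorm(500)                        # examinee skill, assumed known
correct <- rbinom(500, 1, plogis(theta))   # dichotomous item, a = 1, b = 0
bins <- cut(theta, breaks = seq(-3, 3, by = 1))
tapply(correct, bins, mean)                # continuous data, given skill
```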

To further inform your intuition about item models, it can be helpful to
place yourself in the position of the optimization algorithm. Enter the
following commands to launch the model explorer tool and browse to the
output web server address. It is possible to do this without *RStudio*,
but we recommend *RStudio* because it makes everything easier.
Note that the port number (`3726` printed below) may be different on
your computer.

```
> library(ifaTools)
> itemModelExplorer()
Listening on http://127.0.0.1:3726
```

Your browser should show a screen similar to Figure 2. Try
experimenting with all the controls. Early in the development of item
models, model parameters closely corresponded to the psychological
concepts of *difficulty* and *discrimination* (Birnbaum 1968). For
example, difficult items are only answered correctly by the brightest
examinees while most examinees may correctly answer easy items.
*Discrimination* quantifies how much we learn from a given response.
Well-designed items discriminate examinee skill. The causes of poor item
discrimination are many. An item may be hurt by awkward wording, by
asking examinees something that is somewhat off-topic, or by asking the
same question in slightly different ways.

Some item model parameters still retain a close connection to difficulty and discrimination. For example, the dichotomous model’s \(a\) parameter corresponds with discrimination and the negative \(b\) parameter divided by \(a\) corresponds with difficulty (Equation (4)). However, as item models have grown more flexible, the parameter values and intuitive interpretation have become more distant. To understand item models in general, it is helpful to plot category curves and information by the latent trait (Figure 2). Some examples of latent traits which can be measured in this framework are mathematical skill, vocabulary, or sleep quality.
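For instance, with hypothetical dichotomous model parameters, the difficulty (the trait level where the correct and incorrect curves cross) can be recovered as \(-b/a\):

```
a <- 1.5; b <- -0.3          # hypothetical slope and intercept
difficulty <- -b / a         # trait level where Pr(correct) = 0.5
plogis(a * difficulty + b)   # confirms 0.5
```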

The way to interpret these plots is to start by picking a value for the latent trait. Suppose we know that examinee Alice has a latent ability of 2 logit units. If we trace across the plot where the \(x\) axis is 2 then we find that Alice has a 75% chance of getting the item correct (blue curve) and a 25% chance of getting it incorrect (red curve). In addition, we find that this item will give us 0.05 units of information about Alice (black curve). The difficulty of the item is where the correct and incorrect curves cross at about 0.2 logits. The discrimination of the item is given by the information plot. This item provides more information about examinees with latent skill between \(-1\) and \(2\) than elsewhere on the skill continuum.
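Reading these values off a plot can be mimicked numerically. For a 2PL item (asymptotes fixed), the Fisher information at trait level \(\tau\) is \(a^2 P (1-P)\); the parameter values below are hypothetical, chosen only to roughly resemble the item discussed here.

```
## Pr(correct), Pr(incorrect), and 2PL item information at trait tau.
traceItem <- function(tau, a, b) {
  p <- plogis(a * tau + b)
  c(correct = p, incorrect = 1 - p, info = a^2 * p * (1 - p))
}
traceItem(tau = 2, a = 0.6, b = -0.12)
```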

Much can be gleaned about item models by inspection of these plots.
However, it is worth conveying a few additional details specific to
particular item models. The dichotomous model’s \(g\) and \(u\) asymptote
parameters are in logit units. To transform these values back into
probabilities, use R’s `plogis` function. The \(g\) parameter may represent
the chance of an examinee guessing the item correctly. This parameter is
also often called the pseudo-guessing parameter due to the probability
of a low ability examinee getting an item correct at a non-zero
asymptote. The \(u\) parameter, or upper asymptote parameter, may
represent the chance of an examinee committing a careless mistake,
reflecting high ability examinee behavior. In this case, the upper
asymptote never reaches one (Loken and Rulison 2010).
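For example, hypothetical asymptote estimates in logit units convert back to probabilities with `plogis()`:

```
g <- -1.1                 # pseudo-guessing asymptote, logit units
u <-  3.0                 # upper asymptote, logit units
plogis(c(g = g, u = u))   # roughly 0.25 and 0.95
```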

By default, the nominal model uses *trend* for the `T.a` and `T.c`
matrices (Equation (5)). This parameterization is also known as the
Fourier basis. The effect is that the `alf` and `gam` parameters control
variation from the lowest to the highest frequency. To develop an
intuition for how this works, set all parameters to zero, then set `a`,
`alf1`, and `gam2` to 1. Experiment with the `gam` parameters before you
experiment with the `alf` parameters. Refer to Thissen et al. (2010) for
discussion of the possibilities of this item model. Custom `T.a` and
`T.c` matrices are not available in the model explorer app, but they can
be specified in R code.

The “`Show/Edit parameters`” checkbox has a special didactic purpose.
Open two copies of the item model explorer. On one copy, un-check the
“`Show/Edit parameters`” checkbox to hide the parameters and click the
“`Draw new parameters`” button. On the second copy of the item model
explorer, adjust the item model parameters to try to match the plot
produced on the first item model explorer. You can check your answers by
checking the “`Show/Edit parameters`” checkbox. When you play this game,
you are doing part of what the optimization algorithm does when it fits
models to data. Note that there is no need to practice this skill; the
computer will do it for you.

Enter the following commands to launch the model builder tool and browse
to the output web server address. As before, this works without
*RStudio*, but we recommend *RStudio* because it makes everything
easier. Note that the port number (`3726` printed below) may be
different on your computer.

```
> library(ifaTools)
> modelBuilder()
Listening on http://127.0.0.1:3726
```

Your browser should show a screen similar to Figure 3. Take care not to
hit the `Reload` button because that will reset the app. Learn how to
save your work (detailed below) before you experiment with the `Reload`
button. Across the top are tabs that organize the major functions of the
model builder app. On the left side is a control panel for the currently
selected tab, `Data`. Example data sets are available at the bottom of
the control panel. You are welcome to experiment with these, but we will
focus on the process of loading an external data set. If you prefer to
follow along with a video, then browse to http://youtu.be/xHeb5_CWnCk
for dichotomous data or http://youtu.be/iwtpleltteQ for polytomous data.

Click on the “`Choose File`” button^{1} and select `g341-19.csv`, a
dichotomous data set that is available in the *ifaTools* package
(Pritikin 2015a). The first 6 lines will appear in the
“`Unparsed content`” section (see Figure 4).^{2} This makes it easy to
see how the file is formatted. The “`Parsed content`” section reports an
error. By reading the error carefully, you will find mention that
“duplicate ‘row.names’ are not allowed.” Since “`Row names?`” is enabled
in the control panel, the model builder app expects the first column of
data to be dedicated to row names. A row name is typically the
examinee’s name or numeric identifier. A glance at the unparsed content
reveals that no row names have been given in this data set.

Click the “`Row names?`” checkbox in the control panel to disable row
names. Immediately (without reloading the data), the error message in
the “`Parsed content`” section will be replaced by some of the data
organized into a single column. The column name will read
`X010111111100`. A column name like that should raise suspicion. Since
the “`Header?`” checkbox is enabled in the control panel, the model
builder app expects the first line of the data to contain column names.
Therefore, the first line of data is misinterpreted.

Click the “`Header?`” checkbox in the control panel to disable column
names. The column in the “`Parsed content`” section will now be labeled
`V1`. Click on the “`Item summary`” control as an alternate way to
verify that the data is loaded and parsed accurately. The main content
area includes two elements, a selection for the “`Row frequency column`”
and a table of items by `Outcomes` and `Missing` (see Figure 5). The
“`Row frequency column`” selection is used when you have already reduced
your data to unique rows and row counts. The example data set `LSAT6` is
in this format. For our current data set, leave “`Row frequency column`”
set to \(-\).

The information conveyed in the item table should rouse suspicion. There
is only 1 row (or 1 item) with 721 outcomes. What happened is that the
parsing is still not working and all the items are treated as a single
column. For example, the first row “0 1 0 1 1 1 1 1 1 1 0 0” is not
treated as 12 separate numbers but as a single uninterpreted unit. To
fix the parsing, we need to select `Whitespace` as the `Separator` in
the control panel. With this last error corrected, the table is updated
with 12 items labeled V1, V2, …, V12, all with 2 outcomes. With the
data correctly loaded, click on the “`Outcomes`” tab on the top bar.
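The three corrections made in the app (disable row names, disable the header, whitespace separator) correspond to `read.table()` arguments. A self-contained sketch using the first data row quoted above:

```
txt <- "0 1 0 1 1 1 1 1 1 1 0 0"
raw <- read.table(text = txt, header = FALSE, sep = "")  # sep = "" is whitespace
ncol(raw)        # 12 items, labeled V1 ... V12
```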

The control panel on the “`Outcomes`” tab packs a lot of functionality
(Figure 6). The first two selectors are mutually exclusive and permit
working with all items having the same outcomes or with specific items,
respectively. The outcome set “V1” is currently selected, as seen in the
control panel on the left side. In these data, all items have the same
possible outcomes (0 or 1). Therefore, there is only one outcome set.
The name “V1” does not refer to just the item “V1” but to all items,
because all items have the same outcomes.

For clarity, it is often helpful to rename outcomes. The
“`Recode from`” selector should have “0” selected. Change the `to`
selector to `<Rename>`, enter “incorrect” for the “`New name`” value,
and click the “`Add mapping`” button. This will create a recoding rule
that will show up in the “`Recode Table`” output (Figure 7). Similarly,
rename the “1” outcome to “correct” and again click the “`Add mapping`”
button. At this point, you should have 2 rules in the “`Recode Table`”
output.
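The two recode rules amount to relabeling the outcome levels. In plain R, the equivalent operation is a factor relabeling (OpenMx's `mxFactor()` plays the analogous role for ordinal data); the vector below is hypothetical:

```
v1 <- c(0, 1, 0, 1, 1)
v1 <- factor(v1, levels = c(0, 1),
             labels = c("incorrect", "correct"))   # the two mappings
table(v1)
```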