What’s for dynr: A Package for Linear and Nonlinear Dynamic Modeling in R

Intensive longitudinal data in the behavioral sciences are often noisy, multivariate in nature, and may involve multiple units undergoing regime switches by showing discontinuities interspersed with continuous dynamics. Despite increasing interest in using linear and nonlinear differential/difference equation models with regime switches, there has been a scarcity of software packages that are fast and freely accessible. We have created an R package called dynr that can handle a broad class of linear and nonlinear discrete- and continuous-time models, with regime-switching properties and linear Gaussian measurement functions, in C, while maintaining simple and easy-to-learn model specification functions in R. We present the mathematical and computational bases used by the dynr R package, and present two illustrative examples to demonstrate the unique features of dynr.

Lu Ou (ACTNext) , Michael D. Hunter (Georgia Institute of Technology) , Sy-Miin Chow (Department of Human Development and Family Studies, The Pennsylvania State University)
2019-08-15

1 Introduction

The past several decades have seen a significant rise in the prevalence of intensive longitudinal data (ILD), particularly in the social and behavioral sciences (Stone et al. 2008; Byrom and Tiplady 2010; Bolger and Laurenceau 2013). Differential equation and difference equation models in the form of state-space models have been one of the most dominant tools for representing the dynamics of ILD in disciplines such as the physical sciences, econometrics, engineering, and ecology. In parallel, some computational advances have been proposed in estimating regime-switching models — namely, models positing how otherwise continuous dynamic processes may undergo discontinuous changes through categorical but unobserved phases known as “regimes” (Hamilton 1989; Kim and Nelson 1999; Dolan 2009; Muthén and Asparouhov 2011; Chow et al. 2013, 2015). Throughout, we use the terms regimes and classes interchangeably to denote unobserved unit- and time-specific indicator variables that serve to group portions of repeated measures into phases with homogeneous dynamics or measurement properties.

Examples of regime-switching phenomena from psychology includes Piaget’s ((1969)) theory of human cognitive development and related extensions (van der Maas and Molenaar 1992; Hosenfeld 1997; Dolan et al. 2004); Kohlberg’s (Kohlberg and Kramer 1969) conceptualization of stagewise development in moral reasoning; Van Dijk and Van Geert’s ((2007)) findings on discrete shifts in early language development; as well as Fukuda and Ishihara’s ((1997)) work on the discontinuous changes in infant sleep and wakefulness rhythm during the first six months of life. Related to, but distinct from, hidden Markov models (Elliott et al. 1995; Visser 2007), regime-switching differential and difference equation models allow researchers to specify targeted differential or difference functions to describe the continuous changes that occur within regimes. Ample work exists on fitting these models (Tong and Lim 1980; Hamilton 1989; Tiao and Tsay 1994; Dolan 2009; Yang and Chow 2010; Muthén and Asparouhov 2011; Chow et al. 2013, 2015; Chow and Zhang 2013), but readily accessible software suited for handling such models with ILD are lacking.

Several programs and packages exist for fitting differential equation, difference equation, and hidden Markov models. However, each program has certain limitations that dynr (Ou et al. 2018) aims to overcome. Speaking broadly, the largest differences between dynr and other packages are threefold: (1) dynr readily allows for multi-unit models, (2) dynr allows for nonlinear discrete-time and continuous-time dynamics, and (3) dynr allows for regime switching throughout every part of the model. Many R packages exist for univariate and multivariate time series. CRAN lists hundreds of packages in its task view for TimeSeries (Hyndman 2016), a complete review of which is well-beyond the scope of this work. However, generally these packages lack facilities for fitting time series from multiple units. Likewise there are very few software utilities designed for nonlinear dynamics or regime switching (see Table 1 for an overview). (Petris and Petrone 2011) reviewed three packages for linear state-space models: dlm (Petris 2010, 2014), KFAS (Helske 2017a,b), and dse (Gilbert 2006 or later; Gilbert 2015). These are among the state of the art for state-space modeling in R. Although KFAS can accommodate in its measurement model all densities within the exponential family, the corresponding dynamic model is required to be linear. In addition to these R packages, the OpenMx 2.0 release (Neale et al. 2016; Boker et al. 2017) has maximum likelihood time-varying linear discrete- and continuous-time state-space modeling (Hunter 2017). Likewise, the MKFM6 program (Dolan 2005) implements methods of (Harvey 1989) for time-invariant linear state-space models. SsfPack (Koopman et al. 1999) implements the methods of (Durbin and Koopman 2001) for linear state-space modeling and Markov chain Monte Carlo methods for nonlinear modeling, but it is primarily restricted to single-unit time series without regime switching. The ctsem package (Driver et al. 2017a; Driver et al. 2017b) has utilities for linear state-space modeling of multiple units in continuous time, but lacks functionality for nonlinear models or regime switching. MATLAB (The MathWorks, Inc. 2016) has numerous extensions for time series and state-space modeling (Grewal and Andrews 2008), but lacks the ability to include regime switching and multiple units. Some R packages that handle regime switching are only designed for hidden Markov models, for example, depmixS4 (Visser and Speekenbrink 2010, 2016) and RHmm (Taramasco and Bauer 2012), while the others are only for specific Markov-switching discrete-time time-series models, including MSwM (Sanchez-Espigares and Lopez-Moreno 2014) for univariate autoregressive models, MSBVAR (Brandt 2016) for vector autoregressive models, and MSGARCH (Ardia et al. 2017) for generalized autoregressive conditional heteroskedasticity models. The pomp package (King et al. 2016, 2018) lists among its features hidden Markov models and state-space models, both of which can be discrete- or continuous-time, non-Gaussian, and nonlinear. However, pomp does not currently support regime-switching functionality beyond the regime switching found in hidden Markov modeling. (Helske 2017a) included a review of numerous other packages for non-Gaussian time series models which generally do not involve latent variables.

Overall, developments in fitting differential/difference equation models that evidence discontinuities in dynamics are still nascent. Despite some of the above-mentioned advances in computational algorithms, there is currently no readily available software package that allows researchers to fit differential/difference equations with regime-switching properties. As stated previously, currently available computational programs for dynamic modeling are limited in one of several ways: (1) they are restricted to handling only linear models within regimes such as the package OpenMx, (2) they can only handle very specific forms of nonlinear relations among latent variables, (3) they are computationally slow, (4) they do not allow for stochastic qualitative shifts in the dynamics over time, or (5) they require that the user write complex compiled code to enhance computational speed at the cost of high user burden. Efficient and user-friendly computer software needs to be developed to overcome these restrictions so the estimation of dynamic models can become more applicable by researchers.

We present an R package, dynr, that allows users to fit both linear and nonlinear differential and difference equation models with regime-switching properties. All computations are performed quickly and efficiently in C, but are tied to a user interface in the familiar R language. Specifically, for a very broad class of linear and nonlinear differential/difference equation models with linear Gaussian measurement functions, dynr provides R helper functions that write appropriate C code based on user input in R into a local (potentially temporary) C file, which is then compiled on user’s end with a call to an R function in dynr. The C function pointers are passed to the back-end for computation of a negative log-likelihood function, which is numerically optimized also in C using the optimization routine SLSQP (Kraft 1988, 1994) for parameter estimation. During the process, the user never has to write or even see the C code that underlies dynr and yet, the computations are performed entirely in C, with no interchanges between R and C to reduce memory copying and optimize speed. This removes some of the barriers to dynamic modeling, opening it as a possibility to a broader class of users, while retaining the flexibility of specifying targeted model-specific functions in C for users wishing to pursue models that are not yet supported in the R interface.

In the remaining sections, we will first present the mathematical and computational bases of the dynr R package, and then demonstrate the interface of dynr for modeling multivariate observations with Gaussian measurement errors using two ILD modeling examples from the social and behavioral sciences. Key features of the dynr package we seek to highlight include: (1) dynr fits discrete- and continuous-time dynamic models to multivariate longitudinal/time-series data; (2) dynr deals with dynamic models with regime-switching properties; (3) for improved speed, dynr computes and optimizes negative log-likelihood function values in C; (4) dynr handles linear and nonlinear dynamic models with an easy-to-use interface that includes a matrix form (for linear dynamic models only) and formula form (for linear as well as nonlinear models); (5) dynr removes the burden on the user to perform analytic differentiation in fitting nonlinear differential/difference equation models by providing the user with R’s symbolic differentiation; and (6) dynr provides ready-to-present results through LaTeX equations and plots.

2 General modeling framework

At a basic level, our general modeling framework comprises a dynamic model and a measurement model. The former describes the ways in which the latent variables change over time, whereas the latter portrays the relationships between the observed variables and latent variables at a specific time.

The dynamic model for a particular regime in continuous-time assumes the following form: \[d \boldsymbol{\eta}_i (t) = \boldsymbol{f}_{S_i(t)}\left(\boldsymbol{\eta}_i(t), t, \boldsymbol{x}_i(t) \right)dt + d\boldsymbol{w}_i(t),\label{RS-SDE} \tag{1}\] where \(i\) indexes the smallest independent unit of analysis, \(t\) indexes time, \(\boldsymbol{\eta}_i(t)\) is the \(r\times 1\) vector of latent variables at time \(t\), \(\boldsymbol{x}_i(t)\) is the vector of covariates at time \(t\), and \(\boldsymbol{f}_{S_i(t)}(.)\) is the vector of (possibly nonlinear) dynamic functions which depend on the latent regime indicator, \(S_i(t)\). The left-hand side of Equation (1), \(d \boldsymbol{\eta}_i (t)\), gives the differential of the vector of continuous latent variables, \(\boldsymbol{\eta}_i(t)\), and \(\boldsymbol{f}_{S_i(t)}(.)\) is called the drift function. Added to these deterministic changes induced by the drift function is \(\boldsymbol{w}_i(t)\), an \(r\)-dimensional Wiener process. The differentials of the Wiener processes have zero means and covariance matrix, \({\bf Q}_{S_i(t)}\), called the diffusion matrix. When the dynamic model consists only of linear functions, Equation (1) reduces to: \[d \boldsymbol{\eta_i} (t) = \left( \boldsymbol{\alpha}_{S_i(t)} + \boldsymbol{F}_{S_i(t)} \boldsymbol{\eta_i} (t) + \boldsymbol{B}_{S_i(t)} \boldsymbol{x}_i(t) \right) dt + d \boldsymbol{w}_i(t).\label{eq:linDynC} \tag{2}\] where the general function \(\boldsymbol{f}_{S_i(t)}()\) is replaced with a linear function consisting of (1) an intercept term \(\boldsymbol{\alpha}_{S_i(t)}\), (2) linear dynamics in a matrix \(\boldsymbol{F}_{S_i(t)}\), and (3) linear covariate regression effects \(\boldsymbol{B}_{S_i(t)}\).

For discrete-time processes, we adopt a dynamic model in state-space form (Durbin and Koopman 2001) as \[\boldsymbol{\eta_i} (t_{i,j+1}) = \boldsymbol{f}_{S_i(t_{i,j})}\left(\boldsymbol{\eta_i} (t_{i,j}), t_{i,j}, \boldsymbol{x}_i(t_{i,j+1}) \right) + \boldsymbol{w}_i(t_{i,j+1}),\label{RS-dyn} \tag{3}\] now postulated to unfold at discrete time points indexed by sequential positive integers, \(t_{i,j}\), \(j = 1, 2, \cdots\). In this case, \(\boldsymbol{w}_i(t_{i,j})\) denotes a vector of Gaussian distributed process noise with covariance matrix, \({\bf Q}_{S_i(t_{i,j})}\). We have intentionally kept notation similar between discrete- and continuous-time models to facilitate their linkage. dynr allows for an easy transition between these two frameworks with a binary flag. In a similar vein, we refer to \(\boldsymbol{f}_{S_i(t)}(.)\) in both Equations (1) and (3) broadly as the dynamic functions. The same structure as Equation (2) is possible in discrete time as the linear analog of Equation (3), \[\boldsymbol{\eta_i} (t_{i,j+1}) = \boldsymbol{\alpha}_{S_i(t_{i,j})} + \boldsymbol{F}_{S_i(t_{i,j})} \boldsymbol{\eta_i} (t_{i,j}) + \boldsymbol{B}_{S_i(t_{i,j})} \boldsymbol{x}_i(t_{i,j+1}) + \boldsymbol{w}_i(t_{i,j+1}).\label{eq:linDyn} \tag{4}\]

In both the discrete- and continuous-time cases, the initial conditions for the dynamic functions are defined explicitly to be the latent variables at a unit-specific first observed time point, \(t_{i,1}\), denoted as \(\boldsymbol{\eta}_i(t_{i,1})\), and are specified to be normally distributed with means \(\boldsymbol{\mu_{\eta_1}}\) and covariance matrix, \(\boldsymbol{\Sigma_{\eta_1}}\): \[\boldsymbol{\eta}_i(t_{i,1})\sim N\left(\boldsymbol{\mu_{\eta_1}}, \boldsymbol{\Sigma_{\eta_1}}\right). \label{initial} \tag{5}\]

Likewise for both discrete- and continuous-time models, we assume that observations only occur at selected, discrete time points. Thus, we have a discrete-time measurement model in which \(\boldsymbol{\eta}_i(t_{i,j})\) at discrete time point \(t_{i,j}\) is indicated by a \(p\) \(\times\) 1 vector of manifest observations, \(\boldsymbol{y}_i(t_{i,j})\). Continuous-time processes allow unequal time intervals for these observations. Missing data may be present under either specification. The vector of manifest observations is linked to the latent variables as \[\boldsymbol{y}_i(t_{i,j})=\mathbf{\tau}_{S_i(t_{i,j})}+ {\mathbf{\Lambda}}_{S_i(t_{i,j})} \boldsymbol{\eta}_i(t_{i,j})+ \boldsymbol{\mathrm{A}}_{S_i(t_{i,j})} \boldsymbol{x}_i(t_{i,j}) + \boldsymbol{\epsilon}_i(t_{i,j}), \text{\hskip .1in} {\mathbf{\epsilon}}_i(t_{i,j}) \sim N\left(\mathbf{0}, {\bf R}_{S_i(t_{i,j})}\right), \label{SSMeaSt} \tag{6}\] where \(\mathbf{\tau}_{S_i(t_{i,j})}\) is a \(p \times 1\) vector of intercepts, \(\boldsymbol{\mathrm{A}}_{S_i(t_{i,j})}\) is a matrix of regression weights for the covariates, \({\mathbf{\Lambda}}_{S_i(t_{i,j})}\) is a \(p \times r\) factor loadings matrix that links the observed variables to the latent variables, and \({\mathbf{\epsilon}}_i(t_{i,j})\) is a \(p \times 1\) vector of measurement errors assumed to be serially uncorrelated over time and normally distributed with zero means and covariance matrix, \({\bf R}_{S_i(t_{i,j})}\). Of course, all parts of the measurement model may be regime-dependent.

The subscript \(S_i(t)\) in Equations (1)(6) indicates that these functions and matrices may depend on \(S_i(t)\), the operating regime. To make inferences on \(S_i(t_{i,j})\), we initialize the categorical latent variable \(S_i(t_{i,j})\) on the first occasion and then provide a model for how \(S_i(t_{i,j})\) changes over time. The initial regime probabilities for \(S_i(t_{i,1})\) are represented using a multinomial regression model as \[\Pr\big(S_i(t_{i,1}) = m |\boldsymbol{x}_i(t_{i,1})\big) \stackrel{\Delta}{=}\pi_{m,i1} = \frac{\exp(a_{m} + {\bf b}^T_{m}\boldsymbol{x}_i(t_{i,1}))}{\sum_{k=1}^{M} \exp(a_{k} +{\bf b}^T_{k} \boldsymbol{x}_i(t_{i,1}))}, \label{MultinomialReg0} \tag{7}\] where \(M\) denotes the total number of regimes, \(a_{m}\) is the logit intercept for the \(m\)th regime and \({\bf b}_{m}\) is a \(n_b \times 1\) vector of regression slopes linked to a vector of covariates that explain between-unit differences in initial log-odds (LO). For identification, \(a_{m}\) and all entries in \({\bf b}_{m}\) are set to zero for some regime, \(m\).

We use a first-order Markov process to define how the classes change over time in a transition probability matrix, which contains all possible transitions from one regime to another. In the matrix, the rows index the previous regime at time \(t_{i,j-1}\) and the columns index the current regime at time \(t_{i,j}\). The rows of this matrix sum to 1 because the probability of transitioning from a particular state to any other state must be 1. This transition matrix may also depend on covariates. Thus, a multinomial logistic regression equation is assumed to govern the probabilities of transitions between regimes as: \[\Pr\big(S_i(t_{i,j})= m | S_i(t_{i,j-1})=l, \boldsymbol{x}_i(t_{i,j})\big) \stackrel{\Delta}{=}\pi_{lm,it} = \frac{\exp(c_{lm} +{\bf d}^T_{lm}\boldsymbol{x}_i(t_{i,j})}{\sum_{k =1}^{M} \exp(c_{lk} +{\bf d}^T_{lk}\boldsymbol{x}_i(t_{i,j}))}, \label{MultinomialRegNoDev} \tag{8}\] where \(\pi_{lm,it}\) denotes unit \(i\)’s probability of transitioning from class \(l\) at time \(t_{i,j-1}\) to class \(m\) at time \(t_{i,j}\), \(c_{lm}\) denotes the logit intercept for the transition probability, and \({\bf d}_{lm}\) is a \(n_d\times 1\) vector of logit slopes summarizing the effects of the covariates in \(\boldsymbol{x}_i(t_{i,j})\) on that transition probability. One regime, again, has to be specified as the reference regime by fixing all LO parameters, including \(c_{lm}\) and all elements in \({\bf d}^T_{lm}\) for some regime \(m\), to zero for identification purposes.

To summarize, the model depicted in Equations (1)(8) may take on the form of various linear or nonlinear dynamic models in continuous or discrete time. Moreover, these dynamic models may have regime-switching properties. Systematic between-unit differences stem primarily from changes in the unit- and time-specific regime, \(S_i(t_{i,j})\), and the corresponding changes in the dynamic and measurement models over units and occasions.

3 Estimation procedures

In this section, we outline the procedures implemented in dynr for estimating the model shown in Equations (1)(8). An overview of the estimation procedures involved, the different special cases handled by dynr, and the software packages that can handle these special cases are summarized in Table 1.

Discrete-time models

Broadly speaking, the estimation procedures implemented in dynr are based on the Kalman filter [KF; Kalman (1960)], its various continuous-time and nonlinear extensions, and the Kim filter (Anderson and Moore 1979; Kim and Nelson 1999; Bar-Shalom et al. 2001; Yang and Chow 2010; Chow and Zhang 2013; Kulikova and Kulikov 2014; Kulikov and Kulikova 2014; Chow et al. 2018). The Kim filter, designed to extend the Kalman filter to handle regime-switching state-space models, was proposed by Kim and Nelson (1999) and extended by Chow and Zhang (2013) to allow for nonlinear dynamic functions. In dynr, models are allowed to (1) be in discrete or continuous time, (2) be single regime or regime switching, (3) have linear or nonlinear dynamics, (4) involve stochastic or deterministic dynamics, and (5) have one or more units. All combinations of these variations are possible in dynr, creating 32 different kinds of models.

In the case of linear discrete-time dynamics without regime-switching, the model reduces to a linear state-space model, and we apply the Kalman filter to estimate the latent variable values and obtain other by-products for parameter optimization. At each time point, the KF consists of two steps. In the first step, the dynamics are used to make a prediction for the latent state at the next time point conditional on the observed measurements up to time \(t_{i,j-1}\), creating a predicted mean \(\hat{\boldsymbol{\eta}}_i(t_{i,j}|t_{i,j-1})= E(\boldsymbol{\eta}_i (t_{i,j})|{\boldsymbol{\mathrm{Y}}}_i(t_{i,j-1}))\) and covariance matrix for the latent state \({\boldsymbol{\mathrm{P}}}_i(t_{i,j}|t_{i,j-1})\) \(=\) \(Cov[\boldsymbol{\eta}_i (t_{i,j})| {\boldsymbol{\mathrm{Y}}}_i(t_{i,j-1})]\), where \({\boldsymbol{\mathrm{Y}}}_i(t_{i,j-1})\) includes manifest observations from time \(t_{i,1}\) up to time \(t_{i,j-1}\). In the second step, the prediction is updated based on the measurement model (Equation (6)) and the new measurements, yielding \(\hat{\boldsymbol{\eta}}_i(t_{i,j}|t_{i,j})=E(\boldsymbol{\eta}_i (t_{i,j})|{\boldsymbol{\mathrm{Y}}}_i(t_{i,j}))\) and associated covariance matrix, \({\boldsymbol{\mathrm{P}}}_i(t_{i,j}|t_{i,j})\) \(=\) \(Cov[\boldsymbol{\eta}_{it} | {\boldsymbol{\mathrm{Y}}}_i(t_{i,j})]\). Assuming that the measurement and process noise components are normally distributed and that the measurement equation is linear, as in Equation (6), the prediction errors, \({\boldsymbol{\mathrm{Y}}}_i(t_{i,j})- E({\boldsymbol{\mathrm{Y}}}_i(t_{i,j})|{\boldsymbol{\mathrm{Y}}}_i(t_{i,j}))\), are multivariate normally distributed. Thus, these by-products of the KF can be used to construct a log-likelihood function known as the prediction error decomposition function (De Jong 1988; Harvey 1989; Hamilton 1994; Chow et al. 2010). This log-likelihood function is optimized to yield maximum-likelihood (ML) estimates of all the time-invariant parameters, as well as to construct information criterion (IC) measures (Harvey 1989; Chow and Zhang 2013) such as the Akaike Information Criterion [AIC; Akaike (1973)] and Bayesian Information Criterion [BIC; Schwarz (1978)]. Standard errors of the parameter estimates are obtained by taking the square root of the diagonal elements of the inverse of the negative numerical Hessian matrix of the prediction error decomposition function at the point of convergence.

At convergence, other products from the linear KF include updated latent states, \(\hat{\boldsymbol{\eta}}_i(t_{i,j}|t_{i,j})\), and the updated latent covariance matrices, \({\boldsymbol{\mathrm{P}}}_i(t_{i,j}|t_{i,j})\). In the social and behavioral sciences, the entire time series of observations has often been collected prior to model fitting. In such cases, we use the fixed interval smoother (Anderson and Moore 1979; Ansley and Kohn 1985) to refine the latent variable estimates, yielding the smoothed latent variable estimates, \(\hat{\boldsymbol{\eta}}_i(t_{i,j}|T_{i})=E(\boldsymbol{\eta}_i (t_{i,j})|{\boldsymbol{\mathrm{Y}}}_i(T_{i}))\), and associated covariance matrices, \({\boldsymbol{\mathrm{P}}}_i(t_{i,j}|T_{i})\).

When the dynamic model takes on the form of a nonlinear state-space model with differentiable dynamic functions, the linear KF is replaced with the extended Kalman filter [EKF; Anderson and Moore (1979); Bar-Shalom et al. (2001)] so that the nonlinear dynamic functions are “linearized” or approximated by the first-order Taylor series. Then, a log-likelihood function can be constructed in similar form to the linear state-space prediction error decomposition. However, the corresponding parameter estimates are only “approximate” ML estimates due to the truncation errors in the EKF. The feasibility of this approach has been demonstrated by Chow et al. (2007).

When a linear state-space model is used as the dynamic model but it is characterized by regime-switching properties, dynr uses an extension of the KF, known as the Kim filter, and the related Kim smoother (Kim and Nelson 1999; Yang and Chow 2010). The Kim filter combines the KF, the Hamilton filter (Hamilton 1989) that yields filtered state probabilities, and a collapsing procedure to avoid the need to store \(M^2\) new values of \(\hat{\boldsymbol{\eta}}_i(t_{i,j}|t_{i,j})^{l,m}\) \(\stackrel{\Delta}{=}\) \(\mathbb{E}[\boldsymbol{\eta}_i (t_{i,j})| S_i(t_{i,j-1})= l, S_i(t_{i,j})= m, {\boldsymbol{\mathrm{Y}}}_i(t_{i,j})]\), as well as \({\boldsymbol{\mathrm{P}}}_i(t_{i,j}|t_{i,j})^{l,m}\) \(\stackrel{\Delta}{=}\) \(\mathop{\mathrm{Cov}}[\boldsymbol{\eta}_i (t_{i,j})| S_i(t_{i,j-1})= l, S_i(t_{i,j})= m, {\boldsymbol{\mathrm{Y}}}_i(t_{i,j})]\) with each additional time point. The collapsing procedure averages the estimates over the previous regime \(l\) so only the marginal estimates, \(\hat{\boldsymbol{\eta}}_i(t_{i,j}|t_{i,j})^m = E[\boldsymbol{\eta}_i (t_{i,j})| S_i(t_{i,j})=m, {\boldsymbol{\mathrm{Y}}}_i(t_{i,j})]\)), and the associated covariance matrix, \({\boldsymbol{\mathrm{P}}}_i(t_{i,j}|t_{i,j})^m\), need to be stored at each time step. To handle cases in which nonlinearities are present in Equation (3), a method proposed by Chow and Zhang (2013), called the extended Kim filter, is used for estimation instead. The extended Kim filter replaces the KF portion of the Kim filter with the EKF.

Continuous-time models

Finally, when the dynamics are in continuous time—whether composed of linear or nonlinear dynamic functions—the resultant estimation procedures are the continuous-discrete extended Kalman filter [CDEKF; Bar-Shalom et al. (2001); Kulikov and Kulikova (2014); Kulikova and Kulikov (2014)]. The CDEKF handles a single-regime special case of the general model shown in Equations (1)(6).

For continuous processes in the form of Equation (1), let \(\hat{\boldsymbol{\eta}}_i(t) = E(\boldsymbol{\eta}_i(t)|{\boldsymbol{\mathrm{Y}}}_i(t_{i,j-1}))\) and \({\boldsymbol{\mathrm{P}}}_i(t)=Cov[\boldsymbol{\eta}_i(t)|{\boldsymbol{\mathrm{Y}}}_i(t_{i,j-1})]\) denote the mean and covariance matrix of the latent variables, respectively, at time \(t\) in the interval \([t_{i,j-1},t_{i,j}]\). In the CDEKF framework, the prediction step of the KF is replaced by solving a set of ordinary differential equations (ODEs) at time \(t_{i,j}\), given the initial conditions at time \(t_{i,j-1}\): \(\hat{\boldsymbol{\eta}}_i(t_{i,j-1})=\hat{\boldsymbol{\eta}}_i(t_{i,j-1}|t_{i,j-1})\) and \({\boldsymbol{\mathrm{P}}}_i(t_{i,j-1})={\boldsymbol{\mathrm{P}}}_i(t_{i,j-1}|t_{i,j-1})\). This set of ODEs is obtained by only retaining the first term, \(\boldsymbol{f}_{S_i(t)}\left(\hat{\boldsymbol{\eta}}_i(t), t, \boldsymbol{x}_i(t) \right)\), in the Taylor series expansion of \(\boldsymbol{f}_{S_i(t)}\left(\boldsymbol{\eta}_i(t), t, \boldsymbol{x}_i(t) \right)\) around the expectation \(\hat{\boldsymbol{\eta}}_i(t)\), and is shown below: \[\begin{aligned} \frac{d\hat{\boldsymbol{\eta}}_i(t) }{dt} &= \boldsymbol{f}_{S_i(t)}\left(\hat{\boldsymbol{\eta}}_i(t), t, \boldsymbol{x}_i(t) \right),\label{deta}\\ \end{aligned} \tag{9}\]

\[\begin{aligned} \frac{d{\boldsymbol{\mathrm{P}}}_i(t)}{dt}&=\frac{\partial\boldsymbol{f}_{S_i(t)}\left(\hat{\boldsymbol{\eta}}_i(t), t, \boldsymbol{x}_i(t) \right)}{\partial\hat{\boldsymbol{\eta}}_i(t)}{\boldsymbol{\mathrm{P}}}(t)+{\boldsymbol{\mathrm{P}}}(t)\left(\frac{\partial\boldsymbol{f}_{S_i(t)}\left(\hat{\boldsymbol{\eta}}_i(t), t, \boldsymbol{x}_i(t) \right)}{\partial\hat{\boldsymbol{\eta}}_i(t)}\right)^{\top} + {\bf Q}_{S_i(t)},\label{dP} \end{aligned} \tag{10}\] where \(\frac{\partial\boldsymbol{f}_{S_i(t)}\left(\hat{\boldsymbol{\eta}}_i(t), t, \boldsymbol{x}_i(t) \right)}{\partial\hat{\boldsymbol{\eta}}_i(t)}\) is the Jacobian matrix of \(\boldsymbol{f}_{S_i(t)}\left(\hat{\boldsymbol{\eta}}_i(t), t, \boldsymbol{x}_i(t) \right)\) with respect to \(\hat{\boldsymbol{\eta}}_i(t)\) at time \(t\). Kulikov and Kulikova ((2014), Kulikova and Kulikov (2014)) suggested solving for equations (9) and (10) using adaptive ODE solvers. We adopt an approximate numerical solution — the fourth-order Runge-Kutta (Press et al. 2002) method — to solve Equations (9) and (10). In cases where the hypothesized continuous-time dynamics are linear, explicit analytic solutions exist and there is no need to use numerical solvers. However, in our simulation work, estimating known special cases of linear stochastic differential equation models using numerical solvers yielded both comparable estimates and computational time to estimating the same models using their known solutions. Thus, for generality, we utilize numerical solvers in solving both linear and nonlinear differential equations in dynr.

As in the case involving nonlinear discrete-time dynamic models, parameter estimates obtained from optimizing the log-likelihood function constructed from by-products of the CDEKF are also approximate ML estimates; however, the approximations now stem both from the truncation errors from the first-order Taylor series in the CDEKF, as well as the numerical solution of Equations (9) and (10).

In cases involving regime-switching ordinary or stochastic differential equations, the algorithms for estimating regime-switching continuous-time models are essentially estimation procedures that combine the CDEKF and part of the Kim filter designed to handle estimation of the regime-switching portion of the model. The resultant procedure, referred to herein as continuous-discrete extended Kim filter, is summarized in Chow et al. (2018).

Table 1: Models, algorithms, and software for the framework of regime-switching (non)linear state space models in discrete- and continuous-time. SDE = Stochastic Differential Equation, ODE = Ordinary Differential Equation, CD = Continuous-Discrete, RS = Regime-Switching, KF = Kalman filter (Kalman 1960), EKF = Extended Kalman filter (Anderson and Moore 1979; Bar-Shalom et al. 2001), Kim filter = KF + Hamilton filter + Collapsing procedure (Kim and Nelson 1999). Extended Kim filter was proposed by (Chow and Zhang 2013); the CD extended Kim filter is proposed by (Chow et al. 2018).
Discrete-time Continuous-time
Single-regime Linear Linear state-space model Linear SDE/ODE
KF CDEKF
dynr, OpenMx, pomp, KFAS, dlm, dse, dynr, pomp, OpenMx, ctsem,
MKFM6, SsfPack, MATLAB MATLAB
Nonlinear Nonlinear state-space model Nonlinear SDE/ODE
EKF CDEKF
dynr, pomp, SsfPack, MATLAB dynr, pomp, MATLAB
Multiple-regime Linear RS state-space model RS SDE/ODE
Kim filter CD Kim filter
dynr, dynr only
GAUSS code, MATLAB
Nonlinear RS nonlinear state-space model RS nonlinear SDE/ODE
Extended Kim filter CD extended Kim filter
dynr only dynr only

4 Steps for preparing and “cooking” a model

The theme around the naming convention exploits the pronunciation of the package name: dynr is pronounced the same as “dinner”. Therefore, the names of functions and methods are specifically designed to relate to things done surrounding dinner, such as gathering ingredients such as the data, preparing recipes, cooking, which involves combining ingredients according to a “modeling” recipe and applies heat, and serving the finished product.

The general procedure for using the dynr package can be summarized in five steps. First, data are gathered and identified with the dynr.data() function. Second, recipes are prepared. To each part of a model there is a corresponding prep.*() recipe function. Each of these functions creates an object of class "dynrRecipe". Each prep.*() function creates an object of class "dynr*" which is in turn a subclass of "dynrRecipe". These recipe functions include:

  1. The prep.measurement() function defines the measurement part of the model, that is, how latent variables and exogenous covariates map onto the observed variables.

  2. The prep.matrixDynamics() and prep.formulaDynamics() functions define the dynamics of the model with either a strictly linear, matrix interface or with a possibly nonlinear formula interface, respectively.

  3. The prep.initial() function defines the initial conditions of the model. The initial conditions are used by the recursive algorithms as the starting point for latent variable estimates. As such, the prep.initial() function describes the initial mean vector and covariance matrix of the latent variables, assumed to be multivariate normally distributed.

  4. The prep.noise() function defines the covariance structure for both the measurement (or observation) noise and the dynamic (or latent) noise.

  5. The prep.regimes() function provides the regime switching structure of the model. Single-regime models do not require a "dynrRegimes" object.

Once the data and recipes are prepared, the third step mixes the data and recipes together into a model object of class "dynrModel" with the dynr.model() function. Fourth, the model is cooked with dynr.cook() to estimate the free parameters and standard errors. Fifth and finally, results are served in summary tables using summary(), LaTeX  equations using printex(), and plots of trajectories and equations using plot(), dynr.ggplot(), autoplot(), and plotFormula().

We will demonstrate the interface of dynr using two examples: (1) a linear state-space example with regime-switching based on Yang and Chow (2010) and (2) a regime-switching extension of the predator-prey model (Lotka 1925; Volterra 1926).

5 Example 1: Regime-switching linear state-space model

Facial electromyography (EMG) has been used in the behavioral sciences as one possible indicator of human emotions (Schwartz 1975; Cacioppo and Petty 1981; Cacioppo et al. 1986; Dimberg et al. 2000). A time series of EMG data contains bursts of electrical activity that are typically magnified when an individual is under emotion induction. Yang and Chow (2010) proposed using a regime-switching linear state-space model in which the individual may transition between regimes with and without facial EMG activation. As such, heterogeneities in the dynamic patterns and variance of EMG data are also accounted for through the incorporation of these latent regimes. Model fitting was previously performed at the individual level. Data from the participant shown in Figure 1(A) are made available as part of the demonstrative examples in dynr. A complete modeling script for this example is available as a demo in dynr and can be found by calling file.edit(system.file("demo", "RSLinearDiscreteYang.R", package = "dynr")), and a full explanation is included as a package vignette called LinearDiscreteTimeModels.

Here we present selected segments of code to showcase how a linear state-space model with regime-switching can be specified in dynr. The model of interest is the final model selected for this participant by Yang and Chow (2010): \[\begin{aligned} &{y}_i(t_{i,j}) = \mu_{yS_i(t_{i,j})} + \beta_{S_i(t_{i,j})} \text{Self-report}(t_{i,j}) + \eta_i (t_{i,j}), \label{eq:emgMeas}\\ \end{aligned} \tag{11}\]

\[\begin{aligned} &\eta_i (t_{i,j+1}) = \phi_{S_i(t_{i,j})} \eta_i (t_{i,j}) + \zeta_i (t_{i,j+1}) , \label{eq:emgDyn} \end{aligned} \tag{12}\] in which we allowed the intercept, \(\mu_{yS_i(t_{i,j})}\); the regression slope, \(\beta_{S_i(t_{i,j})}\); and the autoregression coefficient, \(\phi_{S_i(t_{i,j})}\), to be regime-dependent. By allowing \(\phi_{S_i(t_{i,j})}\) to be regime-specific, we indirectly allowed the total variance of the latent component, \(\eta_i (t_{i,j+1})\), to be heterogeneous across the deactivation and activation stages, in spite of requiring the dynamic noise variance, \(E(\zeta_i(t)^2)\), to be constant across regimes.

graphic without alt textgraphic without alt text

Figure 1: (A) A plot of integrated electromyography (iEMG) and self–report affect ratings for one participant with a time interval of 0.2 seconds between two adjacent observations. Self–report = self–report affect ratings; iEMG = integrated EMG signals. (B) An automatic plot of the smoothed state estimates for the regime-switching linear state-space model.

The first step in dynr modeling is to structure the data. This is done with the dynr.data() function.

require("dynr")
data("EMG")
EMGdata <- dynr.data(EMG, id = 'id', time = 'time',
  observed = 'iEMG', covariates = 'SelfReport')

The first argument of this function is either a "ts" class object of single-unit time series or a "data.frame" object structured in a long relational format with different measurement occasions from the same unit appearing as different rows in the data frame. When a "ts" class object is passed to dynr.data(), no other inputs are needed. Otherwise, the id argument needs the name of the variable that distinguishes units, allowing multiple replicated time series to be analyzed together. The time argument needs the name of the variable that indicates unit-specific measurement occasions. If a discrete-time model is desired, the time variable should contain sequential positive integers. If the measurement occasions for a unit are sequential but not consecutive, NAs will be inserted automatically to create equally spaced data. If a continuous-time model is being specified, the time variable can contain unit-specific increasing sequences of irregularly spaced real numbers. In this particular example, a discrete-time model is used. The observed and covariates arguments are vectors of the names of the observed variables and covariates in the data.

The next step in dynr modeling is to build the recipes for the various parts of a model. The recipes are created with prep.*() functions.

The dynamic functions in Equations (1) and (3), can be specified using either prep.formulaDynamics() or prep.matrixDynamics(). In this example, the dynamics as in Equation (12) are linear and discrete-time, so we can describe the dynamics in terms of Equation (4) as \[\label{eq:ex1dyn} \eta_i (t_{i,j+1}) = \underbrace{0}_{\boldsymbol{\alpha}_{S_i(t_{i,j})}} + \underbrace{\phi_{S_i(t_{i,j})}}_{\boldsymbol{F}_{S_i(t_{i,j})}} \eta_i (t_{i,j}) + \underbrace{0}_{\boldsymbol{B}_{S_i(t_{i,j})}} \boldsymbol{x}_i(t_{i,j}) + \underbrace{\zeta_i (t_{i,j+1})}_{\boldsymbol{w}_i(t_{i,j+1})}. \tag{13}\] The prep.matrixDynamics() function allows the user to specify the structures of the intercept vector \(\boldsymbol{\alpha}_{S_i(t_{i,j})}\), through values.int and params.int; the covariate regression matrix \(\boldsymbol{B}_{S_i(t_{i,j})}\), through values.exo and params.exo; and the one-step-ahead transition matrix \(\boldsymbol{F}_{S_i(t_{i,j})}\), through values.dyn and params.dyn. We illustrate this function below. The values.dyn argument gives a list of matrices for the starting values of \(\boldsymbol{F}_{S_i(t_{i,j})}\). The params.dyn argument names the free parameters. These are the \(\phi_{S_t}\) in Equation (12). The isContinuousTime argument switches between continuous-time modeling and discrete-time modeling. The arguments corresponding to the intercepts (values.int and params.int) and the covariate effects (values.exo and params.exo) are omitted to leave these matrices as zeros.

recDyn <- prep.matrixDynamics(values.dyn = list(matrix(0.1, 1, 1), matrix(0.5, 1, 1)),
  params.dyn = list(matrix('phi_1', 1, 1), matrix('phi_2', 1, 1)),
  isContinuousTime = FALSE)

The noise recipe is created with prep.noise(). The noise recipe is stored in the recNoise object, an abbreviation for “recipe noise”. The latent noise covariance matrix is a \(1\times 1\) matrix with a free parameter called dynNoise, short for “dynamic noise”. The observed noise covariance matrix is also a \(1 \times 1\) matrix, but has the measurement noise variance fixed to zero using the special keyword fixed.

recNoise <- prep.noise(values.latent = matrix(1, 1, 1),
  params.latent = matrix('dynNoise', 1, 1),
  values.observed = matrix(0, 1, 1), params.observed = matrix('fixed', 1, 1))

The prep.regimes() function specifies the structure of the regime time evolution shown in Equation (8). In this example, we do not have any covariates in the regime-switching (RS) functions. The problem then reduces to the specification of a 2 \(\times\) 2 transition log-odds (LO) matrix. We provide starting values that imply persisting in the same regime is more likely than transitioning to another regime, and set the second regime LO to zero for identification, making it the reference regime. The first column of the transition LO matrix, is populated with the starting values of: (1) c11 \(= 0.7\), corresponding to \(exp(0.7)\) \(=\) \(2.01\) times greater LO of staying within the Deactivated regime as transitioning to the Activated regime; and (2) c21 \(=-1\), corresponding to \(exp(-1)\) \(=\) \(0.37\) times lower LO of transitioning to the Deactivated regime.

recReg <- prep.regimes(values = matrix(c(0.7, -1, 0, 0), 2, 2),
  params = matrix(c('c11', 'c21', 'fixed', 'fixed'), 2, 2))

In essence, the above code creates the following transition probability matrix:

graphic without alt text

In many situations it is useful to specify the structure of the transition LO matrix in deviation form — that is, to express the LO intercepts in all but the reference regime as deviations from the LO intercept in the reference regime. The package vignette illustrates this by invoking the deviation argument of prep.regimes().

After the recipes for all parts of the model are defined, the dynr.model() function creates the model and stores it in the "dynrModel" object. Each recipe object created by prep.*() and the data prepared by dynr.data() are given to this function. The dynr.model() function always requires dynamics, measurement, noise, initial, and data. When there are multiple regimes, the regimes argument should also be provided. When parameters are subject to transformation functions, a transform argument can be added, which will be discussed in the second example. The dynr.model() function combines information from the recipes and data to write the text for a C function. This text is written to a file optionally named by the outfile argument, so that the user can inspect or modify the generated C code. The default outfile is a temporary file returned by tempfile().

rsmod <- dynr.model(dynamics = recDyn, measurement = recMeas,
  noise = recNoise, initial = recIni, regimes = recReg,
  data = EMGdata, outfile = "RSLinearDiscreteYang.c")
yum <- dynr.cook(rsmod)

In the last line above, the model is “cooked” with the dynr.cook() function to estimate the free parameters and their standard errors. When cooking, the C code in the outfile is compiled and dynamically linked to the rest of the compiled dynr code. If the C functions have previously been compiled then the user can prevent re-compilation by setting compileLib = FALSE in the "dynrModel" object given to dynr.cook(). After compilation the C code is executed to optimize the free parameters while calling the dynamically linked C functions that were created from the user-specified recipes. In this way, dynr provides an R interface for dynamical systems modeling while maintaining much of the speed associated with C.

The final step associated with dynr modeling is serving results (a "dynrCook" object) after the model has been cooked. To this end, several standard, popular S3 methods are defined for the "dynrCook" class, including coef(), confint(), deviance(), logLik(), AIC(), BIC(), names(), nobs(), summary(), and vcov(). These methods perform the same tasks as their counterparts for regression models in R. Additionally, dynr provides a few other model-serving functions illustrated here: summary(), plot(), dynr.ggplot() (or autoplot()), plotFormula(), and printex(). The summary() method provides a table of free parameter names, estimates, standard errors, \(t\)-values, and Wald-type confidence intervals.

summary(yum)
Coefficients:
         Estimate Std. Error t value ci.lower ci.upper Pr(>|t|)
phi_1     0.26608    0.04953   5.372  0.16900  0.36315 5.33e-08 ***
phi_2     0.47395    0.04425  10.711  0.38722  0.56068  < 2e-16 ***
beta_2    0.46449    0.04394  10.571  0.37837  0.55061  < 2e-16 ***
mu_1      4.55354    0.02782 163.658  4.49901  4.60807  < 2e-16 ***
mu_2      4.74770    0.14250  33.318  4.46842  5.02699  < 2e-16 ***
dynNoise  0.20896    0.01129  18.504  0.18683  0.23110  < 2e-16 ***
c11       5.50199    0.70939   7.756  4.11160  6.89237  < 2e-16 ***
c21      -5.16170    1.00424  -5.140 -7.12998 -3.19342 1.79e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

-2 log-likelihood value at convergence = 1002.52
AIC = 1018.52
BIC = 1054.87

These parameter estimates, standard errors, and likelihood values closely mirror those reported in Yang and Chow (2010 755–756). In the Deactivated regime, the autoregressive parameter (phi_1) and the intercept (mu_1) are lower than those in the Activated regime. So, neighboring EMG measurements are more closely related in the Activated regime and the overall level is slightly higher. This matches very well with the idea that the Activated regime consists of bursts of facial muscular activities and an elevated emotional state. Similarly, the effect of the self-reported emotional level is positive in the Activated regime and fixed to zero in the Deactivated regime, as the freely estimated value was close to zero with a nonsignificant \(t\)-value. So, in the Deactivated regime the self-reported emotional level and the facial muscular activity decouple. The dynamic noise parameter gives a sense of the size of the intrinsic unmeasured disturbances that act on the system. These forces perturb the system with a typical magnitude of a little less than half a point on the EMG scale seen in Figure 1(A). Lastly, the log-odds parameters (c11 and c21) can be turned into the transition probability matrix yielding