This article illustrates the use of the bcmixed package and focuses on the two main functions: bcmarg
and bcmmrm
. The bcmarg
function provides inference results for a marginal model of a mixed effect model using the Box–Cox transformation. The bcmmrm
function provides model median inferences based on the mixed effect models for repeated measures analysis using the Box–Cox transformation for longitudinal randomized clinical trials. Using the bcmmrm
function, analysis results with high power and high interpretability for treatment effects can be obtained for longitudinal randomized clinical trials with skewed outcomes. Further, the bcmixed package provides summarizing and visualization tools, which would be helpful to write clinical trial reports.
Longitudinal data are often observed in medical or biological research. One of the most popular statistical models for studying longitudinal continuous outcomes is the linear mixed effect model. Several packages are available from CRAN that allow for the implementation of linear mixed effects models (e.g., nlme (Pinheiro et al. 2021), glme (Sam Weerahandi et al. 2021), lme4 (Bates et al. 2015), CLME (Jelsema and Peddada 2016), PLmixed (Rockwood and Jeon 2019), MCMCglmm (Hadfield 2010)).The linear mixed effect models assume that longitudinal outcomes follow a multivariate normal distribution. However, the distribution of the outcome is often right skewed in the medical and biological fields. Therefore, evaluating fixed effects based on the normal distribution theory may result in inefficient inferences such as power loss for some statistical tests. In addition, although a model-based mean for a certain level of the categorical exploratory variables is often estimated when applying the linear mixed effect model (e.g., the model-based mean for each treatment group of the last visit in a randomized clinical trial), the mean may be inadequate as a representative value for the skewed data.
The Box–Cox transformation (Box and Cox 1964) is often applied to skewed longitudinal data (Lipsitz et al. 2000) to reduce the skewness of a skewed outcome and apply existing statistical models based on a normal distribution. However, it is difficult to directly interpret the model mean estimated on the scale after applying some transformations to the outcome variable.
For the sake of the interpretability of the analysis results, (Maruo et al. 2015) propose a model median inference method on the original scale based on the Box–Cox transformation in the context of randomized clinical trials. (Maruo et al. 2017) extend this method to the framework of mixed effects models for repeated measures (MMRM) analysis (Mallinckrodt et al. 2001) in the context of longitudinal randomized clinical trials.
The bcmixed package
(Maruo et al. 2020) contains functions to estimate model medians for
longitudinal data proposed by (Maruo et al. 2017), as well as a sample data
set that is used in (Maruo et al. 2017). In this package, the parameter
estimation is conducted partially using the gls
function in the
nlme package. This paper
illustrates the usage of the package with the sample data in several
contexts.
The remainder of this manuscript is organized as follows: In section 2, we describe the methods proposed by (Maruo et al. 2017). Then, we illustrate the usage of the bcmixed package with the example data in section 3. Finally, our contributions are summarized in section 4.
We briefly introduce the method proposed by in (Maruo et al. 2017). The detailed expressions can be found in (Maruo et al. 2017).
Let the outcome be a continuous and positive value. The outcomes are measured over time for each subject \(i=1,\ldots,n\), and the number of planned measurement occasions is \(T\) (occasion index: \(t=1,\ldots,T\)). The outcome vector for the \(i\)th subject is denoted by \(\mathbf{y}_i=(y_{i1},\ldots,y_{in_i})^T\), where \(n_i\) is the number of measurements for the \(i\)th subject. We have \(n_i\le T\) because of missingness. Then, we consider applying the following model (Box and Cox 1964; Lipsitz et al. 2000): \[\label{model} \mathbf{z}_{i}=X_i\mathbf{\beta}+W_i\mathbf{b}_i+\mathbf{\epsilon}_i, \tag{1}\] where \(\mathbf{z}_i\) is the Box–Cox transformed outcome vector such that the \(j\)th element (\(j=1,\ldots,n_i\)) is denoted by \[z_{ij}=\left\{\begin{array}{ll} (y_{ij}^\lambda-1)/\lambda,&\lambda\neq 0,\\ \log y_{ij},&\lambda=0, \end{array} \right.\] and \(\lambda\) is the transformation parameter. The parameter \(\mathbf{\beta}\) is the \(p\)-dimensional vector of the fixed effect, \(\mathbf{b}_i\) is the \(q\)-dimensional vector of random effects distributed as \(MVN_q(\mathbf{0}_q, D)\), \(\mathbf{\epsilon}_i\) is the \(n_i\)-dimensional vector of random errors distributed independently as \(MVN_{n_i}(\mathbf{0}_{n_i}, \Sigma_i)\), and \(X_i\) and \(W_i\) are \(n_i \times p\) and \(n_i \times q\) design matrices relating the fixed and random effects, respectively. The random variables \(\mathbf{b}_i\) and \(\mathbf{\epsilon}_i\) are independent. From Formula ((1)), it is derived that \(\mathbf{z}_i|\lambda\sim MVN_{n_i}(X_i\mathbf{\beta}, V_i\)), where \(V_i = W_i D W_i^T + \Sigma_i\). In this paper, we consider situations where researchers have little interest in random effects and are interested in assessing fixed effects. In such cases, a simple formulation of the linear mixed effect model ((1)) can be implemented wherein the random effects are not explicitly modeled, but are rather included as part of the covariance matrix \(V_i\). We focus on such a “marginal” mean model. The covariance parameter vector in \(V=V_i\) for \(n_i=T\) (i.e., subjects with no missing values) is denoted as \(\mathbf{\alpha}=(\alpha_1,\ldots,\alpha_m)^T\). The dimension of \(\mathbf{\alpha}\), that is \(m\), depends on \(T\) and the specified covariance structure.
Let the model parameter vector be \(\mathbf{\theta}=(\lambda,\mathbf{\beta}^T,\mathbf{\alpha}^T)^T\). The maximum likelihood estimate of \(\mathbf{\theta}\) is obtained through maximizing the profile likelihood for \(\lambda\) (Lipsitz et al. 2000; Maruo et al. 2017). The model-based and robust variance estimators of the maximum likelihood estimator \(\hat{\mathbf{\theta}}\) are given by \(V_{\mathbf{\theta}}^{(M)}=\{-\hat{H}\}^{-1}\) and \(V_{\mathbf{\theta}}^{(R)}=\{-\hat{H}\}^{-1}\hat{J}\{-\hat{H}\}^{-1}\), respectively, where \(H=\frac{\partial^2}{\partial\mathbf{\theta}\partial\mathbf{\theta}^T}l(\mathbf{\theta})\), \(J=\sum_{i=1}^n\left\{\frac{\partial}{\partial\mathbf{\theta}}l_i(\mathbf{\theta})\right\}\left\{\frac{\partial}{\partial\mathbf{\theta}}l_i(\mathbf{\theta})\right\}^T\), \(l(\mathbf{\theta})\) is the likelihood function for \(n\) subjects, and \(l_i(\mathbf{\theta})\) is the likelihood function for the \(i\)th subject. The matrices \(\hat{H}\) and \(\hat{J}\) are obtained from the matrices \(H\) and \(J\) by replacing \(\mathbf{\theta}\) by \(\hat{\mathbf{\theta}}\), respectively. A robust variance estimator is an asymptotically valid estimator even when the model ((1)) is mis-specified (Cox 1961).
We now focus on the continuous and positive outcomes of a certain disease, and consider a situation in which the efficacy of some treatments (group index: \(g=1,...,G\)) is compared based on a randomized, parallel group clinical trial, where the total number of subjects is \(n\). The explanatory variable matrix, \(X_i\), and the fixed effect parameter, \(\mathbf{\beta}\) in model ((1)) are denoted in detail as follows. Now, \(X_i\) is given by the \(n_i\times(GT+C)\) matrix that contains the intercept, \(G-1\) group variables, \(T-1\) occasion variables, group-by-occasion interaction variables, and \(C\) covariates, where the categorical covariates are converted into dummy variables. The fixed effect parameter vector is given by \(\mathbf{\beta}=(\beta_I,\mathbf{\beta}_g^T,\mathbf{\beta}_t^T,\mathbf{\beta}_{gt}^T,\mathbf{\beta}_c^T)^T\), where \(\beta_I\), \(\mathbf{\beta}_g=(\beta_{g(1)},\ldots,\beta_{g(G-1)})^T\), \(\mathbf{\beta}_t=(\beta_{t(1)},\ldots,\beta_{t(T-1)})^T\), \(\mathbf{\beta}_{gt}=(\beta_{gt(1,1)},\beta_{gt(1,2)},\ldots,\beta_{gt(G-1,T-1)})^T\), and \(\mathbf{\beta}_c=(\beta_{c(1)},\ldots,\beta_{c(C)})^T\) are the intercept, group, occasion, group-by-occasion, and covariate parameter vectors, respectively. Although group \(G\) and occasion \(T\) is set to be at the reference level, we set \(\beta_{g(G)}=\beta_{t(T)}=\beta_{gt(G,t)}=\beta_{gt(g,T)}=0\) for the sake of description (\(g=1,\ldots,G,\) \(t=1,\ldots,T\)).
Under these settings, the model median, \(\xi_{(g,t)}\), for the treatment group \(g\) at the occasion \(t\) on the original scale is given by \[\xi_{(g,t)}=\left\{\lambda\left(\beta_I+\beta_{g(g)}+\beta_{t(t)}+\beta_{gt(g,t)}+\mathbf{x}_{\bar{c}}^T\mathbf{\beta}_c\right)+1\right\}^{1/\lambda},\] where \(\mathbf{x}_{\bar{c}}\) is the vector of the mean of each covariate for all subjects. The model median is the inverse Box–Cox transformation of the model means on the Box–Cox transformed scale. The model median can be easily interpreted because it is the estimator of the median on the original scale.
Using the delta method, the variance estimator of the maximum likelihood estimator for the model median, \(\hat{\xi}_{(g,t)}\), is given by \(V_{\xi(g,t)}^{(\cdot)}=\Delta_{\xi(g,t)}^TV_{\mathbf{\theta}}^{(\cdot)}\Delta_{\xi(g,t)},\) where \(\Delta_{\xi(g,t)}=\left.\frac{\partial}{\partial\mathbf{\theta}}\xi_{(g,t)}\right|_{\mathbf{\theta}=\hat{\mathbf{\theta}}}\). If we use \(V_{\mathbf{\theta}}^{(\cdot)}=V_{\mathbf{\theta}}^{(M)}\), we obtain the model-based variance estimator, \(V_{\xi(g,t)}^{(M)}\). On the other hand, the robust variance estimator, \(V_{xi(g,t)}^{(R)}\), is obtained if we use \(V_{\mathbf{\theta}}^{(\cdot)}=V_{\mathbf{\theta}}^{(R)}\).
Thus, the model median difference between groups \(g_1\) and \(g_2\) at occasion \(t\) is given by \(\delta_{(g_1;g_2,t)}=\xi_{(g_1,t)}-\xi_{(g_2,t)}\) and the variance estimators of the maximum likelihood estimator of the model median difference, \(\hat{\delta}_{(g_1;g_2,t)}\), is given by \(V_{\delta(g_1;g_2,t)}=(\Delta_{(g_1,t)}-\Delta_{(g_2,t)})^TV_{\mathbf{\theta}}^{(\cdot)}(\Delta_{(g_1,t)}-\Delta_{(g_2,t)})\). The model-based and robust variance estimators are obtained in the same way as the model median estimator. Using the asymptotic normality of the maximum likelihood estimator, the \(100(1-\alpha)\%\) confidence interval of \(\delta_{(g_1;g_2,t)}\) is obtained as \(\hat{\delta}_{(g_1;g_2,t)}\pm \Phi^{-1}(1-\alpha/2)\sqrt{V_{\delta(g_1;g_2,t)}^{(\cdot)}}\), where \(\Phi^{-1}(\cdot)\) is the quantile function of the standard normal distribution. The Wald-type test for the null hypothesis, \(\delta_{(g_1;g_2,t)}=0\), is performed with the test statistic \(t=\hat{\delta}_{(g_1;g_2,t)}/\sqrt{V_{\delta(g_1;g_2,t)}^{(\cdot)}}\).
The performances of the above-mentioned inference procedures may be low for a small sample because these are based on the asymptotic properties of the maximum likelihood estimation. Therefore, (Maruo et al. 2017) applied the following empirical small sample adjustment for the inferences of the model median differences by referring to the study in (Schluchter and Elashoff 1990).
They provide an adjusted standard error (SE) for the median difference
as \(\sqrt{M/(M-p)V_{\delta(g_1;g_2,t)}^{(\cdot)}}\) for the compound
symmetry (CS) or the first-order auto regression (AR(1)) structure and
\(\sqrt{n^*/(n^*-T)}\)
\(\times\sqrt{V_{\delta(g_1;g_2,t)}^{(\cdot)}}\) for the unstructured (UN)
structure. Further, we approximate the null distribution by the \(t\)
distribution where the degrees of freedom for the CS or the AR(1)
structure and the UN structure are \((n-G)(T-1)-m\) and \(n^*-T\),
respectively.
Although (Maruo et al. 2017) and (Maruo et al. 2020) focus only on the three covariance structures, these three options are sufficient in the applied settings because the MMRM analysis is often applied in the following steps. The UN structure is used, and the CS or AR(1) structure with the robust variance estimation is used when the parameter estimation process is not properly converged (e.g., see (Gosho et al. 2017)).
In this section, we describe the bcmixed package that provides the analysis results based on the mixed effect models with the Box–Cox transformation. The package bcmixed is available from the Comprehensive R Archive Network (CRAN) at https:// CRAN.R-project.org/package=bcmixed. The package bcmixed can be installed and loaded using the following code.
> install.packages("bcmixed")
R> library(bcmixed) R
In particular, the following main functions are demonstrated in this article:
bcmarg
: Inference on the marginal model of the mixed effect model with the Box–Cox transformation.
bcmmrm
: Inference on the model median for longitudinal data in randomized clinical trials.
First, we illustrate a real example for the acquired immune deficiency
syndrome (AIDS) clinical trial data, which is stored as the data frame
aidscd4
in the bcmixed package. The data are from a randomized,
double-blind study of AIDS patients with advanced immune suppression
(cluster of differentiation 4 (CD4) cells count of less than or equal to
50 cells/mm3) (Henry et al. 1998; Fitzmaurice et al. 2011). Patients in the
AIDS clinical trial group study 193A were randomized to dual or triple
combinations of human immunodeficiency virus-1 reverse transcriptase
inhibitors. In particular, the patients were randomized to one of four
daily regimens. The original data can be downloaded from
https://content.sph.harvard.edu/fitzmaur/ala/ (Datasets->AIDS
Clinical Trial Group (ACTG) Study 193A). As for the more detailed data
handling process, see (Maruo et al. 2017). The data frame aidscd4
contains seven variables (Table 1).
Variable | Description |
---|---|
id |
A patient identifier; in total there are 1177 patients. |
weekc |
A visit variable (weeks 8, 16, 24, 32). |
treatment |
Allocated treatment regimens; |
1 = zidovudine alternating monthly with 400mg didanosine, |
|
2 = zidovudine plus 2.25mg of zalcitabine, |
|
3 = zidovudine plus 400mg of didanosine, and |
|
4 = zidovudine plus 400mg of didanosine plus 400mg of nevirapine. |
|
age |
Patients’ age (years). |
sex |
Patients’ sex (1 = male, 0 = female). |
cd4.bl |
A baseline value of CD4 cells count + 1. |
cd4 |
A CD4 cells count + 1. |
Figure 1 shows the longitudinal box-whisker plot of the values of CD4 plus 1 for each group plotted using ggplot2 package (Wickham 2016). The distribution shapes of the measurements were heavily skewed.
This function provides the inference results for the marginal model of the mixed effect model with the Box–Cox transformation described in Section 2.1.
The bcmarg
function is called using the following syntax.
bcmarg(formula, data, time = NULL, id = NULL, structure = "UN",
lmdint = c(-3, 3))
The bcmarg
function takes arguments tabulated in Table
2.
Argument | Description |
---|---|
formula |
A two-sided linear formula object describing the model, with the response |
on the left of a ~ operator and the terms, separated by + operators, on the right. |
|
data |
A data frame containing the variables used in the model. |
time |
A time variable name for repeated measurements. The default is NULL . |
id |
A subject id variable name for repeated measurements. The default is NULL . |
structure |
Specify the covariance structure from c("UN", "CS", "AR(1)") . The default is "UN" . |
lmdint |
A vector containing the end points of the interval to be searched for a |
transformation parameter. The default is c(-3, 3) . |
If time
and id
are not specified, an ordinary linear model with the
Box–Cox transformation is applied.
The bcmarg
function returns an object of class "bcmarg"
. The objects
of this class have methods for generic functions coef
, logLik
,
vcov
, fitted
, print
, and summary
. The object includes the
components for the marginal model parameter inference (Table
3).
Component | Description |
---|---|
lambda |
A numeric value of the estimate of the transformation parameter. |
beta |
A vector with the estimates of the regression parameters. |
alpha |
A vector with the estimates of the covariance parameters. |
V |
A variance-covariance matrix for any subject with no missing values. |
betainf |
A matrix containing the inference results for beta under the assumption |
that lambda is known. |
|
Vtheta.mod |
A model-based variance-covariance matrix for MLE of the vector of all |
parameters: c(lambda, beta, alpha) . |
|
Vtheta.rob |
A robust variance-covariance matrix for MLE of the vector of all parameters. |
logLik |
A numeric value of the maximized likelihood. |
adj.prm |
A vector with parameters used for the empirical small sample adjustment in |
bcmmrm : c(number of subjects, number of completed subjects, number of |
|
outcome observations, number of missing observations). | |
glsObject |
An object of "gls" (or "lm" when time and id are not specified) containing |
results of gls (or lm ) function on the transformed scale. |
In bcmarg
function, lambda
is estimated with the optimize
function
by maximizing the profile likelihood function for \(\lambda\). If an error
occurs in the optimize
function, problems may be solved by changing
the search area for \(\lambda\), lmdint
.
We applied a marginal model to the aidscd4
data, where the fixed
effects were the treatment, visit, treatment-visit interaction, and the
Box–Cox transformed baseline, where the visit was treated as a nominal
variable. The covariance structure was set as unstructured (default
setting). This model is frequently used for MMRM analysis. A sample code
is as follows. The bct.v
function returns the Box–Cox transformed
vector.
> data("aidscd4")
R> aidscd4$cd4.bl.tr <- bct.v(aidscd4$cd4.bl)$transformed
R> res1 <- bcmarg(formula = cd4 ~ as.factor(treatment) * as.factor(weekc) +
R+ cd4.bl.tr, data = aidscd4, time = weekc, id = id)
The summarized analysis results on the transformed scale are obtained by
applying the summary
function to the "bcmarg"
object as follows.
> summary(res1)
R
-Cox transformed mixed model analysis
Box: cd4 ~ as.factor(treatment) * as.factor(weekc) + cd4.bl.tr
Formula: weekc
Time: id
ID: "UN"
Covariance structure: aidscd4
Data-likelihood: -13322.96
Log: 0.154
Estimated transformation parameter
:
Coefficients on the transformed scale-value p-value
Value Std.Error t1.0849 0.1249 8.684 0.000
(Intercept) as.factor(treatment)2 0.2454 0.1214 2.022 0.043
as.factor(treatment)3 0.4203 0.1212 3.468 0.001
as.factor(treatment)4 0.7649 0.1199 6.377 0.000
as.factor(weekc)16 -0.2043 0.0843 -2.424 0.015
as.factor(weekc)24 -0.4498 0.0899 -5.004 0.000
as.factor(weekc)32 -0.6750 0.0988 -6.835 0.000
0.5782 0.0200 28.922 0.000
cd4.bl.tr as.factor(treatment)2:as.factor(weekc)16 -0.1183 0.1185 -0.998 0.318
as.factor(treatment)3:as.factor(weekc)16 -0.0560 0.1180 -0.475 0.635
as.factor(treatment)4:as.factor(weekc)16 0.0675 0.1175 0.574 0.566
as.factor(treatment)2:as.factor(weekc)24 -0.1863 0.1264 -1.474 0.141
as.factor(treatment)3:as.factor(weekc)24 -0.0243 0.1264 -0.193 0.847
as.factor(treatment)4:as.factor(weekc)24 0.0870 0.1263 0.689 0.491
as.factor(treatment)2:as.factor(weekc)32 -0.0852 0.1414 -0.603 0.547
as.factor(treatment)3:as.factor(weekc)32 -0.0893 0.1400 -0.638 0.524
as.factor(treatment)4:as.factor(weekc)32 0.1414 0.1381 1.024 0.306
:
Covariance parameters on the transformed scaleUN(1,1) UN(1,2) UN(1,3) UN(1,4) UN(2,2) UN(2,3) UN(2,4) UN(3,3) UN(3,4) UN(4,4)
1.798 1.156 1.105 0.927 1.957 1.346 1.298 1.903 1.452 2.009
:
Correlations on the transformed scale8 16 24 32
8 1.000 0.616 0.598 0.488
16 0.616 1.000 0.697 0.655
24 0.598 0.697 1.000 0.743
32 0.488 0.655 0.743 1.000
The results of coefficients on the transformed scale are obtained with
the gls
function in the
nlme package. The
transformation parameter was estimated as 0.154, which suggested that
the shape of the residual distribution was close to a multivariate
log-normal distribution. All main effects were significant; however,
treatment-by-week interaction was not significant at a level of 0.05.
Note that inference results for beta
under the assumption that the
transformation parameter is known are provided. Although statistical
tests would be asymptotically valid (e.g., see (Doksum and Wong 1983)),
standard errors might be underestimated (e.g., see (Bickel and Doksum 1981)).
This function provides the results for the model median inferences for
longitudinal randomized clinical trial data described in Section
2.2. The parameter inference is conducted by calling the
bcmarg
function in the bcmmrm
function.
The bcmmrm
function is called using the following syntax.
bcmmrm(outcome, group, data, time = NULL, id = NULL, covv = NULL,
cfactor = NULL, structure = "UN", lmdint = c(-3, 3), glabel = NULL,
tlabel = NULL)
The bcmmrm
function takes arguments tablated in Table
4.
Argument | Description |
---|---|
outcome |
A name of positive outcome (dependent) variable included in data . |
group |
A name of treatment group variable included in data . |
data |
A data frame that may include outcome , group , time , id , and specified covariate |
variables. | |
time |
A name of time variable for repeated measurements included in data . |
The default is NULL . |
|
id |
A name of subject id variable for repeated measurements included in data . |
The default is NULL . |
|
covv |
A character vector with names of covariate variables included in data . |
The default is NULL . |
|
cfactor |
An integer vector including nominal variable indicators for covariate variables. |
Nominal variable: 1 , continuous variable: 0 . The default is NULL . |
|
structure |
Specify the covariance structure from c("UN", "CS", "AR(1)") . The default is "UN" . |
conf.level |
A numeric value of the confidence level for the confidence intervals. |
The default is 0.95. | |
lmdint |
A vector containing end points of the interval to be searched for a transformation |
parameter. The default is c(-3, 3) . |
|
glabel |
A vector of length number of treatment groups containing the labels of group |
variable. The default is NULL and the levels of group variable in data are used. |
|
tlabel |
A vector of length number of repeated measures containing the labels of time |
variable. The default is NULL and the levels of time variable in data are used. |
If time
and id
are not specified, inference results reduce to the
results for the context of linear regression model provided by
(Maruo et al. 2015).
The bcmmrm
function returns an object of class "bcmmrm"
representing
the results of model median inference based on the Box–Cox transformed
MMRM analysis. Generic functions boxplot
, coef
, logLik
, vcov
,
fitted
, plot
, print
, and summary
have methods to demonstrate the
results of the fit. Components tablated in Table 5
must be included in a legitimate "bcmmrm"
object.
Component | Description |
---|---|
call |
A list containing an image of the bcmmrm call that produced the object. |
median.mod , |
Lists including inference results for the model medians for all groups. |
median.rob , |
Levels of the list are time points, where the correspondence table is given |
median.mod.adj , |
as time.tbl$code . mod: model-based inference, rob: robust inference, |
median.rob.adj |
adj: with empirical small sample adjustment. |
meddif.mod , |
Lists including inference results for the for the model median differences |
meddif.rob , |
between all pairs of groups (group1 - group0). The levels of the list are |
meddif.mod.adj , |
time points, where the correspondence table is given as time.tbl$code . |
meddif.rob.adj |
mod : model-based inference, rob : robust inference, |
adj : with empirical small sample adjustment. |
|
lambda |
A numeric value of estimates of the transformation parameter. |
R |
A correlation matrix for transformed outcomes. |
betainf |
Inference results for beta under the assumption that lambda is known. |
time.tbl |
A data frame of a correspondence table for the time points. |
This object is used when applying the above generic functions. | |
group.tbl |
A data frame of a correspondence table for treatment groups. |
This object is used when applying the above generic functions. | |
inf.marg |
A "bcmarg" object with results of the bcmarg function called in the |
bcmmrm function. |
|
outdata |
A data frame where the transformed outcome (ytr ), the fitted values |
on the transformed scale (ytr.fitted ), and the residuals on |
|
the transformed scale (res.tr ) are added to input data. |
|
conf.level |
A numeric value of the specified confidence level. |
lambda
, R
, betainf
, and inf.marg
are obtained from the results
of the bcmarg
function that is called in the bcmmrm
function.
We applied the MMRM analysis with the Box–Cox transformation described
in Section 3.2 to the aidscd4
data, where the Box–Cox
transformed baseline and sex were included as the covariates. The
example code is as follows.
> data("aidscd4")
R> aidscd4$cd4.bl.tr <- bct.v(aidscd4$cd4.bl)$transformed
R> res2 <- bcmmrm(outcome = cd4, group = treatment, data = aidscd4, time = weekc,
R+ id = id, covv = c("cd4.bl.tr", "sex"), cfactor = c(0, 1),
+ glabel = c("Zid/Did", "Zid+Zal", "Zid+Did", "Zid+Did+Nev")
The transformed baseline and sex are continuous and categorical
variables, respectively, and therefore, cfactor
was set as c(0,1)
.
The print
function only provides information about model detail, the
estimated transformation parameter, the maximized log-likelihood, and
the model median estimate for each time point and group as follows.
> print(res2)
R
-Cox transformation
Model median estimation based on MMRM with Box: cd4
Outcome: treatment
Group: weekc
Time: id
IDCovariate(s): c("cd4.bl.tr", "sex")
: "UN"
Covariance structure: aidscd4
Data: 0.154
Estimated transformation parameter-likelihood: -13322.36
Log
estimates (row: group, col: time):
Model median | weekc 8 16 24 32
treatment 1 Zid/Did 18.9 16.5 14.1 12.1
2 Zid+Zal 22.0 17.9 14.6 13.5
3 Zid+Did 24.5 20.9 18.2 15.1
4 Zid+Did+Nev 30.1 27.8 24.2 21.8
The summary
function provides more detailed analysis results as
follows.
> summary(res2)
R
-Cox transformation
Model median inference based on MMRM with the Box
:
Data and variable information: cd4
Outcome: treatment
Group: weekc
Time: id
IDCovariate(s): c("cd4.bl.tr", "sex")
: aidscd4
Data
:
Analysis information: "UN"
Covariance structure: TRUE
Robust inference: TRUE
Empirical small sample adjustment: 0.95
Confidence level
:
Analysis results: 0.154
Estimated transformation parameter
for weekc = 8
Model median inferences
treatment median SE lower.CL upper.CL1 Zid/Did 18.9 0.862 17.2 20.6
2 Zid+Zal 22.0 1.124 19.8 24.2
3 Zid+Did 24.5 1.465 21.6 27.4
4 Zid+Did+Nev 30.1 1.597 27.0 33.3
for weeks 16 and 24...)
(...Omitted
for weekc = 32
Model median inferences
treatment median SE lower.CL upper.CL1 Zid/Did 12.1 0.662 10.8 13.4
2 Zid+Zal 13.5 0.813 11.9 15.1
3 Zid+Did 15.1 1.019 13.1 17.1
4 Zid+Did+Nev 21.8 1.376 19.1 24.5
Inferences of model median difference between groups 1 - treatment 0 ) for weekc = 8
( treatment
1 treatment 0 delta SE lower.CL upper.CL t.value p.value
treatment 1 Zid+Zal Zid/Did 3.12 1.40 0.363 5.87 2.22 0.027
2 Zid+Did Zid/Did 5.64 1.69 2.325 8.96 3.34 0.001
3 Zid+Did+Nev Zid/Did 11.25 1.80 7.711 14.80 6.24 0.000
4 Zid+Did Zid+Zal 2.53 1.83 -1.059 6.12 1.39 0.167
5 Zid+Did+Nev Zid+Zal 8.14 1.93 4.349 11.93 4.22 0.000
6 Zid+Did+Nev Zid+Did 5.61 2.16 1.372 9.85 2.60 0.010
for weeks 16 and 24...)
(...Omitted
Inferences of model median difference between groups 1 - treatment 0 ) for weekc = 32
( treatment
1 treatment 0 delta SE lower.CL upper.CL t.value p.value
treatment 1 Zid+Zal Zid/Did 1.35 1.04 -0.692 3.40 1.30 0.194
2 Zid+Did Zid/Did 3.00 1.20 0.633 5.37 2.49 0.013
3 Zid+Did+Nev Zid/Did 9.70 1.52 6.705 12.69 6.37 0.000
4 Zid+Did Zid+Zal 1.65 1.30 -0.907 4.20 1.27 0.206
5 Zid+Did+Nev Zid+Zal 8.34 1.60 5.206 11.48 5.23 0.000
6 Zid+Did+Nev Zid+Did 6.70 1.71 3.338 10.06 3.92 0.000
Significant differences were detected for the following pairs Zid+Did vs. Zid/Did, Zid+Did+Nev vs. Zid/Did, Zid+Did+Nev vs. Zid+Zal, and Zid+Did+Nev vs. Zid+Did at week 32.
The summary
function provides the results using the robust variance
and the small sample adjustment in the default settings. If users want
to summarize results using the model variance and not using the small
sample adjustment, specify
summary(bcmmrmObject, robust = F, ssadjust = F)
. Further details of
the summary
function for the "bcmmrm"
object can be obtained with
?summary.bcmmrm
.
The inference results for the median differences at week 32 (fourth visit) can also be called as follows although the levels of the group variable in the data frame are used without formatting.
> res2$meddif.rob.adj[[4]]
R
group1 group0 delta SE lower.CL upper.CL t.value p.value1 2 1 1.354438 1.041338 -0.6922404 3.401117 1.300672 1.940595e-01
2 3 1 3.000942 1.204919 0.6327547 5.369129 2.490575 1.312570e-02
3 4 1 9.697631 1.522831 6.7046091 12.690653 6.368159 4.869820e-10
4 3 2 1.646503 1.299231 -0.9070469 4.200054 1.267291 2.057293e-01
5 4 2 8.343192 1.596236 5.2058987 11.480486 5.226792 2.682862e-07
6 4 3 6.696689 1.709027 3.3377114 10.055667 3.918421 1.034880e-04
The "bcmmrm"
object can be plotted with the plot
function as follows
(Figure 2).
> plot(res2, xlab = "Week", ylab = "CD4+1") R
The plot
function provides a longitudinal plot in the default
settings. However, a plot at a specified time point can be drawn with
the following code (Figure 3):
> plot(res2, timepoint = 32, xlab = "Treatment", ylab = "CD4+1", col = 1:4) R
Further, the plot
function provides the results using the robust
variance and the small sample adjustment in the default setting. Many
other options such as main
and legend
can be used in the plot
function. Further details can be obtained with ?plot.bcmmrm
.
A diagnosis of a model fitting can be conducted with the boxplot
function, which provides a box-whisker plot of the Box–Cox transformed
residuals for each group. A sample code is provided as follows (Figure
4):
> boxplot(res2) R
The shape of the transformed residual for each group is not skewed and
the median and mean are close to each other, which suggests that the
median would not be biased at week 32. The boxplot
function provides
the results at the last time points in the default settings. A
box-whisker plot at another time point can be obtained by specifying
boxplot(bcmmrmObject, timepoint = xx)
.
We demonstrated the use of the bcmixed package. The bcmarg
function
provides the analysis results for a marginal model of a mixed effect
model with the Box–Cox transformation. The results for the statistical
test of the bcmarg
function are meaningful. However, it is difficult
to interpret coefficients (\(\mathbf{\beta}\)) on the transformed scale.
The bcmmrm
function provides the model median inferences based on the
MMRM with the Box–Cox transformation for longitudinal randomized
clinical trials. Using this function, treatment effects can be
interpreted as median differences between treatment groups at specified
time points. (Maruo et al. 2017) also show that this method controls the
type I error of the statistical test for the model median difference,
and it has moderate or high performance for power compared with the
existing methods (ordinary MMRM, MMRM for the log-transformed outcome,
etc.) from their simulation studies (the simulation program is provided
in https://github.com/kzkzmr/Maruo_2017_StatMed_Simulation with
penalized power results proposed by (Cavus et al. 2019)). Thus, bcmmrm
function analysis results with high power and high interpretability for
longitudinal randomized clinical trials with skewed outcomes. Further,
the bcmmrm
function provides summarizing and visualization tools,
which would be helpful to write clinical trial reports.
Although the bcmixed package can be used for data other than that of
randomized clinical trials, the performances of these methods have not
been evaluated well yet. Therefore, users should use them carefully.
Although a model fitting can be diagnosed with the boxplot
function,
more model diagnosis tools may be implemented in the future. Users might
apply the MissMech
package(Jamshidian et al. 2014, 2015), which diagnoses
multivariate normality and heteroscedasticity, to the transformed
residuals stored in "bcmarg"
object. Note that the MissMech package
assumes missing mechanisms are missing completely at random (MCAR), and
statistical tests for model fittings may lead to significant results for
a medium-to-large sample even when the models fit the data adequately.
This work was supported by JSPS KAKENHI Grant Number JP19K11849.
bcmixed, nlme, glme, lme4, CLME, PLmixed, MCMCglmm, ggplot2, MissMech
Bayesian, ChemPhys, Econometrics, Environmetrics, Finance, MixedModels, OfficialStatistics, Phylogenetics, Psychometrics, Spatial, SpatioTemporal, Survival, TeachingStatistics
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Maruo, et al., "bcmixed: A Package for Median Inference on Longitudinal Data with the Box--Cox Transformation", The R Journal, 2021
BibTeX citation
@article{RJ-2021-083, author = {Maruo, Kazushi and Ishii, Ryota and Yamaguchi, Yusuke and Gosho, Masahiko}, title = {bcmixed: A Package for Median Inference on Longitudinal Data with the Box--Cox Transformation}, journal = {The R Journal}, year = {2021}, note = {https://rjournal.github.io/}, volume = {13}, issue = {2}, issn = {2073-4859}, pages = {253-265} }