Protein structure data consist of several dihedral angles, lying on a multidimensional torus. Analyzing such data has been and continues to be key in understanding functional properties of proteins. However, most existing statistical methods assume that data lie in Euclidean spaces, and are thus ill-suited to angular data. In this paper, we introduce the package *ClusTorus*, specialized in analyzing multivariate angular data. The package collects tools and routines to perform algorithmic clustering and model-based clustering for data on the torus. In particular, the package enables the construction of conformal prediction sets and predictive clustering, based on kernel density estimates and mixture model estimates. A novel hyperparameter selection strategy for predictive clustering is also implemented, with improved stability and computational efficiency. We demonstrate the use of the package in clustering protein dihedral angles from two real data sets.

Multivariate angular or circular data have found applications in various research domains, including geology (e.g., paleomagnetic directions) and bioinformatics (e.g., protein dihedral angles). Due to the cyclic nature of angles, usual vector-based statistical methods are not directly applicable to such data. A \(p\)-variate angle \(\theta = (\theta_1,\cdots,\theta_p)^T\) lies on the \(p\)-dimensional torus \(\mathbb{T}^p = [0,2\pi)^p\), in which the angles \(0\) and \(2\pi\) are identified as the same point. Likewise, angles \(\theta\) and \(\theta\pm 2\pi\) represent the same data point on the torus. Thus, statistical models and predictions on the torus should reflect this geometric constraint.
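As a quick illustration of this identification, the following base-R sketch (not part of *ClusTorus*; the helper names are ours) wraps angles onto \([0, 2\pi)\) and computes an angular difference that respects the torus geometry.

```r
wrap <- function(theta) theta %% (2 * pi)   # identify theta with theta +/- 2*pi

# Componentwise angular difference, taking values in (-pi, pi]:
# equivalent to arg(exp(1i * (x - y))).
ang_diff <- function(x, y) atan2(sin(x - y), cos(x - y))

wrap(c(-pi / 2, 5 * pi))       # both mapped back into [0, 2*pi)
ang_diff(0.1, 2 * pi - 0.1)    # 0.2: the two angles are close on the torus
```

The naive Euclidean difference \(0.1 - (2\pi - 0.1) \approx -6.08\) would wrongly treat these two nearby angles as far apart.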

A prominent example in which multivariate angular data appear is the
analysis of protein structures. As described in Branden and Tooze (1999), the functional
properties of proteins are determined by the ordered sequences of amino
acids and their spatial structures. These structures are determined by
several dihedral angles, and thus, protein structures are commonly
described on multidimensional tori. The \(p\)-dimensional torus
\(\mathbb{T}^p\) is the sample space we consider in this paper.
In particular, in the 2-dimensional case, the backbone chain angles
\(\phi,\psi\) of a protein are commonly visualized by the Ramachandran
plot, a scatter plot of dihedral angles on the 2-dimensional flattened
torus \(\mathbb{T}^2\) (Lovell et al. 2003; Oberholser 2010).
In Figure 1, several clustering results are visualized on
the Ramachandran plot for the protein angles of SARS-CoV-2 virus, which
caused the 2020-2021 pandemic (Coronaviridae Study Group of the International Committee on Taxonomy of Viruses. et al. 2020). Since the
structures in protein angles are related to functions of the protein, it
is of interest to analyze the scatter of the angles through, for
example, density estimation and clustering. Note that the protein
structure data are routinely collected and publicly available at Protein
Data Bank (Berman et al. 2003) and importing such data into R is made easy by the
package *bio3d*
(Grant et al. 2006; Grant et al. 2021).

We introduce the R package
*ClusTorus* (Jung and Hong 2021)
which provides various tools for handling and clustering multivariate
angular data on the torus. The package provides angular adaptations of
usual clustering methods such as the \(k\)-means clustering and pairwise
angular distances, which can be used as an input for distance-based
clustering algorithms, and implements a novel clustering method based on
conformal prediction framework (Vovk et al. 2005). Also implemented in the package
are the EM algorithms and an elliptical \(k\)-means algorithm for fitting
mixture models on the torus, and a kernel density estimation. We will
introduce various clustering tools implemented in the package,
explaining choices in conformal prediction using two sets of example
data. We also present the theoretical and technical background, and
demonstrate these tools with R codes.

For data on the torus, there exist a few previous works on mixture
modeling and clustering. Mardia et al. (2007) proposed a mixture of bivariate
von Mises distributions for data on \(\mathbb{T}^2\), with an application
to modeling protein backbone chain angles. Mardia et al. (2012) proposed a
density estimation on the torus, based on a mixture of approximated von
Mises sine distributions, for higher dimensional cases, but the proposed
EM algorithm tends to be unstable when sample sizes are limited. The R
package *BAMBI*
(Chakraborty and Wong 2019, 2020) provides routines to fit such von Mises
mixture models using MCMC, but is only applicable to bivariate (and
univariate) angles in \(\mathbb{T}^2\). We have implemented EM algorithms
(for \(p=2\)) and the elliptical \(k\)-means algorithm (for any \(p\)),
originally proposed for vector-valued data
(Sung and Poggio 1998; Bishop 2006; Shin et al. 2019), for fitting mixture models on the
torus. To the best of the authors' knowledge, *ClusTorus* is the first
implementation of methods for fitting mixture models on multidimensional
tori of any dimension.

Algorithmic clustering for data on the torus has also been proposed. For
example, Gao et al. (2018) used an extrinsic \(k\)-means algorithm for clustering
protein angles. While this algorithm is not always satisfactory, it is
implemented in *ClusTorus* for a quick-and-dirty analysis. The top right
panel of Figure 1 depicts the result of applying this
algorithm with \(k = 3\). Note that the popular R packages
*mixtools* (Benaglia et al. 2009)
and *mclust* (Scrucca et al. 2016)
produce misleading clustering results when applied to data on the
torus. As we illustrate in Figure 1, these tools do not
take into account the cyclic nature of the angular data.

The main contribution of *ClusTorus* is an implementation of the
predictive clustering approaches of Jung et al. (2021) and Shin et al. (2019). For this,
the conformal prediction framework of Vovk et al. (2005) is extended for multivariate
angular data. Conformal prediction is a distribution-free method of
constructing prediction sets, and our implementation uses kernel density
estimates and mixture models, both based on the multivariate von Mises
distribution (Mardia et al. 2012). Furthermore, by using Gaussian-like
approximations of the von Mises distributions and a graph-theoretic
approach, flexible clusters, composed of unions of ellipsoids on
\(\mathbb{T}^p\), can be identified. The proposed predictive clustering
can be obtained by simply using `clus.torus` as follows.

```
library(ClusTorus)
set.seed(2021)
ex <- clus.torus(SARS_CoV_2)
plot(ex)
```

The result of the predictive clustering is visualized in the top left
panel of Figure 1, which is generated by `plot(ex)`. The
dataset `SARS_CoV_2`, included in *ClusTorus*, collects the dihedral
angles \(\phi,\psi\) in the backbone chain B of the SARS-CoV-2 spike
glycoprotein. The raw coronavirus protein data are available at the Protein
Data Bank with id 6VXX (Walls et al. 2020), and can be retrieved by using the R
package *bio3d*. The function `clus.torus` performs three core
procedures—conformal prediction, hyperparameter selection and cluster
assignment—for predictive clustering.

The rest of this article focuses on introducing the three core
procedures: (i) the conformal prediction framework, including our
choices of the conformity scores, (ii) hyperparameter selection and
(iii) cluster assignment. After demonstrating how the package
*ClusTorus* can be used for clustering of \(\mathbb{T}^2\)- and
\(\mathbb{T}^4\)-valued data, we describe how the main function
`clus.torus` and other clustering algorithms such as \(k\)-means and
hierarchical clustering can be used to analyze data on the torus. In the
Appendix, we provide technical details and options in fitting mixture
models on the torus, and a list of S3 classes defined in *ClusTorus*.

The conformal prediction framework (Vovk et al. 2005) is one of the main ingredients of our development. Based on the work of Vovk et al. (2005) and Lei et al. (2013, 2015), we briefly introduce the basic concepts and properties of conformal prediction. Suppose that we observe a sample of size \(n\), \(X_i\sim F\) where \(X_i\in \mathbb{T}^p\) for each \(i\) and that the sequence \(\mathbb{X}_n =\left\{X_1,\cdots,X_n\right\}\) is exchangeable. Then, for a new \(X_{n+1}\sim F\), the prediction set \(C_n = C_n\left(\mathbb{X}_n\right)\) is said to be valid at level \(1-\alpha\) if: \[\begin{aligned} \label{eq:1} P\left(X_{n+1}\in C_n\right)\ge 1-\alpha,\quad \alpha\in \left(0,1\right), \end{aligned} \tag{1}\] where \(P\) is the corresponding probability measure for \(\mathbb{X}_{n+1}=\mathbb{X}_n\cup\left\{X_{n+1}\right\}\).

For a given \(x\in \mathbb{T}^p\), write \(\mathbb{X}_{n}(x)=\mathbb{X}_n\cup\left\{x\right\}\). Consider the null hypothesis \(H_0: X_{n+1}=x\), where \(X_{n+1}\sim F\). To test the hypothesis, the conformal prediction framework uses conformity scores \(\sigma_i\) defined as follows: \[\begin{aligned} \sigma_i\left(x\right) &:= g\left(X_i, \mathbb{X}_{n}\left(x\right)\right),~ \forall i=1,\cdots,n+1,\\ \sigma\left(x\right) &:=g\left(x, \mathbb{X}_{n}\left(x\right)\right) = \sigma_{n+1}\left(x\right), \end{aligned}\] for some real valued function \(g\), which measures the conformity or similarity of a point to the given set. If \(X_{\left(1\right)},\cdots,X_{\left(n+1\right)}\) are ordered to satisfy \(\sigma_{\left(1\right)}\le\cdots\le\sigma_{\left(n+1\right)}\) for \(\sigma_{\left(i\right)} = g\left(X_{\left(i\right)}, \mathbb{X}_{n+1}\right)\), then we may say that \(X_{\left(n+1\right)}\) is the most similar point to \(\mathbb{X}_{n+1}\).

Consider the following quantity: \[\begin{aligned} \pi\left(x\right) = \frac{1}{n+1}\sum_{i=1}^{n+1}I\left(\sigma_i\left(x\right)\le\sigma_{n+1}\left(x\right)\right),\quad I(A) = \begin{cases} 1,\quad A\text{ is true,}\\ 0,\quad \text{otherwise,} \end{cases} \end{aligned}\] which can be understood as a p-value for the null hypothesis \(H_0\). The conformal prediction set of level \(1-\alpha\) is constructed as \[\begin{aligned} C^{\alpha}_n=\left\{x:\pi\left(x\right)>\alpha\right\}. \end{aligned}\] Because the sequence \(\mathbb{X}_n(x)\) is exchangeable under \(H_0\), \(\pi\left(x\right)\) is uniformly distributed on \(\left\{\frac{1}{n+1},\cdots,1\right\}\). With this property, it can be shown that the conformal prediction set is valid for finite samples, i.e., (1) holds with \(C_n\) replaced by \(C_n^{\alpha}\) for any \(F\), that is, the prediction set is distribution-free (Lei et al. 2013). The performance of the conformal prediction highly depends on the choice of conformity score \(\sigma\). In some previous works on conformal prediction (Lei et al. 2013, 2015; Shin et al. 2019; Jung et al. 2021), the quality of prediction sets using density based conformity scores has been satisfactory.
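The construction of \(\pi(x)\) can be mimicked in a few lines of base R. The sketch below uses a toy conformity score (average closeness to the augmented sample) on the real line purely for illustration; it is not one of the scores implemented in *ClusTorus*.

```r
# Full conformal p-value pi(x) for a new point x and sample X, with a
# toy conformity score g (larger = more similar to the augmented sample).
conformal_pvalue <- function(x, X) {
  Xa <- c(X, x)                          # augmented sample X_n(x)
  g  <- function(u) -mean(abs(u - Xa))   # toy conformity score g(u, X_n(x))
  sigma <- vapply(Xa, g, numeric(1))     # sigma_1(x), ..., sigma_{n+1}(x)
  mean(sigma <= sigma[length(Xa)])       # proportion with sigma_i <= sigma_{n+1}
}

conformal_pvalue(0.05, c(0, 0.1, -0.1))  # near the data: large p-value
conformal_pvalue(100,  c(0, 0.1, -0.1))  # far from the data: p-value 1/4
```

A point far from the sample conforms least, so only its own score falls below \(\sigma_{n+1}\), giving the minimal p-value \(1/(n+1)\).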

We demonstrate a construction of the conformal prediction set with a
kernel density estimate-based conformity score, defined later in
(2), for the data shown in Figure 1. With the
conformity score given by (2), `cp.torus.kde` computes the
conformal prediction set \(C_n^\alpha\) at a given level \(1-\alpha\)
(\(\alpha=0.1\) below), by performing the kernel density estimation. The
function tests the inclusion in \(C_n^\alpha\) of each point \((\phi,\psi)\)
over a fine grid of \(\mathbb{T}^2\), and the result of the testing is
shown as the boolean indices in the column `Cn` of the output below. The
columns `Lminus` and `Lplus` provide approximated prediction sets,
defined in Jung et al. (2021).

```
<- cp.torus.kde(SARS_CoV_2)
cp.kde
cp.kde
sets (Lminus, Cn, Lplus) based on kde with concentration 25
Conformal prediction
= 0.1 :
Testing inclusion to the conformal prediction set with level -------------
phi psi Lminus Cn Lplus level1 0.00000000 0 FALSE FALSE FALSE 0.1
2 0.06346652 0 FALSE FALSE FALSE 0.1
3 0.12693304 0 FALSE FALSE FALSE 0.1
4 0.19039955 0 FALSE FALSE FALSE 0.1
5 0.25386607 0 FALSE FALSE FALSE 0.1
6 0.31733259 0 FALSE FALSE FALSE 0.1
7 0.38079911 0 FALSE FALSE FALSE 0.1
8 0.44426563 0 FALSE FALSE FALSE 0.1
9 0.50773215 0 FALSE FALSE FALSE 0.1
10 0.57119866 0 FALSE FALSE FALSE 0.1
9990 rows are omitted.
```

The concentration parameter \(\kappa\) of the kernel density estimation
and the level(s) of the prediction set can be designated by providing
the arguments `concentration` and `level`. By default, these values are set
as `concentration = 25` and `level = 0.1`.

The output `cp.kde` is an S3 object with class `cp.torus.kde`, for which
a generic method `plot` is available. The conformal prediction set for
the `SARS_CoV_2` data can be displayed on the Ramachandran plot, as follows.
The result is shown in Figure 2.

`plot(cp.kde)`

If the sample size \(n\) and the number \(N\) of grid points over \(\mathbb{T}^p\) are large, evaluating \(n + N\) conformity scores may take a long time. That is, constructing the conformal prediction set suffers from high computational costs. A workaround for this inefficiency is inductive conformal prediction, which enjoys significantly lower computational cost. The inductive conformal prediction framework is based on splitting the data into two sets. The algorithm for inductive conformal prediction is given in Algorithm 1.

**procedure** inductive conformal prediction(\(\left\{X_1,\cdots,X_n\right\}, \alpha, n_1 < n\))

Split the data randomly into \(\mathbb{X}_1=\left\{X_1,\cdots, X_{n_1}\right\}\) and \(\mathbb{X}_2=\left\{X_{n_1+1},\cdots, X_{n}\right\}\).

Construct \(\sigma\) with \(\sigma\left(x\right) = g\left(x,\mathbb{X}_1\right)\) for some function \(g\).

Put \(\sigma_i = g\left(X_{n_1+i},\mathbb{X}_1\right)\) and order as \(\sigma_{\left(1\right)}\le\cdots\le\sigma_{\left(n_2\right)}\), where \(n_2=n-n_1\).

Construct \(\hat{C}^{\alpha}_n = \left\{x:\sigma(x)\ge\sigma_{\left(i_{n_2,\alpha}\right)}\right\}\), where \(i_{n,\alpha} = \lfloor\left(n+1\right)\alpha\rfloor\).

**end procedure**
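Algorithm 1 is short enough to sketch directly. The toy version below works on the real line with a Gaussian-kernel conformity score; the function name and the score are ours, for illustration only, while `icp.torus` implements the torus versions.

```r
# Toy split/inductive conformal prediction on the real line.
# Returns a membership test for the prediction set Chat_n^alpha.
split_conformal <- function(X, alpha = 0.1) {
  n  <- length(X)
  n1 <- floor(n / 2)                    # split sizes n1 and n2 = n - n1
  id <- sample(n, n1)
  X1 <- X[id]                           # used to build the score sigma
  X2 <- X[-id]                          # used to calibrate the cutoff
  g  <- function(u)
    vapply(u, function(v) mean(dnorm(v, mean = X1)), numeric(1))
  sigma  <- sort(g(X2))                 # sigma_(1) <= ... <= sigma_(n2)
  cutoff <- sigma[floor((length(X2) + 1) * alpha)]
  function(x) g(x) >= cutoff            # x is covered iff its score is large
}

set.seed(1)
C_hat <- split_conformal(rnorm(200))
C_hat(c(0, 100))   # central point included, gross outlier excluded
```

Only \(n_2\) calibration scores are sorted once; each new query point then costs a single score evaluation, which is the source of the computational savings.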

While the sizes \(n_1\) and \(n_2\) of the two split data sets can be
arbitrary, they are typically set to be equal. It is well known that the
output \(\hat{C}^\alpha_n\) of the algorithm also satisfies the
distribution-free finite-sample validity (Vovk et al. 2005; Lei et al. 2015). For fast
computation, the inductive conformal prediction is primarily used in
constructing prediction sets and clustering in our implementation of
*ClusTorus*. Specifically, `icp.torus` implements Algorithm
1 for several prespecified conformity
scores. As already mentioned, the conformity score \(\sigma\) needs to be
chosen carefully for good clustering performance.

Before we discuss our choices of the conformity scores, we first
illustrate how the functions in *ClusTorus* are used to produce
inductive conformal prediction sets. The following code shows a
calculation of the inductive conformal prediction set for the data
`SARS_CoV_2`. The conformal prediction set with the conformity score
given by kernel density estimates (2) can be constructed by
`icp.torus` and `icp.torus.eval`. The function `icp.torus` computes the
\(\sigma_i\)'s in line 4 of Algorithm 1,
and `icp.torus.eval` tests whether pre-specified evaluation points are
included in \(\hat{C}_n^\alpha\). If these evaluation points are not
supplied, then `icp.torus.eval` creates a grid of size \(100 \times 100\)
(for \(p = 2\)).

```
set.seed(2021)
icp.torus.kde <- icp.torus(SARS_CoV_2, model = "kde", concentration = 25)
icp.kde <- icp.torus.eval(icp.torus.kde, level = 0.1)
icp.kde
Conformal prediction set (Chat_kde)
Testing inclusion to the conformal prediction set with level = 0.1:
-------------
           X1 X2 inclusion
1  0.00000000  0     FALSE
2  0.06346652  0     FALSE
3  0.12693304  0     FALSE
4  0.19039955  0     FALSE
5  0.25386607  0     FALSE
6  0.31733259  0     FALSE
7  0.38079911  0     FALSE
8  0.44426563  0     FALSE
9  0.50773215  0     FALSE
10 0.57119866  0     FALSE
9990 rows are omitted.
```

In the code above, the data splitting for `icp.torus` is done
internally, and can be inspected by `icp.torus.kde$split.id`.

We now introduce our choices for the conformity score \(\sigma\) in the next two subsections.

For the 2-dimensional case, Jung et al. (2021) proposed to use the kernel
density estimate based on the von Mises kernel (Marzio et al. 2011) for the
conformity score. A natural extension to the \(p\)-dimensional tori, for
\(p\ge 2\), is
\[\begin{aligned}
\label{eq:6}
g\left(u, \mathbb{X}_{n}\left(x\right)\right) = \frac{1}{n+1}\sum_{i=1}^{n+1}K_\kappa\left(u-X_i\right),\quad K_\kappa\left(v\right) = \prod_{i=1}^p\frac{e^{\kappa\cos\left(v_i\right)}}{2\pi I_0\left(\kappa\right)},\quad v = \left(v_1,\cdots,v_p\right)^T\in\mathbb{T}^p
\end{aligned} \tag{2}\]
where \(I_0\) is the modified Bessel function of the first kind of order
0, and \(\kappa\) is a prespecified concentration parameter. The function
`kde.torus` provides the multivariate von Mises kernel density
estimate. For conformal prediction, we take
\(\sigma\left(x_i\right) = g\left(x_i,\mathbb{X}_n\left(x\right)\right)\),
and for inductive conformal prediction, we take
\(\sigma\left(x\right) = g\left(x,\mathbb{X}_1\right)\).
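For reference, the estimate in (2) can be mirrored in a few lines of base R, computed on the log scale for numerical stability. This sketch (function name ours) only illustrates the formula; `kde.torus` is the package implementation.

```r
# Product von Mises kernel density estimate at a point u, given an
# n x p matrix X of angles and a concentration parameter kappa.
vm_kde <- function(u, X, kappa = 25) {
  p <- ncol(X)
  # log of the product kernel K_kappa(v) in (2)
  log_kernel <- function(v)
    sum(kappa * cos(v)) - p * log(2 * pi * besselI(kappa, 0))
  mean(exp(apply(sweep(X, 2, u, "-"), 1, log_kernel)))
}

X <- rbind(c(0, 0), c(pi, pi))
vm_kde(c(0, 0), X)        # high density at a data point
vm_kde(c(2 * pi, 0), X)   # same value: the estimate is 2*pi-periodic
```

Because the kernel depends on the data only through cosines of angular differences, no explicit wrapping of the inputs is needed.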

Our next choices of conformity scores are based on mixture models. Since the multivariate normal distributions are not defined on \(\mathbb{T}^p\), we instead use the multivariate von Mises distribution (Mardia et al. 2008), whose density on \(\mathbb{T}^p\) is \[\begin{aligned} \label{eq:7} f\left(y;\mu,\kappa, \Lambda\right) = C\left(\kappa,\Lambda\right)\exp\left\{-\frac{1}{2}\left[\kappa^T\left(2 - 2c\left(y,\mu\right)\right)+s\left(y,\mu\right)^T\Lambda s\left(y,\mu\right)\right]\right\} \end{aligned} \tag{3}\] where \(y = \left(y_1,\cdots,y_p\right)^T \in\mathbb{T}^p\), \(\mu = \left(\mu_1,\cdots,\mu_p\right)^T \in\mathbb{T}^p\), \({\kappa} = (\kappa_1,\ldots,\kappa_p)^T \in (0,\infty)^p\), \(\Lambda = (\lambda_{j,l})\) for \(1\le j,l \le p\), \(-\infty < \lambda_{jl} < \infty\), \[\begin{aligned} c\left(y,\mu\right) &= \left(\cos\left(y_1-\mu_1\right),\cdots, \cos\left(y_p-\mu_p\right)\right)^T,\\ s\left(y,\mu\right) &= \left(\sin\left(y_1-\mu_1\right),\cdots, \sin\left(y_p-\mu_p\right)\right)^T,\\ \left(\Lambda\right)_{jl}&=\lambda_{jl}=\lambda_{lj},~j\ne l,\quad \left(\Lambda\right)_{jj}=\lambda_{jj}=0, \end{aligned}\] and for some normalizing constant \(C\left(\kappa,\Lambda\right)>0\). We write \(f\left(y;\theta\right) = f\left(y;\mu,\kappa,\Lambda\right)\) for \(\theta = (\mu,\kappa,\Lambda)\).
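Since the normalizing constant \(C(\kappa,\Lambda)\) is intractable in general, it is often enough to evaluate the exponent of (3). The following base-R sketch (helper name ours) computes the log-density up to the additive constant \(\log C(\kappa,\Lambda)\).

```r
# Log of the multivariate von Mises density (3), up to the additive
# constant log C(kappa, Lambda).
log_vm_unnorm <- function(y, mu, kappa, Lambda) {
  cy <- cos(y - mu)   # c(y, mu): componentwise cosines
  sy <- sin(y - mu)   # s(y, mu): componentwise sines
  -0.5 * (sum(kappa * (2 - 2 * cy)) + drop(t(sy) %*% Lambda %*% sy))
}

Lambda <- matrix(c(0, 0.5, 0.5, 0), 2, 2)            # zero diagonal, symmetric
log_vm_unnorm(c(1, 2), c(1, 2), c(10, 10), Lambda)   # 0 at y = mu
```

Such unnormalized evaluations suffice for relative comparisons, e.g., within the max-mixture score discussed next.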

For any positive integer \(J\) and a mixing probability \(\boldsymbol{\pi} = \left(\pi_1,\cdots,\pi_J\right)\), consider a \(J\)-mixture model: \[\begin{aligned} \label{eq:8} p\left(u;\boldsymbol{\pi},\boldsymbol{\theta}\right) = \sum_{j=1}^J\pi_jf\left(u;\theta_j\right) \end{aligned} \tag{4}\] where \(\boldsymbol{\theta} = \left(\theta_1,\cdots,\theta_J\right)\), \(\theta_j = (\mu_{j}, \kappa_{j}, \Lambda_j)\) for \(j=1,\cdots,J\). Let \(\left(\hat{\boldsymbol{\pi}}, \hat{\boldsymbol{\theta}}\right)\) be appropriate estimators of \(\left(\boldsymbol{\pi}, \boldsymbol{\theta}\right)\) based on \(\mathbb{X}_1\). The plug-in density estimate based on (4) is then \[\begin{aligned} \label{eq:9} p\left(\cdot;\hat{\boldsymbol{\pi}},\hat{\boldsymbol{\theta}}\right) = \sum_{j=1}^J\hat{\pi}_jf\left(\cdot;\hat{\theta}_j\right), \end{aligned} \tag{5}\] which can be used as a conformity score by setting \(g\left(\cdot,\mathbb{X}_1\right)=\hat{p}\left(\cdot\right)\). Assuming high concentrations, an alternative conformity score can be set as \(g\left(\cdot,\mathbb{X}_1\right)=p^{max}\left(\cdot,\hat{\boldsymbol{\pi}},\hat{\boldsymbol{\theta}}\right)\) where \[\begin{aligned} \label{eq:10} p^{max}\left(u;\hat{\boldsymbol{\pi}},\hat{\boldsymbol{\theta}}\right):=\max_{j=1,\cdots,J}\left(\hat{\pi}_jf\left(u;\hat{\theta}_j\right)\right)\approx p\left(u;\hat{\boldsymbol{\pi}},\hat{\boldsymbol{\theta}}\right). \end{aligned} \tag{6}\]
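The max-mixture approximation (6) simply replaces the sum in (5) by a maximum over components, which is accurate when a single component dominates at \(u\). A minimal sketch, with hypothetical helper names and toy numbers:

```r
# Plug-in mixture score (5) and its max-mixture approximation (6),
# given the J component density values f(u; theta_j) at a point u.
mix_score    <- function(pis, dens) sum(pis * dens)   # (5)
maxmix_score <- function(pis, dens) max(pis * dens)   # (6)

pis  <- c(0.7, 0.2, 0.1)
dens <- c(2.0, 0.01, 0.005)   # one dominant component at u
mix_score(pis, dens)          # 1.4025
maxmix_score(pis, dens)       # 1.4, close to the full mixture value
```

The max-mixture score never exceeds the mixture score, and the gap shrinks as the components become more separated.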

On the other hand, Mardia et al. (2012) introduced an approximated density function \(f^*\) for the \(p\)-variate von Mises sine distribution (3) for sufficiently high concentrations and when \(\Sigma\succ0\): \[\begin{aligned} \notag f^*\left(y;\mu,\Sigma\right) = \left(2\pi\right)^{-p/2}\left|\Sigma\right|^{-1/2}\exp\left\{-\frac{1}{2}\left[\kappa^T\left(2 - 2c\left(y,\mu\right)\right)+s\left(y,\mu\right)^T\Lambda s\left(y,\mu\right)\right]\right\} \end{aligned}\] where \(\left(\Sigma^{-1}\right)_{jl} = \lambda_{jl}\) for \(j\ne l\) and \(\left(\Sigma^{-1}\right)_{jj} = \kappa_j\). By further approximating via \(\sin\theta \approx \theta\) and \(\cos\theta \approx 1-\frac{\theta^2}{2}\), we write \[\begin{aligned} f^*\left(y;\mu,\Sigma\right) \approx \left(2\pi\right)^{-p/2}\left|\Sigma\right|^{-1/2}\exp\left\{-\frac{1}{2}\left[\left(y\ominus\mu\right)^T\Sigma^{-1}\left(y\ominus\mu\right)\right]\right\},\label{eq:11} \end{aligned} \tag{7}\] where the angular subtraction \(\ominus\) stands for \[\begin{aligned} \notag X\ominus Y := \left(\arg\left(e^{i\left(\phi_{x1} - \phi_{y1}\right)}\right), \cdots, \arg\left(e^{i\left(\phi_{xp} - \phi_{yp}\right)}\right)\right)^T, \end{aligned}\] for \(X = \left(\phi_{x1},\cdots,\phi_{xp}\right)^T\in\mathbb{T}^p\) and \(Y = \left(\phi_{y1},\cdots,\phi_{yp}\right)^T\in\mathbb{T}^p\), as defined in Jung et al. (2021) for \(p = 2\). 
By replacing the von Mises density \(f\) in (6) with the approximate normal density (7), \(\log\left(p^{max}\left(\cdot;\boldsymbol{\pi},\boldsymbol{\theta}\right)\right)\) is approximated by \[\begin{aligned} \log\left(p^{max}\left(u;\boldsymbol{\pi},\boldsymbol{\theta}\right)\right) &\approx \frac{1}{2}\max_j e\left(u;\pi_j,\theta_j\right) + c, \nonumber\\ e\left(u;\pi_j,\theta_j\right)&=-\left(u\ominus\mu_j\right)^T\Sigma_j^{-1}\left(u\ominus\mu_j\right) +2\log\pi_j-\log\left|\Sigma_j\right| \label{eq:ehatj} \end{aligned} \tag{8}\] where \(\theta_j=(\mu_{j},\Sigma_j)\), \(\mu_j = (\mu_{1j},\cdots,\mu_{pj})^T\in\mathbb{T}^p\), \(\Sigma_j\in\mathbb{R}^{p\times p}\), and \(c\in\mathbb{R}\) is a constant. Our last choice of the conformity score is \[\begin{aligned} \label{eq:12} g\left(\cdot,\mathbb{X}_{1}\right)=\max_je\left(\cdot,\hat{\pi}_j,\hat{\theta}_j\right). \end{aligned} \tag{9}\] With this choice of conformity score, the conformal prediction set can be expressed as a union of ellipsoids on the torus. That is, the following equalities hold (Shin et al. 2019; Jung et al. 2021): letting \(C_n^e\) be the level \(1-\alpha\) prediction set using (9), \[\begin{aligned} \notag C^e_n&:=\left\{x\in\mathbb{T}^p:g\left(x,\mathbb{X}_{1}\right)\ge g\left(X_{\left(i_{n_2,\alpha}\right)},\mathbb{X}_{1}\right)\right\}\\\label{eq:14} &=\bigcup_{j=1}^J\hat{E}_j\left(\sigma_{\left(i_{n_2,\alpha}\right)}\right) \end{aligned} \tag{10}\] where \(\hat{E}_j\left(t\right)=\left\{x\in\mathbb{T}^p:\left(x\ominus\hat{\mu}_j\right)^T\hat{\Sigma}_j^{-1}\left(x\ominus\hat{\mu}_j\right)\le 2\log\hat{\pi}_j-\log\left|\hat{\Sigma}_j\right| - t\right\}\) for \(t\in\mathbb{R}\). Note that \(\hat{E}_j\left(t\right)\) is empty if \(t\ge 2\log\hat{\pi}_j-\log\left|\hat{\Sigma}_j\right|\).
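A direct transcription of the score in (8) and (9) in base R might look as follows; the helper names are ours, and *ClusTorus* computes this internally when `model = "kmeans"`.

```r
# Componentwise angular subtraction, as arg(exp(1i * (x - y))), in (-pi, pi].
ang_sub <- function(x, y) atan2(sin(x - y), cos(x - y))

# e(u; pi_j, theta_j) in (8) for each component, and the conformity
# score (9) = max_j e(u; pi_j, theta_j).
e_score <- function(u, pis, mus, Sigmas) {
  # pis: length-J weights; mus: J x p matrix of centers;
  # Sigmas: list of J covariance matrices (p x p)
  ej <- vapply(seq_along(pis), function(j) {
    d <- ang_sub(u, mus[j, ])
    -drop(t(d) %*% solve(Sigmas[[j]]) %*% d) +
      2 * log(pis[j]) - log(det(Sigmas[[j]]))
  }, numeric(1))
  max(ej)
}

# One component centered at the origin with identity covariance:
e_score(c(0, 0), 1, matrix(0, 1, 2), list(diag(2)))       # 0
e_score(c(pi / 2, 0), 1, matrix(0, 1, 2), list(diag(2)))  # approx -(pi/2)^2
```

Thresholding this score at \(\sigma_{(i_{n_2,\alpha})}\) yields exactly the union of ellipsoids in (10), one per mixture component.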

We have implemented the four conformity scores described in the previous section. These are based on:

- the kernel density estimate (2),
- the mixture model (5),
- the max-mixture model (6), and
- the ellipsoids obtained by approximating the max-mixture (9).

The function `icp.torus` in *ClusTorus* computes these conformity scores
using the inductive conformal prediction framework, and returns
`icp.torus` object(s). Table 1 illustrates
several important arguments of the function `icp.torus`.

Arguments | Descriptions |
---|---|
`data` | \(n \times d\) matrix of toroidal data on \([0, 2\pi)^d\) or \([-\pi, \pi)^d\) |
`model` | A string. One of `"kde"`, `"mixture"`, and `"kmeans"`, which determines the model or estimation method. If `"kde"`, the model is based on the kernel density estimate; it supports the kde-based conformity score only. If `"mixture"`, the model is based on the von Mises mixture, fitted with an EM algorithm; it supports the von Mises mixture and its variant-based conformity scores. If `"kmeans"`, the model is also based on the von Mises mixture, but the parameter estimation is implemented with the elliptical \(k\)-means algorithm illustrated in the Appendix; it supports the log-max-mixture-based conformity score only. If the dimension of the data space is greater than 2, only `"kmeans"` is supported. Default is `model = "kmeans"`. |
`J` | A scalar or numeric vector for the number(s) of components for `model = c("mixture", "kmeans")`. Default is `J = 4`. |
`concentration` | A scalar or numeric vector for the concentration parameter(s) for `model = "kde"`. Default is `concentration = 25`. |

The argument `model` of the function `icp.torus` indicates which
conformity score is used. By setting `model = "kde"`, the kde-based
conformity score (2) is used. By setting `model = "mixture"`, the
mixture model (5) is estimated by an EM algorithm, and
conformity scores based on (5), (6), and (9) are
all provided. Setting `model = "kmeans"` provides a mixture model fit by
the elliptical \(k\)-means algorithm and the conformity score based on
(9).

The arguments `J` and `concentration` specify the model fitting
hyperparameters. To compute conformity scores based on the kernel density
estimate (2), one needs to specify the concentration parameter
\(\kappa\). Likewise, the number of mixture components, \(J\), needs to be
specified in order to fit the mixture model (5) and the variants
(6) and (9). The function `icp.torus` takes either a
single value (e.g., `J = 4` is the default) or a vector (e.g.,
`J = 4:30` or `concentration = c(25,50)`) for the arguments `J` and
`concentration`. If `J` (or `concentration`) is a scalar, then
`icp.torus` returns an `icp.torus` object.

On the other hand, if `J` (or `concentration`) is a numeric vector
containing at least two values, then `icp.torus` returns a list of
`icp.torus` objects, one for each value in `J` (or `concentration`,
respectively). Typically, the hyperparameter \(J\) (or \(\kappa\)) is not
predetermined, and one needs to choose among a set of candidates. A list
of `icp.torus` objects, evaluated for each candidate in a vector-valued
`J` (or `concentration`), is required for our hyperparameter selection
procedure, discussed in a later section.

Let us present an R code example for creating an `icp.torus` object,
fitted with `model = "kmeans"` (the default value for the argument `model`)
and `J = 12`.

```
set.seed(2021)
icp.torus.12 <- icp.torus(SARS_CoV_2, J = 12)
plot(icp.torus.12, level = 0.1)
```