Common Spatial Patterns (CSP) is a widely used method for analysing electroencephalography (EEG) data, particularly for the supervised classification of brain activity. More generally, it can be used to distinguish between multivariate signals recorded over a time span for two different classes. CSP is based on the simultaneous diagonalization of the average covariance matrices of the signals from both classes, and it allows the data to be projected into a low-dimensional subspace. Once the data are represented in this subspace, a classification step is carried out. The original CSP method is based on the Euclidean distance between signals; here we extend it so that any distance appropriate for the data at hand can be used. Both the classical CSP and the new Distance-Based CSP (DB-CSP) are implemented in an R package called dbcsp.
Eigenvalue and generalized eigenvalue problems are very relevant techniques in data analysis. The well-known Principal Component Analysis (PCA), with the eigenvalue problem at its roots, was already established by the late seventies (Mardia et al. 1979). In mathematical terms, Common Spatial Patterns (CSP) is based on the generalized eigenvalue decomposition, or the simultaneous diagonalization of two matrices, to find projections in a low-dimensional space. Although PCA and CSP share several algebraic similarities, their main aims are different: PCA follows an unsupervised approach, whereas CSP is a two-class supervised technique. Besides, PCA is suitable for standard quantitative data arranged in \(`individuals \times variables'\) tables, while CSP is designed to handle multivariate time series of signals. That means that, while for PCA each individual or unit is represented by a classical numerical vector, for CSP each individual is represented by several signals recorded during a time span, i.e., by a \(`number of signals \times time span'\) matrix. CSP allows the individuals to be represented in a reduced-dimension space, a crucial step given the high-dimensional nature of the original data. CSP computes the average covariance matrices of the signals from the two classes to yield features whose variances are optimal for discriminating between the classes of measurements. Once the data are projected into a low-dimensional space, a classification step is carried out. The CSP technique was first proposed under the name Fukunaga-Koontz Transform in (Fukunaga and Koontz 1970) as an extension of PCA, and (Müller-Gerking et al. 1999) used it to discriminate electroencephalography (EEG) data in a movement task. Since then, it has been a widely used technique for analyzing EEG data and developing Brain-Computer Interfaces (BCI), with different variations and extensions (Blankertz et al. 2007a,b; Grosse-Wentrup and Buss 2008; Lotte and Guan 2011; Wang et al. 2012; Astigarraga et al. 2016; Darvish Ghanbar 2021). In (Wu et al. 2013), the subject-specific best time window and number of CSP features are fitted through a two-level cross-validation scheme within the Linear Discriminant classifier. (Samek et al. 2014) offer a divergence-based framework including several extensions of CSP. In general terms, the CSP filter maximizes the variance of the filtered or projected EEG signals for one class of movements while minimizing it for the signals of the other class. Similarly, it can be used to detect epileptic activity (Khalid et al. 2016) or other brain activity. BCI systems can also be of great help to people who suffer from cerebral palsy, or from other diseases or disabilities that prevent the normal use of their motor skills. These systems can considerably improve the quality of life of these people, for whom small advances and changes imply big improvements. BCI systems can also contribute to human vigilance detection, connected with occupations involving sustained-attention tasks. Among others, CSP and variations of it have been applied to the vigilance estimation task (Yu et al. 2019).
The original CSP method is based on the Euclidean distance between signals. However, as far as we know, a generalization allowing the use of any appropriate distance has not been introduced. The aim of the present work is to introduce such a novel Distance-Based generalization (DB-CSP). This generalization is of great interest, since these techniques can also offer good solutions in other fields where multivariate time series data arise, beyond pure electroencephalography data (Poppe 2010; Rodríguez-Moreno et al. 2020).
Although CSP in its classical version is a very well-known technique in the field of BCI, it is not implemented in R. In addition, as DB-CSP is a new extension of it, it is worth building an R package that includes both the CSP and DB-CSP techniques. The package offers functions in a user-friendly way for users less familiar with R, but it also offers complete information about its objects, so that reproducible analyses can be carried out and more advanced, customised analyses can be performed taking advantage of already well-known R packages.
The paper is organized as follows. First, we review the mathematical formulation of the Common Spatial Patterns method. Next, we present the core of our contribution, describing both the novel distance-based extension of CSP and the dbcsp package. Then, the main functions in dbcsp are introduced along with reproducible examples of their use. Finally, some conclusions are drawn.
Let us consider that we have \(n\) statistical individuals or units classified in two classes \(C_1\) and \(C_2\), with \(\#C_1 = n_1\) and \(\#C_2 = n_2\). For each unit \(i\) in class \(C_k\), data from \(c\) sources or signals are collected during \(T\) time units, and therefore unit \(i\) is represented by the matrix \(X_{ik}\) (\(i = 1, \ldots, n_k\, ; \; k=1, 2\)). For instance, for electroencephalograms, data are recorded by a \(c\)-sensor cap at each time unit \(t\) (\(t=1, \ldots, T\)). As usual, we consider that each \(X_{ik}\) is already scaled or pre-processed appropriately for the context of application; for instance, when working with EEG data, each signal should be band-pass filtered before its use.
The goal is to classify a new unit \(X\) into \(C_1\) or \(C_2\). To this end, first a projection into a low-dimensional subspace is carried out. Then, as a standard approach, the Linear Discriminant classifier (LDA) is applied, taking as input the log-variances of the projections obtained in the first step. The importance of the technique lies mainly in the first step; once it is done, LDA or any other classifier can be applied. For this reason, we focus on how the projection into a low-dimensional space is done, from the classical CSP point of view as well as from its novel extension DB-CSP (see Figure 1).
The main idea is to use a linear transform to project or filter the data into a low-dimensional subspace with a projection matrix, in such a way that each row consists of weights for the signals. The transformation maximizes the variance of the projected signals for one class while minimizing it for the other. The method performs a simultaneous diagonalization of the covariance matrices of both classes. Given data \(X_{11}, \ldots, X_{n_1 1}\) (matrices \(c \times T\)) from class \(C_1\) and \(X_{12}, \ldots, X_{n_2 2}\) (also matrices \(c \times T\)) from class \(C_2\), the following steps are needed:
All matrices are standardized so that traces of \(X_{i k}X_{i k}'\) are the same.
Compute average covariance matrices: \[B_k = \frac{1}{n_k}\sum_{i=1}^{n_k}X_{i k} X_{i k}' \, , \quad k=1,2\]
Look for directions \(W= (\mathbf{w}_1, \ldots, \mathbf{w}_c) \in \mathbb{R}^{c \times c}\) according to the criterion:
\[\begin{aligned} Maximize\; & tr(W'B_1W)\\ subject to\; & W'(B_1 + B_2)W = I\\ \end{aligned}\]
The solution is given by the generalized spectral decomposition \(B_1\mathbf{w} = \lambda B_2 \mathbf{w}\) choosing the first and the last \(q\) eigenvectors: \(W_{CSP}= (\mathbf{w}_1, \ldots, \mathbf{w}_q, \mathbf{w}_{c-q+1}, \ldots, \mathbf{w}_c)\).
Vectors \(\mathbf{w}_j\) provide weights so that the new signals \(X_{i 1}'\mathbf{w}_j\) and \(X_{i 2}'\mathbf{w}_j\) have high and low variability, respectively, for the first \(q\) vectors (\(j=1, \ldots, q\)), and vice-versa for the last \(q\) vectors (\(j=c-q+1, \ldots, c\)). To clarify the notation and interpretation, let us denote by \(\mathbf{a}_j=\mathbf{w}_j\) the first \(q\) vectors and by \(\mathbf{b}_j=\mathbf{w}_{c+1-j}\) the last \(q\). That way, and broadly speaking, the variability of elements in \(C_1\) is high when projecting on vectors \(\mathbf{a}_j\) and low on vectors \(\mathbf{b}_j\), and vice-versa for elements in class \(C_2\).
Finally, the log-variances of these few (\(2q\)) new signals are considered as input for the classification step, which is classically performed with Linear Discriminant Analysis (LDA). Obviously, any other classification technique can be used, as illustrated in the subsection Extending the example.
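The steps above can be sketched in a few lines of base R. This is only a minimal illustration with simulated toy data, not the package implementation; the helper name avg_cov and the toy data are ours:

``` r
set.seed(1)
# Toy data: two classes, c = 4 signals, T = 100 time points, 10 units each;
# in class 2 each row (signal) has a different standard deviation
x1 <- replicate(10, matrix(rnorm(4 * 100), 4, 100), simplify = FALSE)
x2 <- replicate(10, matrix(rnorm(4 * 100, sd = c(2, 1, 1, 0.5)), 4, 100),
                simplify = FALSE)

# Average covariance matrix of a list of c x T matrices
avg_cov <- function(xs) Reduce(`+`, lapply(xs, tcrossprod)) / length(xs)
B1 <- avg_cov(x1)
B2 <- avg_cov(x2)

# Simultaneous diagonalization: whiten with C = B1 + B2, then
# eigendecompose the whitened B1; the columns of W solve B1 w = lambda B2 w
C  <- B1 + B2
eC <- eigen(C, symmetric = TRUE)
Cinvsqrt <- eC$vectors %*% diag(1 / sqrt(eC$values)) %*% t(eC$vectors)
eS <- eigen(Cinvsqrt %*% B1 %*% Cinvsqrt, symmetric = TRUE)
W  <- Cinvsqrt %*% eS$vectors        # satisfies W' (B1 + B2) W = I

q <- 1
W_csp <- W[, c(1:q, (4 - q + 1):4)]  # keep the first and last q eigenvectors
```

The whitening step enforces the constraint \(W'(B_1+B_2)W = I\) of the criterion above, so the leading columns of \(W\) maximize the class-1 variance and the trailing ones minimize it.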
Following these ideas, Distance-Based CSP (DB-CSP) is an extension of the classical CSP method. In the same way as classical CSP, DB-CSP gives weights to the original sources or signals and obtains \(2q\) new signals which are useful for the discrimination between the two classes. Nevertheless, the considered distance between the signals can be any distance other than the Euclidean. The steps are the following:
Once we have the covariance matrices related to the chosen distance, the directions are found as in classical CSP and the new signals \(X_{i k}'\mathbf{a}_j\), \(X_{i k}'\mathbf{b}_j\) are built (\(j=1, \ldots, q\)). Again, for individuals in class \(C_1\) the variabilities of the projections on vectors \(\mathbf{a}\) and \(\mathbf{b}\) are high and low, respectively; for individuals in class \(C_2\) it is the other way round.
It is important to note that if the chosen distance does not produce a positive definite covariance matrix, it must be replaced by a similar one that is positive definite.
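This replacement can be reproduced directly with the nearPD function from the Matrix package, which is what the package uses internally (see the eig.tol slot below). A standalone illustration with an arbitrary non positive definite symmetric matrix:

``` r
library(Matrix)

# A symmetric matrix that is not positive definite
# (its determinant is negative, so it has a negative eigenvalue)
M <- matrix(c( 2, -1, 3,
              -1,  2, 1,
               3,  1, 1), nrow = 3, byrow = TRUE)

# Replace it with the most similar positive definite matrix
Mpd <- as.matrix(nearPD(M)$mat)
min(eigen(Mpd)$values) > 0   # the repaired matrix is positive definite
```

nearPD computes the nearest positive definite matrix in the Frobenius-norm sense, which is exactly the kind of repair described above.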
When the selected distance is the Euclidean, then, DB-CSP reduces to classical CSP.
Once the \(q\) directions \(\mathbf{a}_j\) and \(\mathbf{b}_j\) are calculated, the \(2q\) new signals are built. Many interesting characteristics of the new signals could be extracted, although the most important in the procedure is the variance. These characteristics of the new signals are the input data for the classification step.
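As an illustration of this feature-extraction step, the classical log-variance of each projected signal can be computed in one line. This is a standalone sketch; the helper name logvar_features and the toy data are ours, not part of dbcsp:

``` r
# Log-variance of each projected signal: the classical input features
# W: c x 2q projection matrix; X: c x T matrix of one unit (assumed given)
logvar_features <- function(X, W) log(apply(t(W) %*% X, 1, var))

# Toy usage with a stand-in orthonormal projection matrix
set.seed(2)
X <- matrix(rnorm(4 * 100), 4, 100)     # c = 4 signals, T = 100
W <- qr.Q(qr(matrix(rnorm(16), 4, 4)))  # hypothetical 4 x 4 projection
feats <- logvar_features(X, W)          # one feature per projected signal
```

Each unit is thereby reduced to a short numeric vector, ready to be fed to LDA or any other classifier.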
In this section, the structure of the package and the functions implemented are explained. The dbcsp package was developed for the free statistical R environment and it is available from the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/web/packages/dbcsp/index.html.
The input data are the corresponding \(n_1\) and \(n_2\) matrices \(X_{i k}\) of the \(n\) units classified in classes \(C_1\) and \(C_2\), respectively (\(i=1, \ldots, n_k\) ; \(k=1, 2\)). Let x1 and x2 be two lists of length \(n_1\) and \(n_2\), respectively, with the \(X_{i k}\) matrices (\(c \times T\)) as elements. NA values are allowed; they are imputed by interpolating with the surrounding values via the na.approx function in package zoo. To ensure the user is aware of the missing values and their imputation, a warning is printed. We also consider that new items to be classified are in list xt. The aforementioned first step of the method is carried out by building the object of class "dbcsp".
The dbcsp object

The dbcsp object is an S4 class created to compute the projection vectors \(W\). The object has the following slots:
Slots

- X1 = "list", X2 = "list", the lists X1 and X2 (lengths \(n_1\) and \(n_2\)) containing the matrices \(X_{i k}\) for the two classes \(C_1\) and \(C_2\), respectively (\(i=1, \ldots, n_k \, ; \; k=1,2\)).
- q = "integer", to determine the number of pairs of eigenvectors \(\mathbf{a}_j\) and \(\mathbf{b}_j\) that are kept. By default q=15.
- labels = "character", vector of two strings indicating the label names, by default the names of the elements in X1 and X2.
- type = "character", to set the type of distance to be considered, by default type='EUCL'. The supported distances include infnorm, ccor, sts, ... as well as bhattacharyya, bray, ... A custom distance function can also be used, e.g. a function dcustom which returns a scalar providing the distance value (\(d(\mathbf{x}_{ik},\mathbf{x}_{jk})\)) between signals \(\mathbf{x}_{ik}\) and \(\mathbf{x}_{jk}\) (\(i, j = 1, \ldots, n_k \, , \; k=1, 2\)). The name of the custom distance function is passed as a character string to the type parameter (type="dcustom"). The parallelDist package also allows the use of custom distances, but the distance function has to be defined using the cppXPtr function of the RcppXPtrUtils package, as explained in the User-defined distance functions section of the parallelDist package documentation.
- mixture = "logical", logical value indicating whether to use a mixture of distances or not (EUCL + other), by default mixture=FALSE.
- w = "numeric", weight for the mixture of distances \(D_{mixture} = w D_{euclidean} + (1-w) D_{type}\), by default w=0.5.
- training = "logical", logical value indicating whether or not to perform the classification, by default training=FALSE. If training=TRUE, LDA discrimination based on the log-variances of the projected sources is considered, following the classical approach in CSP.
- fold = "integer", integer value, by default fold=10. It controls the number of partitions for the \(k\)-fold validation procedure, if the classification is done.
- seed = "numeric", numeric value, by default seed=NULL. Set a seed in case you want to be able to replicate the results.
- eig.tol = "numeric", numeric value, by default eig.tol=1e-06. If the minimum eigenvalue is below this tolerance, the average covariance matrices are replaced by the most similar matrix that is positive definite. This is done via function nearPD in Matrix, and a warning message is printed to make the user aware of it.
- out = "list", list containing the elements of the output. Mainly, the matrix \(W\) with vectors \(\mathbf{a}_j\) and \(\mathbf{b}_j\) in element vectors, the log-variances of the filtered signals in proy, and the partitions considered in the \(k\)-fold approach for reproducibility purposes.
Usage
Following the standard procedure in R, an instance of class dbcsp is created via the new() constructor function:
``` r
new("dbcsp", X1 = x1, X2 = x2)
```
Slots X1 and X2 are compulsory since they contain the original data. When a slot is not specified, its default value is used. First, the S4 object of class dbcsp must be created. By default the Euclidean distance is used; nevertheless, it can be changed. For instance, the Dynamic Time Warping distance (Giorgino et al. 2009) can be set:
``` r
mydbcsp <- new('dbcsp', X1=x1, X2=x2, type='dtw')
```
or a mixture between this distance and the Euclidean can be
indicated by:
``` r
mydbcsp.mix <- new('dbcsp', X1=x1, X2=x2, labels=c("C1", "C2"),
                   mixture=TRUE, w=0.4, type="dtw")
```
Besides, a custom distance function can be defined and used when creating the object:
``` r
fn <- function(x, y) mean(1 - cos(x - y))
mydbcsp <- new("dbcsp", X1 = x, X2 = y, type = "fn")
```
It is worth mentioning that it is possible to reduce the computational time through the parallelDist custom distance option, where the function is defined in C++ and an external pointer to it is created by means of the cppXPtr function:
``` r
customEucli <- RcppXPtrUtils::cppXPtr(
  "double customDist(const arma::mat &A, const arma::mat &B) {
     return sqrt(arma::accu(arma::square(A - B)));
   }",
  depends = c("RcppArmadillo")
)
mydbcsp <- new('dbcsp', x1, x2, type = "customEucli")
```
The object contains all the information to carry out the classification task in a lower dimension space.
Functions plot and boxplot
For exploratory and descriptive purposes, the original signals \(X_{i k}\) and the projected ones can be plotted for the selected individual \(i\) in class \(k\), and the selected pair of dimensions \(\mathbf{a}_j\) and \(\mathbf{b}_j\) (\(i= 1, \ldots, n_k\), \(k=1,2\)).
Usage

``` r
plot(mydbcsp)
```

Arguments

- x, an object of class dbcsp.
- class, integer to indicate which of the two classes to access (1 or 2), by default class=1.
- index, integer to indicate which instance of the class to plot, by default index=1.
- vectors, integer to indicate which \(j\) projected signals are to be plotted. By default all the vectors used in the projection are plotted.
- pairs, logical; if TRUE, the pairs \(\mathbf{a}_j\) and \(\mathbf{b}_j\) of the indicated indices are also shown, by default pairs=TRUE.
- before, logical; if TRUE, the original signals are plotted, by default before=TRUE.
- after, logical; if TRUE, the signals after the projection are plotted, by default after=TRUE.
- legend, logical; if TRUE, a legend for the filtered signals is shown, by default legend=FALSE.
- getsignals, logical; if TRUE, the projected signals are returned.

Besides, the log-variances of the projected signals of both classes can be shown in boxplots. This graphic can help to understand the discriminative power present in the low-dimensional space.
Usage

``` r
boxplot(mydbcsp)
```

Arguments

- x, an object of class dbcsp.
- vectors, integer or vector of integers, indicating the indices of the projected vectors to plot, by default vectors=1.
- pairs, logical; if TRUE, the pairs \(\mathbf{a}_j\) and \(\mathbf{b}_j\) of the indicated indices are also shown, by default pairs=TRUE.
- show_log, logical; if TRUE, the logarithms of the variances are displayed, otherwise the variances, by default show_log=TRUE.

It is worth taking into account that in the aforementioned functions, values in argument vectors must lie between 1 and \(2q\), where \(q\) is the number of dimensions used to perform the DB-CSP algorithm when creating the dbcsp object. Therefore, values 1 to \(q\) correspond to vectors \(\mathbf{a}_1\) to \(\mathbf{a}_q\), and values \(q+1\) to \(2q\) correspond to vectors \(\mathbf{b}_1\) to \(\mathbf{b}_q\). Then, if pairs=TRUE, it is recommended that values in argument vectors lie in \(\{1, \ldots, q\}\), since their pairs are plotted as well. When values are above \(q\), it should be noted that they correspond to vectors \(\mathbf{b}_1\) to \(\mathbf{b}_q\). For instance, with q=15, boxplot(object, vectors=16, pairs=FALSE) shows vector \(\mathbf{b}_1\) \((16-q=1)\).
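This index convention can be made explicit with a small helper. The function vector_label is purely illustrative and is not part of dbcsp:

``` r
# Map an index in 1..2q to the corresponding eigenvector
# (a_1..a_q for indices 1..q, b_1..b_q for indices q+1..2q);
# hypothetical helper for illustration only, not in the dbcsp package
vector_label <- function(idx, q) {
  stopifnot(idx >= 1, idx <= 2 * q)
  if (idx <= q) paste0("a_", idx) else paste0("b_", idx - q)
}

vector_label(16, q = 15)   # index 16 maps to b_1, as in the example above
```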
Functions selectQ, train and predict
The functions in this section help with the classification step of the procedure. Function selectQ helps to find an appropriate dimension for the classification. Given different dimension values, the accuracy related to each dimension is calculated so that the user can assess which dimension of the reduced space is sufficient. Either a \(k\)-fold cross-validation or a holdout approach can be followed. Function train performs the Linear Discriminant classification based on the log-variances of the dimensions built in the dbcsp object. Since LDA has a geometric interpretation that makes the classifier sensible in more general situations (Duda et al. 2001), neither the normality nor the homoscedasticity of the data is checked. The accuracy of the classifier is computed based on the \(k\)-fold validation procedure. Finally, function predict performs the classification of new individuals.
Usage of selectQ

``` r
selectQ(mydbcsp)
```

Arguments

- object, an object of class dbcsp.
- Q, vector of integers representing the dimensions to use, by default Q=c(1,2,3,5,10,15).
- train_size, float between 0.0 and 1.0 representing the proportion of the data set to include in the train split, by default train_size=0.75.
- CV, logical indicating whether a \(k\)-fold cross-validation is performed or a hold-out approach (if TRUE, train_size is not used), by default CV=FALSE.
- folds, integer, number of folds to use if CV is performed.
- seed, numeric value, by default seed=NULL. Set a seed in case you want to be able to replicate the results.

This function returns the accuracy values related to each dimension set in Q. If CV=TRUE, the mean accuracy as well as the standard deviation among folds is also returned.
Usage of train

``` r
train(mydbcsp)
```

or embedded as a parameter in:

``` r
new('dbcsp', X1=x1, X2=x2, training=TRUE, type="dtw")
```

Arguments

- x, an object of class dbcsp.
- selected_q, integer value indicating the number of vectors to use when training the model. By default, all the dimensions considered when creating the dbcsp object.

Besides, arguments seed and fold are available.
It is important to note that in this way a classical analysis can be carried out: create the dbcsp object, select an appropriate dimension with selectQ, train the classifier and predict the class of new instances. However, it may be of interest to use other classifiers, or other characteristics in addition to or different from the log-variances. This more advanced procedure is explained below. See the basic analysis of the User guide with a real example section in order to visualize and follow the process of a first basic/classic analysis.
Usage of predict

``` r
predict(mydbcsp, X_test=xt)
```

Arguments

- object, an object of class dbcsp.
- X_test, list of matrices to be classified.
- true_targets, optional; if available, vector of true labels of the instances. Note that they must match the label names used when training the model.

User guide with a real example

To show an example beyond pure electroencephalography data, Action Recognition data is considered. Besides providing a reproducible example to show the use of the implemented functions and the results they offer, this Action Recognition data set is included in the package. The data set contains the skeleton data extracted from videos of people performing six different actions, recorded by a semi-humanoid robot. It consists of a total of 272 videos with 6 action categories. There are around 45 clips in each category, performed by 46 different people. Each instance is composed of 50 signals (\(xy\) coordinates for 25 body key points extracted using OpenPose (Cao et al. 2019)), where each signal has 92 values, one per frame. These are the six categories included in the data set:
- Come: gesture for telling the robot to come to you. There are 46 instances for this class.
- Five: gesture of ‘high five’. There are 45 instances for this class.
- Handshake: gesture of handshaking with the robot. There are 45 instances for this class.
- Hello: gesture for saying hello to the robot. There are 44 instances for this class.
- Ignore: ignore the robot, pass by. There are 46 instances for this class.
- Look at: stare at the robot in front of it. There are 46 instances for this class.
The data set is accessible via AR.data and more specific information can be found in (Rodríguez-Moreno et al. 2020). Each class is a list of matrices of \([K \times num\_frames]\) dimensions, where \(K=50\) signals and \(num\_frames=92\) values. As mentioned before, the 50 signals represent the \(xy\) coordinates of the 25 body key points extracted by OpenPose.
For example, two different classes can be accessed this way:

``` r
x1 <- AR.data$come
x2 <- AR.data$five
```

where x1 is a list of 46 instances of \([50x92]\) matrices of the come class and x2 is a list of 45 instances of \([50x92]\) matrices of the five class. An example of skeleton sequences for both classes is shown in Figure 2 (left, for class come and right, for class five).
Next, the use of functions in dbcsp
is shown based on this data set.
First a basic/classic analysis is performed.
Let us consider an analysis using 15-dimensional projections and the Euclidean distance. At a first step the user can obtain vectors \(W\) by:
``` r
x1 <- AR.data$come
x2 <- AR.data$five
mydbcsp <- new('dbcsp', X1=x1, X2=x2, q=15, labels=c("C1", "C2"))
summary(mydbcsp)
```
When creating the object mydbcsp, the vectors \(W\) are calculated. As indicated by the parameter q=15, the first and last 15 eigenvectors are retained. With summary, the obtained output is:
```
There are 46 instances of class C1 with [50x92] dimension.
There are 45 instances of class C2 with [50x92] dimension.
The DB-CSP method has used 15 vectors for the projection.
EUCL distance has been used.
Training has not been performed yet.
```
Now, if the user knows from the beginning that 3 is an appropriate dimension, the classification step could be done while creating the object. Using classical analysis, with for instance 10-fold, LDA as classifier and log-variances as characteristics, the corresponding input and summary output are:
``` r
mydbcsp <- new('dbcsp', X1=x1, X2=x2, q=3, labels=c("C1", "C2"),
               training=TRUE, fold = 10, seed = 19)
summary(mydbcsp)
```

```
There are 46 instances of class C1 with [50x92] dimension.
There are 45 instances of class C2 with [50x92] dimension.
The DB-CSP method has used 3 vectors for the projection.
EUCL distance has been used.
An accuracy of 0.9130556 has been obtained with 10 fold cross validation
and using 3 vectors when training.
```
If a closer view of the accuracies among the folds is needed, the user
can obtain them from the out
slot of the object:
``` r
# Accuracy in each fold
mydbcsp@out$folds_acc

# Instances belonging to each fold
mydbcsp@out$used_folds
```
Furthermore, it is clear that the optimal value of \(q\) should be chosen
based on the percentages of correct classification. It is worth
mentioning that the LDA is applied on the \(2q\) projections, as set in
the object building step. It is interesting to measure how many
dimensions would be enough using selectQ
function:
``` r
mydbcsp <- new('dbcsp', X1=x1, X2=x2, labels=c("C1", "C2"))
selectDim <- selectQ(mydbcsp, seed=30, CV=TRUE, fold = 10)
selectDim
```

```
   Q       acc         sd
1  1 0.7663889 0.12607868
2  2 0.9033333 0.09428818
3  3 0.8686111 0.11314534
4  5 0.8750000 0.13289537
5 10 0.8797222 0.09513230
6 15 0.8250000 0.05257433
```
Since the \(10\)-fold cross-validation approach is chosen, the mean accuracies as well as the corresponding standard deviations are returned. Thus, with Linear Discriminant Analysis (LDA), log-variances as characteristics, it seems that dimensions related to first and last \(q=2\) eigenvectors (\(2\times 2\) dimensions in total) are enough to obtain a good classification, with an accuracy of 90%. Nevertheless, it can also be observed that variation among folds can be relevant.
To visualize what is the representation in the reduced dimension space
function plot
can be used. For instance, to visualize the first unit
of the first class, based on projections along the 2 first and last
vectors (\(\mathbf{a}_1, \mathbf{a}_2\) and
\(\mathbf{b}_1, \mathbf{b}_2\)):
``` r
plot(mydbcsp, index=1, class=1, vectors=1:2)
```
In the top graphic of Figure 3, the representation of the first video of class \(C_1\) given by non standardized matrix \(X_{11}\) can be seen, where the horizontal axis represents the frames of the video and the lines are the positions of the body key points (50 lines). In the bottom graphic, the same video is represented in a reduced space where the video is represented by the new signals (only 4 lines).
To gain better insight into the discriminating power of the new signals in the reduced-dimension space, we can plot the corresponding log-variances of the new signals. Parameter vectors in function boxplot sets which eigenvectors are considered in the plot.
``` r
boxplot(mydbcsp, vectors=1:2)
```
In Figure 4 it can be seen that the variability of the projections on the first eigenvector direction (\(\log(VAR(X_{i k}'\mathbf{a}_1))\)) is high for elements in x1, but low for elements in x2. Analogously, projecting on the last dimension (\(\log(VAR(X_{i k}'\mathbf{b}_1))\)), the variability is low in x1 and high in x2. The same pattern holds when projecting on vectors \(\mathbf{a}_2\) and \(\mathbf{b}_2\).
Once the value of \(q\) has been decided and the accuracy of the
classification is known, the classifier should be built (through
train()
) so that the user can proceed to predict the class a new
action held in a video belongs to, using the function predict
. For
instance, with only illustrative purpose, we can classify the first 5
videos which are stored in x1
.
``` r
mydbcsp <- train(mydbcsp, selected_q=2, verbose=FALSE)
xtest <- x1[1:5]
outpred <- predict(mydbcsp, X_test=xtest)
```
If the labels of the testing items are known, the latter function returns the accuracy.
``` r
outpred <- predict(mydbcsp, X_test=xtest, true_targets= rep("C1", 5))
```
Finally, notice that the user could use any other distance instead of
the Euclidean between the signals to compute the important directions
\(\mathbf{a}_j\) and \(\mathbf{b}_j\). For instance, in this case it could
be appropriate to use the Dynamic Time Warping distance, setting so in
the argument type="dtw"
:
``` r
# Distance DTW
mydbcsp.dtw <- new('dbcsp', X1=x1, X2=x2, labels=c("C1", "C2"), type="dtw")
```
In the previous section a basic workflow to use functions implemented in
dbcsp is presented. Nevertheless, it is straightforward to extend the
procedure. Once the interesting directions in \(W\) are calculated through
dbcsp
, other summarizing characteristics beyond the variance could be
extracted from the projected signals, as well as other classifiers which
could be used in the classification step. For those purposes, dbcsp
is
used to compute the directions in \(W\) that will be the base to calculate
other features as well as the input features for other classifiers. Here
it is shown how, once the eigenvectors are extracted from an object
dbcsp
, several characteristics could be extracted from the signals and
a new data.frame
can be built so that any other classification
technique could be applied. In this example we work with the caret package to apply different classifiers. It is important to pay attention to which instances form the train and test sets, so that the vectors are computed based only on training set instances.
``` r
# Establish training and test data
n1 <- length(x1)
trainind1 <- rep(TRUE, n1)
n2 <- length(x2)
trainind2 <- rep(TRUE, n2)
set.seed(19)
trainind1[sample(1:n1, 10, replace=FALSE)] <- FALSE
trainind2[sample(1:n2, 10, replace=FALSE)] <- FALSE
x1train <- x1[trainind1]
x2train <- x2[trainind2]

# Extract the interesting directions
vectors <- new('dbcsp', X1=x1train, X2=x2train, q=5,
               labels=c("C1", "C2"))@out$vectors

# Function to calculate the desired characteristics from signals
calc_info <- function(proj_X, type){
  values <- switch(type,
    'var' = plyr::laply(proj_X, function(x){apply(x,1,var)}),
    'max' = plyr::laply(proj_X, function(x){apply(x,1,max)}),
    'min' = plyr::laply(proj_X, function(x){apply(x,1,min)}),
    'iqr' = plyr::laply(proj_X, function(x){
      apply(x,1,function(y){
        q <- quantile(y, probs = c(0.25, 0.75))
        q[2] - q[1]
      })
    })
  )
  return(values)
}
```
By means of this latter function, besides the variance of the new signals, the maximum, the minimum, and the interquartile range can be extracted.
Next, imagine we want to perform our classification step with the interquartile range information along with the log-variance.
``` r
# Project units of class C1
projected_x1 <- plyr::llply(x1, function(x,W) t(W)%*%x, W=vectors)

# Extract the characteristics
logvar_x1 <- log(calc_info(projected_x1,'var'))
iqr_x1 <- calc_info(projected_x1,'iqr')
new_x1 <- data.frame(logvar=logvar_x1, iqr=iqr_x1)

# Similarly for units of class C2
projected_x2 <- plyr::llply(x2, function(x,W) t(W)%*%x, W=vectors)
logvar_x2 <- log(calc_info(projected_x2,'var'))
iqr_x2 <- calc_info(projected_x2,'iqr')
new_x2 <- data.frame(logvar=logvar_x2, iqr=iqr_x2)

# Create dataset for classification
labels <- rep(c('C1','C2'), times=c(n1,n2))
new_data <- rbind(new_x1,new_x2)
new_data$label <- factor(labels)
new_data_train <- new_data[c(trainind1, trainind2), ]
new_data_test <- new_data[!c(trainind1, trainind2), ]
```
``` r
# Random forest
trControl <- caret::trainControl(method = "none")
rf_default <- caret::train(label~.,
                           data = new_data_train,
                           method = "rf",
                           metric = "Accuracy",
                           trControl = trControl)
rf_default

# K-NN
knn_default <- caret::train(label~.,
                            data = new_data_train,
                            method = "knn",
                            metric = "Accuracy",
                            trControl = trControl)
knn_default

# Predictions and accuracies on test data
# Based on random forest classifier
pred_labels <- predict(rf_default, new_data_test)
predictions_rf <- caret::confusionMatrix(table(pred_labels, new_data_test$label))
predictions_rf

# Based on knn classifier
pred_labels <- predict(knn_default, new_data_test)
predictions_knn <- caret::confusionMatrix(table(pred_labels, new_data_test$label))
predictions_knn
```
Thus, the results and objects that dbcsp builds can easily be combined with other R packages and functions. This is interesting for more advanced users who wish to perform their own customized analyses.
In this work a new Distance-Based Common Spatial Patterns method is introduced. It reduces to the classical Common Spatial Patterns when the Euclidean distance between signals is considered, but it can be extended to any other appropriate distance between signals as well. All of this is included in the dbcsp package. The package is easy to use for non-specialised users but, for the sake of flexibility, more advanced analyses can be carried out combining the created object and the obtained results with already well-known R packages, such as caret, for instance.
This research was partially supported: IR by The Spanish Ministry of Science, Innovation and Universities (FPU18/04737 predoctoral grant). II by the Spanish Ministerio de Economia y Competitividad (RTI2018-093337-B-I00; PID2019-106942RB-C31). CA by the Spanish Ministerio de Economia y Competitividad (RTI2018-093337-B-I00, RTI2018-100968-B-I00) and by Grant 2017SGR622 (GRBIO) from the Departament d’Economia i Coneixement de la Generalitat de Catalunya. BS II by the Spanish Ministerio de Economia y Competitividad (RTI2018-093337-B-I00).
II and CA designed the study. IR and II wrote and debugged the software. IR, II and CA checked the software. II, CA, IR and BS wrote and reviewed the manuscript. All authors have read and approved the final manuscript.
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Rodríguez, et al., "dbcsp: User-friendly R package for Distance-Based Common Spatial Patterns", The R Journal, 2022
BibTeX citation
@article{RJ-2022-044, author = {Rodríguez, Itsaso and Irigoien, Itziar and Sierra, Basilio and Arenas, Concepción}, title = {dbcsp: User-friendly R package for Distance-Based Common Spatial Patterns}, journal = {The R Journal}, year = {2022}, note = {https://rjournal.github.io/}, volume = {14}, issue = {3}, issn = {2073-4859}, pages = {80-94} }