In this work we present the *wavScalogram* R package, which contains methods based on wavelet scalograms for time series analysis. These methods are related to two main wavelet tools: the windowed scalogram difference and the scale index. The windowed scalogram difference compares two time series, identifying if their scalograms follow similar patterns at different scales and times, and it is thus a useful complement to other comparison tools such as the squared wavelet coherence. On the other hand, the scale index provides a numerical estimation of the degree of non-periodicity of a time series and it is widely used in many scientific areas.

Since the works of Mallat (2008) and Daubechies (1992), wavelet analysis has become, in the last few decades, a standard tool in the field of time series analysis. Its ability to simultaneously analyze a signal in frequency space (scales) and in time, allows it to overcome many of the limitations that Fourier analysis presents for non-stationary time series. Furthermore, the algorithms for calculating the different wavelet transforms are characterized by their speed and ease of implementation.

There are currently many software packages that implement functions for
wavelet analysis of time series (MATLAB’s Wavelet Toolbox, Wavelab,
etc.), and in recent years, the exponential growth of the R ecosystem
has not been outside the field of wavelet analysis. Within CRAN there
are many packages related to wavelet analysis for time series.
Specifically, as collected in the
*TimeSeries* Task View,
the *wavelets* package
(Aldrich 2020) , the
*WaveletComp* and
*biwavelet* packages
(Roesch and Schmidbauer 2018; Gouhier et al. 2021) , the
*mvLSW* package (Taylor et al. 2019) and
other packages such as
*hwwntest* (Savchev and Nason 2018),
*rwt* (Roebuck and Rice University’s DSP group 2022),
*waveslim* (Whitcher 2020)
and *wavethresh*
(Nason 2022).

In this work we will describe in depth the
*wavScalogram*
package (Bolós and Benı́tez 2021) (also mentioned in the
*TimeSeries* Task View).
In this package, methods based on the wavelet scalogram are introduced
as defined in Benı́tez et al. (2010; Bolós et al. 2017, 2020). These methods are basically
related to two main wavelet tools: the *windowed scalogram difference*
and the *scale index*. The first one, the windowed scalogram difference,
was introduced in Bolós et al. (2017). It allows to compare two time series at
different scales and times, determining if their scalograms follow
similar patterns. In this sense, it is a complement to other wavelet
tools for comparing time series such as the squared wavelet coherence
and the phase difference, since there are certain differences in time
series that these measurements are not capable of detecting while the
windowed scalogram difference can. The second tool is the scale index
introduced in Benı́tez et al. (2010). It focuses on the analysis of the non-periodicity
of a signal, giving a numerical measure of its degree of
non-periodicity, taking the value \(0\) if the signal is periodic and a
value close to \(1\) if the signal is totally aperiodic (for example, a
purely stochastic signal). The scale index has been used in many
scientific areas, being the evaluation of the quality of pseudo-random
number generators the area where it has been used the most. In addition,
the scale index also has a *“windowed”* version, in which the windowed
scalogram is used to calculate the scale index instead, allowing to
measure the evolution of the scale index over time, which is useful in
the case of non-stationary time series (see Bolós et al. (2020)).

The article is organized as follows: In the next section, we describe
the basics of the wavelet analysis and how to use them in the
*wavScalogram* package. Then, a description of the wavelet scalogram and
its implementation is given. The following sections are devoted to the
windowed scalogram difference and the scale index, in its original and
windowed versions. Finally we illustrate the use of the package with
some examples in applied problems, such as the analysis of time series
of sunspots or the use of the windowed scalogram difference in the
clustering of time series, particularly the interest rate series of
sovereign bonds.

A wavelet (or mother wavelet) is a function \(\psi \in L^2\left( \mathbb{R}\right)\) with zero average (i.e. \(\int _{\mathbb{R}} \psi =0\)), unit energy (\(\| \psi \| =1\), i.e. normalized) and centered in the neighborhood of \(t=0\) (Mallat 2008). There exists a wide variety of wavelets but in this package we use the following, described in Torrence and Compo (1998) (see Figure 1):

- Morlet: \[\psi _{\textrm{Morlet}}(t)=\pi^{-1/4}e^{\mathrm{i}\omega_0 t}e^{-t^2/2}.\] It is a plane wave modulated by a Gaussian, where the positive parameter \(\omega _0\) denotes the central dimensionless frequency. According to Farge (1992), the wavelet function must fulfil an admissibility condition, which for the Morlet wavelet is only accomplished if some correction factors are added. We take as default value \(\omega _0=6\), for which those correction factors are negligible. Nevertheless, other choices of this parameter can be considered.
- Paul: \[\psi _{\textrm{Paul}}(t)=\frac{\left(2\mathrm{i}\right)^m m!}{\sqrt{\pi\left(2m\right)!}}\left( 1-\mathrm{i}t\right)^{-\left( m+1\right)},\] where \(m\) is a positive integer parameter representing the order. By default, \(m=4\).
- Derivative of a Gaussian (DoG): \[\psi _{\textrm{DoG}}(t)=\frac{\left(-1\right)^{m+1}}{\sqrt{\Gamma\left( m+\frac{1}{2}\right)}}\frac{\mathrm{d}^m}{\mathrm{d}t^m}\left( e^{-t^2/2}\right) ,\] where \(m\) is a positive integer parameter representing the derivative. By default, \(m=2\), that coincides with the Marr or Mexican hat wavelet.

Moreover, we have added:

- Haar, centered at \(0\): \[\psi _{\textrm{Haar}}(t)=\left\{ \begin{array}{rl} 1 & \textrm{if }\,-\frac{1}{2}\leq t <0, \\ -1 & \textrm{if }\,0\leq t <\frac{1}{2}, \\ 0 & \textrm{otherwise}. \end{array} \right.\] This is the simplest wavelet, but it is not continuous.

Scaling a wavelet \(\psi\) by \(s>0\) and translating it by \(u\), we create a family of unit energy “time-frequency atoms”, called daughter wavelets, \(\psi _{u,s}\), as follows \[\label{eq:psius} \psi_{u,s}(t)=\frac{1}{\sqrt{s}}\psi \left( \frac{t-u}{s}\right) . \tag{1}\]

**Remark 1** (Fourier factor). Usually, the Fourier wavelength of a
daughter wavelet does not coincide with its scale \(s\). Nevertheless,
they are proportional, and this proportionality factor for converting
scales into Fourier periods is called Fourier factor. This Fourier
factor is taken \(4\pi/\left( \omega _0+\sqrt{2+\omega_0^2}\right)\),
\(4\pi/\left( 2m+1\right)\) and \(2\pi / \sqrt{m+1/2}\) for Morlet, Paul and
DoG wavelets respectively (Torrence and Compo 1998). For the default parameter values,
the Fourier factor is approximately \(1.033\), \(1.3963\), and \(3.9738\)
respectively. For Haar wavelet, the Fourier factor is \(1\).

Given a function \(f\in L^2\left( \mathbb{R}\right)\), that we will identify with a signal or time series, the continuous wavelet transform (CWT) of \(f\) at time \(u\) and scale \(s>0\) is defined as \[\label{eq:cwt} \mathcal{W}f\left( u,s\right) =\int _{-\infty}^{+\infty}f(t)\psi_{u,s}^* (t) \, \textrm{d}t, \tag{2}\] where \(^*\) denotes the complex conjugate. The CWT allows us to obtain the frequency components (or details) of \(f\) corresponding to scale \(s\) and time location \(u\).

In practical situations, however, it is common to deal with finite signals. That is, given a time signal \(x\), and a finite time interval \([0,T]\), we shall consider the finite sequence \(x_n=x\left(t_n\right)\), for \(n=0,\ldots,N\). Here, \(t_0,\ldots,t_{N}\) is a discretization of the interval \([0,T]\), i.e. \(t_n = nh\), being \(h = T/N\) the time step. According to (1) and (2), the CWT of \(x\) at scale \(s>0\) is defined as the sequence \[\label{eq:wxns} \mathcal{W}x_n(s)=h\sum _{i=0}^{N} x_i \psi _{n,s}^*\left( t_i\right) , \tag{3}\] where \(n=0,\ldots,N\) and \[\label{eq:psins} \psi_{n,s}(t)=\frac{1}{\sqrt{s}}\psi \left( \frac{t-t_n}{s}\right) . \tag{4}\] Note that \(\psi_{n,s}(t)\) is in fact \(\psi_{t_n,s}(t)\), but this abuse of notation between (1) and (4) is assumed for the sake of readability. Using Fourier transform tools, one can calculate (3) for all \(n=0,\ldots N\) simultaneously and efficiently (Torrence and Compo 1998).

**Remark 2** (**Energy density**). It is known that the CWT coefficients
are biased in favour of large scales (Liu et al. 2007). Nevertheless, if the
mother and daughter wavelets are normalized by the \(L^1\)-norm (as in the
*Rwave* package, by Carmona and Torresani (2021))
instead of the \(L^2\)-norm (as in our package), this bias is not produced
. Hence, to rectify the bias, the CWT in (2) and
(3) can be multiplied by the factor \(\frac{1}{\sqrt{s}}\).
This rectification will be specially useful in some wavelet tools of our
package that quantify the “energy density” of a signal, such as the
wavelet power spectrum, the scalograms and the windowed scalogram
difference. On the other hand, in the case of the scale index, this
correction will not be advisable (see Remark 7). Usually,
the wavelet tools of our package have a logical parameter called
`energy_density`

that switches this correction.

For computing the CWT of a time series \(x\) at a given set of scales, we
use `cwt_wst`

. For example,

```
# install.packages("wavScalogram")
library(wavScalogram)
<- 0.1
h <- 1000
N <- seq(from = 0, to = N * h, by = h)
time <- sin(pi * time)
signal <- seq(from = 0.5, to = 4, by = 0.05)
scales <- cwt_wst(signal = signal, dt = h,
cwt scales = scales, powerscales = FALSE,
wname = "DOG", wparam = 6)
```

computes the CWT of `signal`

at `scales`

from \(s_a=0.5\) to \(s_b=4\) using
DoG wavelet with \(m=6\). The parameter `wname`

indicates the wavelet
used, and it can be `"MORLET"`

(default value), `"PAUL"`

, `"DOG"`

,
`"HAAR"`

or `"HAAR2"`

. The difference between these two last values is
that `"HAAR2"`

provides a more accurate but slower algorithm than the
one provided by `"HAAR"`

. Moreover, we can specify by means of `wparam`

the value of the parameters \(\omega_0\) or \(m\). As it has been stated
before, the default values of these parameters are \(\omega_0=6\) for
Morlet wavelet, \(m=4\) for Paul wavelet and \(m=2\) for DoG wavelet.

If the set of scales is a base \(2\) power scales set (Torrence and Compo 1998), the
parameter `scales`

can be a vector of three elements with the lowest
scale \(s_a\), the highest scale \(s_b\) and the number of suboctaves per
octave. This vector is internally passed to function `pow2scales`

that
returns the constructed base \(2\) power scales set. For example,

```
<- c(0.5, 4, 16)
scales <- cwt_wst(signal = signal, dt = h, scales = scales, powerscales = TRUE) cwt
```

computes the CWT of `signal`

at scales from \(s_a=0.5\) to \(s_b=4\), with
\(16\) suboctaves per octave. Since parameter `powerscales`

is `TRUE`

by
default, it is not necessary to specify it in the function call. If
`scales = NULL`

(default value), then the function constructs the scales
set automatically: \(s_a\) is chosen so that its equivalent Fourier period
is \(2h\) (Torrence and Compo 1998), and \(s_b=Nh/2r_w\), where \(r_w\) is the corresponding
*wavelet radius*. Note that in this case \(s_b\) maximizes the length of
the scales interval taking into account the *cone of influence*. The
wavelet radius and the cone of influence are defined and discussed in
Remark 4.

The output `cwt`

is a list containing the following fields:

`coefs`

is an \((N+1)\times\)`length(scales)`

array (either real or complex depending on the wavelet used) containing the corresponding CWT coefficients, i.e.`cwt$coefs[i,j]`

is the CWT coefficient at the i-th time and j-th scale.`scales`

is the vector of scales used, either provided by the user or constructed by the function itself.`fourier_factor`

is the scalar used to transform scales into Fourier periods (see Remark 1).`coi_maxscale`

is a numeric vector of size \(N+1\) that defines the*cone of influence*(see Remark 4).

**Remark 3** (Border effects). In (3) (or (2) for
finite length signals) there appear border effects (or edge effects)
when the support of the daughter wavelets is not entirely contained in
the time domain \(\left[ t_0,t_N\right]\). In order to try to mitigate
border effects, we can construct from the original time series \(x\) an
infinite time series \(\bar{x}\) on \(t_i=t_0+ih\) for \(i\in \mathbb{Z}\) and
then we define
\[\label{eq:wbarxns}
\bar{\mathcal{W}}x_n(s)=\mathcal{W}\bar{x}_n(s)=h\sum _{i\in \mathbb{Z}} \bar{x}_i \psi _{n,s}^*\left( t_i\right) , \tag{5}\]
where \(n=0,\ldots ,N\). The most usual ways to construct \(\bar{x}\) are
the following:

- Padding time series with zeroes: \(\bar{x}_i=x_i\) if \(i\in \left\{ 0,\ldots ,N\right\}\), and \(\bar{x}_i=0\) otherwise. In this case, (3) and (5) are equivalent, having \(\bar{\mathcal{W}}x_n(s)=\mathcal{W}x_n(s)\).
- Using a periodization of the original time series: \(\bar{x}_i=x_{i \,\textrm{mod}\,(N+1)}\).
- Using a symmetric catenation of the original time series: \(\bar{x}_i=x_{i \,\textrm{mod}\,(N+1)}\) if \(\lfloor \frac{i}{N+1}\rfloor\) is even, and \(\bar{x}_i=x_{(N-i) \,\textrm{mod}\,(N+1)}\) if \(\lfloor \frac{i}{N+1}\rfloor\) is odd.

Depending on the nature of \(x\), it may be preferable to use one construction or another for minimizing the undesirable border effects (see Figure 2). For example, a periodization is advised for stationary short time series, and symmetric catenation for non-stationary short time series. On the other hand, for long time series, border effects are less important and then we can just pad with zeroes, i.e. use the original CWT given in (3).

How the border effects are treated by function `cwt_wst`

is determined
via the `border_effects`

parameter. Possible values for this parameter
are `"BE"`

(raw border effects, padding with zeroes), `"PER"`

(periodization) and `"SYM"`

(symmetric catenation), corresponding to the
three options described above.

**Remark 4** (Cone of influence). The cone of influence (CoI) is defined
by the scales for which border effects become important at each time.
The field `coi_maxscale`

of the output in `cwt_wst`

function contains,
for each time, the maximum scale at which border effects are negligible,
and consequently determines the CoI.

In order to compute the CoI, we have to set a criterion to distinguish between relevant and negligible border effects. In Torrence and Compo (1998), the CoI is defined by the \(e\)-folding time for the autocorrelation of wavelet power spectrum (see the next section) at each scale \(s\), and this \(e\)-folding time is chosen so that the wavelet power spectrum for a discontinuity at the edge drops by a factor \(e^{-2}\). For Morlet and DoG wavelets, this \(e\)-folding time is \(\sqrt{2}s\), and for Paul wavelets is \(s/\sqrt{2}\).

For wavelets with symmetric modulus such as Morlet, Paul and DoG, the \(e\)-folding time at \(s=1\) is interpreted as a wavelet radius \(r_w\) that defines an effective support \(\left[ -r_w,r_w\right]\) for the mother wavelet. Therefore, the CoI is given by the scales from which the corresponding effective supports of the daughter wavelets \(\left[ u-sr_w,u+sr_w\right]\) are not entirely contained in the time domain.

The wavelet radius \(r_w\) determines the CoI in the different functions
of this package by means of the parameter `waverad`

. If it is `NULL`

(default value) we consider \(r_w=\sqrt{2}\) for Morlet and DoG wavelets,
and \(r_w=1/\sqrt{2}\) for Paul wavelets, following Torrence and Compo (1998). On the other
hand, we take \(r_w=0.5\) for Haar wavelet, i.e. we assume that its
effective support is in fact its support. Nevertheless we can introduce
a custom \(r_w\) for any wavelet, allowing us in this way to adjust the
importance of border effects in the construction of the CoI. For
example,

```
<- cwt_wst(signal = signal, dt = h,
cwt wname = "DOG", wparam = 6, waverad = 2)
```

computes the CWT coefficients of `signal`

for DoG wavelet with \(m=6\).
Here, `cwt$coi_maxscale`

is obtained assuming that the wavelet radius is
\(r_w=2\). Note that the value of `waverad`

does not affect the
computation of the CWT coefficients.

The wavelet power spectrum of a signal \(f\in L^2\left( \mathbb{R}\right)\) at time \(u\) and scale \(s>0\) is defined as \[\label{eq:wpsfus} \mathcal{WPS}f\left( u,s\right) = |\mathcal{W}f\left( u,s\right) |^2. \tag{6}\] Analogously to (6), the wavelet power spectrum of a time series \(x\) at scale \(s>0\) is given by the sequence \[\label{eq:wpsxns} \mathcal{WPS}x_n(s) = |\mathcal{W}x_n(s) |^2, \tag{7}\] where \(n=0,\ldots,N\).

We can plot the wavelet power spectrum of a time series \(x\) at a given
set of scales through function `cwt_wst`

if the parameter `makefigure`

is `TRUE`

(default value). There are other parameters regarding this
plot:

`time_values`

is a vector that provides customized values in the time axis.`energy_density`

is a logical parameter. If it is`TRUE`

, it is plotted the wavelet power spectrum divided by the scales, according to Remark 2. By default, it is`FALSE`

.`figureperiod`

is a logical parameter that indicates if they are represented periods or scales in the \(y\)-axis (see Remark 1). By default, it is`TRUE`

.`xlab, ylab, main, zlim`

are parameters to customize the figure.

For example,

```
<- 1 / 12
h <- seq(from = 1920, to = 1970 - h, by = h)
time1 <- seq(from = 1970, to = 2020, by = h)
time2 <- c(sin(pi * time1), sin(pi * time2 / 2))
signal <- cwt_wst(signal = signal, dt = h, time_values = c(time1, time2))
cwt_a <- cwt_wst(signal = signal, dt = h, time_values = c(time1, time2),
cwt_b energy_density = TRUE)
```

plots Figure 3 (a) and (b) respectively. In this figure it
is shown how the wavelet power spectrum is biased in favour of large
scales, as it is pointed out in Remark 2. The parameter
`energy_density`

corrects it and so, values for different scales become
comparable. Note that `energy_density`

only affects the plot and hence,
`cwt_a`

is identical to `cwt_b`

.

The scalogram of \(f\) at scale \(s\) is defined as \[\label{eq:sfs} \mathcal{S}f(s)= \left( \int _{-\infty } ^{+\infty } | \mathcal{W}f\left( u,s\right) | ^2 \, \textrm{d}u \right) ^{1/2} . \tag{8}\] It gives the contribution of each scale to the total “energy” of the signal and so, the notion of scalogram here is analogous to the spectrum of the Fourier transform. It is important to note that the term “scalogram” is often used to refer the wavelet power spectrum, but in this package, we call “scalogram” to (8).

If \(f\) is a finite length signal with time domain \(I=\left[ a,b\right]\), it is usual to consider a normalized version of the scalogram for comparison purposes, given by \[\label{eq:sfsn} \mathcal{S}f(s)= \left( \frac{1}{b-a}\int _{a}^{b} | \mathcal{W}f\left( u,s\right) | ^2 \, \textrm{d}u \right) ^{1/2} . \tag{9}\] Hence, according to (9), the (normalized) scalogram of \(x\) at scale \(s\) is given by \[\label{eq:sxsn} \mathcal{S}x(s) = \left( \frac{1}{N+1}\sum _{i=0}^{N} | \mathcal{W}x_i(s) | ^2 \right) ^{1/2}. \tag{10}\] The normalization coefficient \(1/(N+1)\) in (10) allows us to compare scalograms of time series with different lengths.

We can compute the (normalized) scalograms of a time series \(x\) at a
given set of scales by means of function `scalogram`

. This function
follows the same rules as `cwt_wst`

regarding the data entry,
construction of scales, choice of the wavelet, border effects and
application of Fourier factor. So, parameters `signal`

, `dt`

, `scales`

,
`powerscales`

, `wname`

, `wparam`

, `waverad`

, `border_effects`

,
`makefigure`

, `figureperiod`

, `xlab`

, `ylab`

and `main`

are analogous to
those in function `cwt_wst`

with same default values. On the other hand,
parameter `energy_density`

is `TRUE`

by default. For example,

```
<- scalogram(signal = signal, dt = h,
sc_a energy_density = FALSE)
```

computes the (normalized) scalogram of `signal`

given by (10)
at a base \(2\) power scales set constructed automatically and plots
Figure 4 (a). The output `sc_a`

is a list with the following
fields:

`scalog`

is a vector of length`length(scales)`

with the values of the (normalized) scalogram at each scale.`energy`

is the total energy of`scalog`

(i.e. its \(L^2\)-norm) if parameter`energy_density`

is`TRUE`

.`scales`

and`fourier_factor`

are analogous to those in the output of function`cwt_wst`

.