This paper is an introduction to the new R package *smoots* (smoothing time series), developed for data-driven local polynomial smoothing of trend-stationary time series. Functions for data-driven estimation of the first and second derivatives of the trend are also built in. The package is first applied to monthly changes of the global temperature. The quarterly US-GDP series shows that this package can also be applied well to a semiparametric multiplicative component model for non-negative time series via the log-transformation. Furthermore, we introduce a semiparametric Log-GARCH and a semiparametric Log-ACD model, which can be easily estimated with the *smoots* package. Of course, this package applies to suitable time series from any other research area. The *smoots* package also provides a useful tool for teaching time series analysis, because many practical time series follow an additive or a multiplicative component model.

This paper provides an introduction to a new package in R called
*smoots* (version 1.0.1, Feng and Schulz 2019) for data-driven local
polynomial smoothing of trend-stationary time series. This package is
developed based on the R code for the practical implementation of the
proposals in Feng et al. (2020). A detailed discussion of the methodology
and the development of the algorithms may be found in that work. Here,
the main algorithm is a fully data-driven IPI (iterative plug-in, Gasser et al. 1991) approach for estimating the trend under stationary
time series errors, where the variance factor in the asymptotically
optimal bandwidth is estimated by another IPI-approach for a lag-window
estimator of the spectral density following Bühlmann (1996).
Numerous sub-algorithms determined by different options, such as the
choice of the order of local polynomial, the choice of the kernel
weighting functions and the choice of the so-called inflation factors
for estimating the bias in the asymptotically optimal bandwidth, are
included. Further functions for data-driven estimation of the first and
second derivatives of the trend are also developed by adapting the idea
of Feng (2007). A related proposal may be found in
Francisco-Fernández et al. (2004). The algorithms included in the *smoots* package
differ from their proposal in several ways. (1) The IPI-algorithm is
usually superior to the DPI (direct plug-in; see Beran et al. 2009 for a detailed
discussion in models with i.i.d. errors). (2) In Francisco-Fernández et al. (2004) the
variance factor is estimated under an AR(1) model, which is usually a
misspecification. (3) Data-driven estimation of the derivatives is also
included in the current package. Moreover, the user can obtain any
estimate with fixed bandwidths chosen beforehand. A function for kernel
smoothing is also built-in for comparison.

The *smoots* package complements the Comprehensive R Archive Network
(CRAN), as already existing base R functions or CRAN packages for local
polynomial smoothing either do not implement an automated bandwidth
selection or, if such bandwidth algorithms do exist, they are only
suitable for data with i.i.d. (independently and identically
distributed) errors, which is an assumption that is often violated for
time series. In more detail, notwithstanding that the
*stats* package offers the functions `lowess()` and `loess()` for computing the local polynomial
estimates of the regression function given one or multiple predictor
variables, a smoothing bandwidth has to be selected arbitrarily in both
cases. Similar functionality is provided by the
*locpol*
(Ojeda Cabrera 2018),
*KernSmooth*
(Wand 2021) and
*lokern*
(Herrmann and Maechler 2021) packages; these, however, additionally include
derivative estimation approaches and various bandwidth selection algorithms
for the regression function, such as the leave-one-out estimator (Allen 1974)
and the plug-in algorithm by Ruppert et al. (1995).
Moreover, within *lokern*, the bandwidth for estimating the first or
second derivative of the regression function can be selected
automatically. Nevertheless, these traditional data-driven bandwidth
selection approaches are only suitable in case of data with i.i.d.
errors (Altman 1990; Hart 1991; Opsomer 1997)
and can therefore often not be considered for time series. Explicit CRAN
packages for detrending a time series nonparametrically are
*rmaf* (Qiu 2015) and
*mFilter*
(Balcilar 2019). With *rmaf*, a fully data-driven refined
moving average and a cubic smoothing spline, which requires a manually
selected smoothness parameter, can be applied. With the filters
implemented in the *mFilter* package, such as the Baxter-King
(Baxter and King 1999) and Hodrick-Prescott (Hodrick and Prescott 1997) filters
among others, the trend as well as the cyclical component of a time series
can be extracted. However, none of the filtering functions in the *mFilter*
package is fully data-driven. Instead, they make use of default values
recommended in the scientific literature. Thus, to our knowledge, there
are no packages on CRAN that implement a data-driven local
polynomial regression for the trend of a time series or its derivatives.

The methodological and computational background for the *smoots* package
is summarized briefly hereinafter. For further details we refer the
reader to Feng et al. (2020) and references therein. Our focus is to
illustrate the wide application spectrum of the *smoots* package for
semiparametric modeling of time series. We propose to estimate the
nonparametric trend using the current package in the first stage and to
fit, e.g., an ARMA model to the residuals with the standard `arima()` function of the *stats* package in R. This idea is first shown with
monthly Northern Hemisphere temperature changes data obtained from the
website of the National Aeronautics and Space Administration (NASA).
Then the proposal is applied to the log-transformation of the quarterly
US-GDP data, retrieved from the website of the Federal Reserve Bank of
St. Louis, using a semiparametric log-local-linear growth model.
Moreover, two new semiparametric models for financial time series are
proposed to show further applications of this package. Firstly, a
Semi-Log-GARCH model is defined by introducing a scale function into the
Log-GARCH (logarithmic GARCH) model
(Geweke 1986; Pantula 1986; Milhøj 1987). The latter is a special
extension of the seminal ARCH (autoregressive conditional
heteroskedasticity, Engle 1982) and GARCH (generalized ARCH, Bollerslev 1986) models and is just a slightly restricted ARMA
model for the log-transformation of the squared returns (see e.g. Sucarrat 2019). The usefulness of the Log-GARCH has recently been
rediscovered by Francq et al. (2013). The application of this proposal is
illustrated by the DAX returns collected from Yahoo Finance. Secondly, a
Semi-Log-ACD model is proposed as an extension of the Type I Log-ACD
(Bauwens and Giot 2000), which is closely related to the Semi-Log-GARCH. Like
the ACD (autoregressive conditional duration, Engle and Russell 1998) model, the
Semi-Log-ACD can be applied to different non-negative financial data,
such as realized volatilities and trading volumes. In this paper, this
new model is applied to the CBOE Volatility Index (VIX) obtained from
Yahoo Finance. Datasets for all of those examples are built into the
proposed package. The *smoots* package can also be applied to suitable time
series from other research areas. In addition, the *smoots* package
provides a useful tool for teaching time series analysis, which helps a
lecturer to obtain automatically detrended real data examples to show
the application of parametric time series models.

The methods and IPI-algorithms are summarized in the following two sections. The general usage of the R functions is then described in another section. Three further sections illustrate the applications of those functions in simple cases and with respect to the Semi-Log-GARCH and the Semi-Log-ACD.

The main algorithm of the *smoots* package is a fully automatic
non-parametric procedure for estimating a deterministic trend in an
additive time series model with observations \(y_{t}\), \(t=1, ..., n\). The
data under consideration can for example be from environmental
statistics, economics or finance. The basic nonparametric time series
model is defined by
\[\label{eq:Semi-MTS}
y_t= m\left(x_t\right) + \xi_t, \tag{1}\]
where \(x_t=t/n\) denotes the rescaled time, \(m\) is a smooth function and
\(\xi_t\) is a zero mean stationary process. This model defines a
nonparametric approach for trend-stationary time series. Let
\(\gamma_{\xi}\left(\tau\right)\) denote the acf (autocovariances) of
\(\xi_t\). Under the regularity conditions given in Bühlmann (1996),
a data-driven IPI-algorithm for estimating the variance factor in the
asymptotically optimal bandwidth can be developed. As indicated by
Feng et al. (2020), the required conditions are fulfilled by an ARMA
process with finite eighth moment. However, models with long-memory
errors are excluded by those assumptions.
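
To make the setting concrete, model (1) can be mimicked in a few lines of R; the particular trend function, the ARMA(1, 1) error specification and all parameter values below are arbitrary choices for illustration only.

```
# Minimal simulation of model (1): a smooth trend plus stationary errors.
# The trend m() and all parameter values are arbitrary illustrative choices.
set.seed(123)
n  <- 500
xt <- (1:n) / n                                    # rescaled time x_t = t / n
m  <- 2 * xt + 0.5 * sin(2 * pi * xt)              # some smooth trend m(x_t)
xi <- arima.sim(list(ar = 0.5, ma = 0.3), n = n, sd = 0.25)  # stationary ARMA(1, 1) errors
yt <- m + as.numeric(xi)                           # observations y_t = m(x_t) + xi_t
```

This simulated series `yt` will be reused in some of the illustrative code sketches below.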

The local polynomial estimator of \(m^{\left(\nu\right)}\left(x\right)\),
the \(\nu\)-th derivative of \(m\), is obtained by minimizing
\[\label{eq:LWSformula}
Q = \sum_{t=1}^{n}\left\{y_{t} - \sum_{j=0}^{p}b_{j}\left(x\right)\left(x_{t} - x\right)^{j}\right\}^{2}W\left(\frac{x_{t} - x}{h}\right), \tag{2}\]
where \(h\) is the bandwidth, \(W\) is a second order kernel function with
compact support \(\left[-1, 1\right]\) and is used here as the weighting
function, and \(p\) is the order of polynomial. It is assumed that \(p-\nu\)
is odd. We have
\(\hat m^{\left(\nu\right)}\left(x\right)=\nu!\hat b_\nu\left(x\right)\),
which is asymptotically equivalent to some kernel regression with a
\(k\)-th order kernel \(K\left(u\right)\) and automatic boundary correction,
where \(k=p+1\). In this paper, the following weighting functions are
considered:
\[\label{eq:kernelfunc}
W\left(u\right) = C_{\mu}\left(1-u^{2}\right)^{\mu}\mathbb{I}_{\left[-1,1\right]}\left(u\right), \tag{3}\]
where \(C_{\mu}\) is a standardization constant, which is in fact irrelevant
for calculating the weights in (2), and \(\mu = 0, 1, \ldots\) is a smoothness
parameter. For automatic bandwidth selection using
the *smoots* package, only \(\mu = 0, 1, 2\) and \(3\) are allowed,
corresponding to the use of the uniform, Epanechnikov, bisquare and
triweight kernels, respectively.
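
To illustrate (2) and (3), the following sketch computes \(\hat m^{\left(\nu\right)}\left(x\right)\) at a single point by direct weighted least squares. It is a didactic, unoptimized implementation with a hand-picked bandwidth and is not the routine used inside the package.

```
# Direct local polynomial fit at a single point x, following Eqs. (2) and (3);
# a didactic sketch with a fixed bandwidth, not the package's internal routine
locpoly.at <- function(y, x, nu = 0, p = 1, h = 0.1, mu = 1) {
  xt <- seq_along(y) / length(y)                  # rescaled time
  u  <- (xt - x) / h
  w  <- (1 - u^2)^mu * (abs(u) <= 1)              # kernel weights, Eq. (3); C_mu cancels
  X  <- outer(xt - x, 0:p, "^")                   # columns (x_t - x)^j, j = 0, ..., p
  b  <- lm(y ~ X - 1, weights = w)$coefficients   # minimizes Q in Eq. (2)
  factorial(nu) * b[nu + 1]                       # m^(nu)(x) = nu! * b_nu(x)
}

locpoly.at(yt, x = 0.5, nu = 0, p = 1, h = 0.1)   # trend estimate at x = 0.5
```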

An IPI-algorithm is developed based on the asymptotically optimal bandwidth \(h_A\), which is the minimizer of the AMISE (asymptotic mean integrated squared error): \[\label{eq:AMISE} \text{AMISE}\left(h\right) = h^{2\left(k-\nu\right)}\frac{I\left[m^{\left(k\right)}\right]\beta^{2}}{\left(k!\right)^{2}} + \frac{2\pi c_{f}\left(d_{b}-c_{b}\right)R\left(K\right)}{nh^{2\nu+1}}\text{,} \tag{4}\] where \(I\left[m^{\left(k\right)}\right] = \int_{c_{b}}^{d_{b}}\left[m^{\left(k\right)}\left(x\right)\right]^{2}dx\), \(\beta=\int_{-1}^{1}u^{k}K\left(u\right)du\), \(R\left(K\right)=\int_{-1}^{1}K^{2}\left(u\right)du\), \(c_{f} = f\left(0\right)\) is the value of the spectral density at the frequency zero and \(0\le c_b < d_b\le 1\) are introduced to reduce the effect of the estimates at the boundary points. Then we have \[\label{eq:hOpt} h_{A}=n^{-\frac{1}{2k+1}}\left(\frac{2\nu + 1}{2\left(k - \nu\right)}\frac{2\pi c_{f}\left(k!\right)^{2}\left(d_{b}-c_{b}\right)R\left(K\right)}{I\left[m^{\left(k\right)}\right]\beta^{2}}\right)^{\frac{1}{2k+1}}. \tag{5}\]
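
For convenience, (5) translates directly into a small helper function; \(c_f\), \(I\left[m^{\left(k\right)}\right]\), \(\beta\) and \(R\left(K\right)\) are assumed to be known or estimated elsewhere, so this is only a sketch of the formula itself.

```
# Asymptotically optimal bandwidth h_A from Eq. (5); cf, Imk = I[m^(k)],
# beta and RK = R(K) must be supplied (estimated) by the user
h.A <- function(n, k, nu, cf, Imk, beta, RK, cb = 0.05, db = 0.95) {
  const <- (2 * nu + 1) / (2 * (k - nu)) *
    2 * pi * cf * factorial(k)^2 * (db - cb) * RK / (Imk * beta^2)
  n^(-1 / (2 * k + 1)) * const^(1 / (2 * k + 1))
}
```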

After estimating and removing the nonparametric trend, any suitable parametric model can be fitted to the residuals for further econometric analysis. For instance, we can assume that \(\xi_t\) follows an ARMA model: \[\label{eq:Semi-ARMA} \xi_t= \varphi_1\xi_{t-1} + ... + \varphi_r\xi_{t-r}+ \psi_1\varepsilon_{t-1} + ... + \psi_s\varepsilon_{t-s}+\varepsilon_t, \tag{6}\] where \(\varepsilon_t\) are i.i.d. innovations. A semiparametric ARMA (Semi-ARMA) model is then defined by (1) and (6).

Since both \(c_{f}\) and \(I\left[m^{\left(k\right)}\right]\) in (5) are unknown, we need to obtain and insert appropriate estimates of these two quantities into this formula to achieve a selected bandwidth \(\hat h_{A}\). However, the estimation of \(c_{f}\) and \(I\left[m^{\left(k\right)}\right]\) requires the use of two additional bandwidths. Hence, iterative bandwidth selection procedures should be employed. The estimation of \(I\left[m^{\left(k\right)}\right]\) is not influenced by the correlation structure and can simply be carried out by means of established IPI-algorithms for models with independent errors. Consequently, only the estimation approach for \(c_{f}\) will be reviewed in detail.

Let \(r_{t, {\rm V}}=y_t-\hat m_{\rm V}\) denote the residuals obtained from a pilot estimate using a bandwidth \(h_{\rm V}\) and denote the sample acf calculated from \(r_{t, {\rm V}}\) by \(\hat \gamma\left(l\right)\). In this paper we propose to use the following lag-window estimator of \(c_f\): \[\label{eq:cfbw} \hat c_{f, {\rm M}}=\frac{1}{2\pi}\sum_{l=-M}^{M} w_l \hat \gamma\left(l\right), \tag{7}\] where the weights \(w_l\) are calculated from some lag-window with window-width \(M\), evaluated at the points \(l/\left(M+0.5\right)\). The window-width \(M\) is selected by the following IPI-algorithm proposed by Feng et al. (2020), which is adapted from that of Bühlmann (1996).
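
A minimal version of the estimator in (7) may be written as follows; for concreteness, the Bartlett weights \(w_l = 1 - |l|/\left(M + 0.5\right)\) are assumed, and the window-width \(M\) is taken as given rather than selected by the IPI-steps described below.

```
# Lag-window estimate of c_f = f(0) as in Eq. (7), sketched with Bartlett weights
# w_l = 1 - |l| / (M + 0.5); the window-width M is assumed to be given here
cf.lagwin <- function(res, M) {
  g <- as.numeric(acf(res, lag.max = M, type = "covariance",
                      plot = FALSE)$acf)             # gamma_hat(0), ..., gamma_hat(M)
  w <- 1 - (0:M) / (M + 0.5)                         # Bartlett lag-window weights
  (w[1] * g[1] + 2 * sum(w[-1] * g[-1])) / (2 * pi)  # symmetric sum over l = -M, ..., M
}
```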

Let \(M_0=\left[n/2\right]\) be the initial window-width, where \(\left[\cdot\right]\) denotes the integer part.

Global steps: Estimate \(\int\left(f\left(\lambda\right)^2\right)d\lambda\) in the \(j\)-th iteration following Bühlmann (1996). Denote by \(\int f^{\left(1\right)}\left(\lambda\right) d\lambda\) the integral of the first generalized derivative of \(f\left(\lambda\right)\). Estimate it using the window-width \(M_j'=\left[M_{j-1}/n^{2/21}\right]\), the proposal in Bühlmann (1996) and the Bartlett-window. Obtain \(M_j\) by inserting this quantity into Eq. (5) in Bühlmann (1996). Carry out this procedure iteratively until convergence or until a maximum of 20 iterations is reached. Denote the selected \(M\) by \(\hat M_G\).

Local adaptation at \(\lambda=0\): Calculate \(\int f^{\left(1\right)}\left(\lambda\right) d\lambda\) again using \(M'=\left[\hat M_G/n^{2/21}\right]\). Obtain the finally selected window-width \(\hat M\) by inserting the estimates into the formula of the local optimal bandwidth at \(\lambda=0\) in (5) of Bühlmann (1996).

It is proposed to use the Bartlett-window throughout the whole algorithm for simplicity. If \(\hat M\) converges, it is usually not affected by the starting value \(M_0\). Hence, any \(1\le M_0\le\left[n/2\right]\) can be used in the proposed algorithm.

In this subsection the data-driven IPI-algorithm for selecting the bandwidth for local linear and local cubic estimators \(\hat m\), with \(k=2\) and 4, respectively, under correlated errors will be described. Note that, although the variance factor \(c_{f}\) can be estimated well from residuals of \(\hat m\), Feng et al. (2020) showed that the asymptotically optimal bandwidth for estimating \(c_{f}\) should be \({\rm CF}\cdot h_A\), not \(h_A\) itself, where \[\label{eq:CF} {\rm CF}= \left\{2k\left[\frac{2 K\left(0\right)}{R\left(K\right)}- 1\right]\right\}^{1/\left(2k+1\right)} \tag{8}\] is a correction factor to obtain the asymptotically optimal bandwidth for estimating \(\hat c_f\) from \(\hat h\), which is the same as given in Feng and Heiler (2009) and is always bigger than 1. Theoretically, \(c_{f}\) should be estimated from \(r_{t, {\rm V}}\) obtained with the bandwidth \({\rm CF}\cdot \hat h\). The proposed main IPI-algorithm for selecting the bandwidth proceeds as follows.

i) Start with an initial bandwidth \(h_0\) given beforehand.

ii) Obtain \(r_{t, {\rm V}}\) using \(h_{j-1}\) or \({\rm CF}\cdot h_{j-1}\) and estimate \(c_f\) from \(r_{t, {\rm V}}\) as proposed above.

iii) Let \(\alpha=5/7\) or \(5/9\) for \(p=1\), and \(\alpha=9/11\) or \(9/13\) for \(p=3\), respectively. Estimate \(I\left[m^{\left(k\right)}\right]\) with \(h_{d,j}=h_{j-1}^{\alpha}\) and a local polynomial of order \(p_d=p+2\). We obtain \[\label{eq:bi} h_j=\left(\frac{\left[k!\right]^2}{2k \beta^2} \, \frac{2\pi \hat c_f \left(d_b -c_b\right) R\left(K\right)}{I\left[\hat m^{\left(k\right)}\right]}\right)^{1/\left(2k+1\right)} \cdot n^{-1/\left(2k+1\right)}. \tag{9}\]

iv) Increase \(j\) by 1. Repeat Steps ii) and iii) until convergence is reached or until, for example, \(J=20\) iterations are achieved. Let \(\hat h=h_{j}\) be the selected bandwidth.

In the developed package the initial bandwidth \(h_0=0.15\) for both \(p=1\)
and \(p=3\) is used. Also, for both \(p=1\) and \(p=3\), the default value
\(c_b=1-d_b=0.05\) is proposed to reduce the effect of the estimates at
the boundary points. That is, the bandwidth is selected only using
\(90\%\) of the observations in the middle part. The bandwidth
\(h_{d,j}=h_{j-1}^{\alpha}\) for estimating the \(k\)-th derivative is
roughly fixed using a so-called EIM (exponential inflation method). The
first \(\alpha\) value is chosen so that
\(I\left[\hat m^{\left(k\right)}\right]\) and hence \(\hat h\) will achieve
their optimal rates of convergence. This algorithm will be denoted
by **AlgA**. If the second \(\alpha\) value is used,
\(\hat m^{\left(k\right)}\), but not \(\hat h\), will achieve its optimal
rate of convergence. This algorithm is called **AlgB**. The bandwidth \(\hat h\)
selected by **AlgB** is larger than that selected by **AlgA**. For further
details on those topics we refer the reader to Feng et al. (2020).

The proposed IPI-algorithm can be easily adapted to select bandwidths for estimating \(\hat m^{\left(\nu\right)}\). In the following, only the cases with \(\nu=1\) or \(2\) are considered, where \(c_f\) is estimated by means of a data-driven pilot estimate \(\hat m\) of the order \(p_p\), say. Then \(m^{\left(\nu\right)}\) will be estimated with \(p=\nu+1\) and \(k=\nu+2\). As before, \(m^{\left(k\right)}\) for calculating the bias factor will be estimated with \(p_d=p+2\) and a correspondingly inflated bandwidth. This leads to the following two-stage procedure.

i) In the first stage, \(\hat c_f\) is obtained by the main IPI-algorithm with \(p_p=1\) or \(p_p=3\).

ii) Then an IPI-procedure as proposed above, to select the bandwidth for estimating \(m^{\left(\nu\right)}\) according to (5), is carried out with the fixed \(\hat c_f\) obtained in i).

Note that \(\hat m^{\left(\nu\right)}\) is asymptotically equivalent to some kernel estimator with boundary correction. Explicit forms of the equivalent kernels for estimating \(m^{\left(\nu\right)}\) in the middle part may be found in Müller (1988). The corresponding inflation factors (i.e. the \(\alpha\) values) are determined by \(p\) or \(k\).

The package *smoots* developed based on the algorithms described in the
last section consists of five directly usable functions, two S3 methods
and four datasets as examples. The first four functions are called
`msmooth()`, `tsmooth()`, `gsmooth()` and `knsmooth()`, designed for
estimating the trend in different ways. Data-driven estimation can be
carried out by each of the first two functions, where `msmooth()` is a
user-friendlier simplified version of `tsmooth()`. Local polynomial
estimation of \(m^{\left(\nu\right)}\) and kernel smoothing of \(m\) with an
arbitrary bandwidth fixed beforehand can be carried out by `gsmooth()`
and `knsmooth()`, respectively. In the first case, one should choose \(p\)
such that \(p-\nu\) is odd to avoid the boundary problem. Those functions
allow a flexible application of this package by an experienced user.
Data-driven estimation of the first or second derivative can be obtained
by `dsmooth()`.

The functions for selecting the bandwidth will be described in more detail. Moreover, for simplicity, shared arguments between functions will only be discussed once. With

`tsmooth(y, p, mu, Mcf, InfR, bStart, bvc, bb, cb, method)`,

the trend in equidistant and trend-stationary time series with short-memory can be estimated via an automated local polynomial. Its arguments are as follows.

- `y` is the input time series.
- `p` reflects the order of polynomial and currently only `p = 1` (local linear) and `p = 3` (local cubic) are selectable.
- `mu` corresponds to the smoothness parameter \(\mu\) of the second order kernel function (3) used for weighting. Only `0`, `1`, `2` and `3` are valid options.
- `Mcf` defines the method for estimating \(c_f\). For `Mcf = "NP"`, the default, \(c_f\) is estimated nonparametrically as described in the section *Data-driven estimation of \(c_f\)*. If `"AR"`, `"MA"` or `"ARMA"` are selected, \(\xi_t\) in model (1) is assumed to follow an AR, MA or ARMA process, respectively, during the bandwidth selection and thus, \(c_f\) is estimated parametrically.
- `InfR` sets the inflation rate \(\alpha\) considered for the bandwidth (see also Step iii) in the section *The IPI-algorithm for estimating \(m\)*). The options are `InfR = "Opt"`, which corresponds to \(\alpha = 5/7\) for a local linear and \(\alpha = 9/11\) for a local cubic regression, `InfR = "Nai"` with \(\alpha = 5/9\) and \(\alpha = 9/13\) for local linear and local cubic regressions, respectively, and `InfR = "Var"`, which always sets \(\alpha = 1/2\).
- `bStart` is the (relative) starting bandwidth for the bandwidth selection algorithm. The default is `bStart = 0.15`. Note that \(\hat{h}\) is usually not affected by the initial value.
- With the argument `bvc` the estimation of \(c_f\) can be adjusted even further. By setting `bvc = "Y"`, the bandwidth for estimating \(c_f\) will be enlarged by the factor CF as in (8). CF is omitted for `bvc = "N"`.
- `bb` describes the boundary method. If `bb = 0` is selected, the number of observations considered for smoothing is decreasing towards the boundaries, while it is constant throughout for `bb = 1`.
- `cb` is the proportion of observations at each boundary that is omitted in the bandwidth selection process.
- Via `method` the final smoothing method, after the bandwidth has been selected, can be defined. The default `method = "lpr"` corresponds to local polynomial estimates, whereas with `method = "kr"` a kernel regression is conducted.
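
As a quick illustration of these arguments, a call with all of the above settings spelled out explicitly might look as follows; `yt` is the simulated series from the sketch in the model section and `res` is the residual component of the returned `"smoots"` object.

```
# A tsmooth() call with the discussed arguments set explicitly (values as in the text)
est <- smoots::tsmooth(yt, p = 1, mu = 1, Mcf = "NP", InfR = "Opt",
                       bStart = 0.15, bvc = "Y", bb = 1, cb = 0.05,
                       method = "lpr")
res <- est$res   # residuals y_t - m_hat(x_t)
```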

A simplified version of `tsmooth()` for estimating a time series trend
with a data-driven local polynomial is

`msmooth(y, p, mu, bStart, alg, method)`,

which shares most of its arguments with `tsmooth()`. The only argument not previously described is `alg`.

- `alg` defines specific argument settings of `tsmooth()` as named subalgorithms. The two algorithms **AlgA** and **AlgB** described in the section *The IPI-algorithm for estimating \(m\)* can be directly chosen by `alg = "A"` and `alg = "B"`, respectively.

In accordance with Feng et al. (2020), we propose the use of **AlgA**
with the optimal inflation factor for local linear regression. For local
cubic regression with a moderate \(n\), **AlgB** with the stronger
inflation factor should be used. If \(n\) is big enough, e.g. bigger than
400, the combination of \(p = 3\) and **AlgA** will also lead to suitable
results. If slight oversmoothing is desired or required, the
inflation factor `"Nai"` or even `"Var"` should be employed.
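
For instance, these recommendations could be expressed through `msmooth()` as follows, again using the simulated series `yt`; the remaining argument values are merely illustrative.

```
# AlgA with a local linear trend and, for a moderate sample size, AlgB with a
# local cubic trend; starting bandwidth and smoothing method set explicitly
fit_A <- smoots::msmooth(yt, p = 1, mu = 1, bStart = 0.15, alg = "A", method = "lpr")
fit_B <- smoots::msmooth(yt, p = 3, mu = 1, bStart = 0.15, alg = "B", method = "lpr")
```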

To estimate the first or second derivative of a time series trend with a data-driven local polynomial, the function

`dsmooth(y, d, mu, pp, bStart.p, bStart)`

should be employed. With this function, the first and second derivatives are always estimated using a local quadratic and local cubic regression, respectively. To obtain \(\hat{c}_f\), a data-driven pilot estimate of the trend function via `tsmooth()` is computed.

- `d` specifies the order of derivative of the trend that will be estimated. Currently, `d = 1` and `d = 2` are valid options.
- `pp` corresponds to the order of polynomial considered for the pilot estimate of the trend function. `pp = 1` and `pp = 3` are available.
- `bStart.p` is the starting bandwidth for the data-driven pilot estimate of the trend.
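
A corresponding call for the simulated series `yt` could look like this; all argument values are illustrative only.

```
# Data-driven estimation of the first derivative of the trend of yt with a
# local cubic pilot estimate; the starting bandwidths are illustrative values
d1 <- smoots::dsmooth(yt, d = 1, mu = 1, pp = 3, bStart.p = 0.15, bStart = 0.15)
```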

Note that here, `bStart` is the initial bandwidth considered in the
bandwidth selection for the derivative series. For further details on
the functions we refer the reader to the user's guide of this
package. Besides the above functions, two S3 methods are also built into
the package: a `print()` and a `plot()` method for objects of class
`"smoots"`, a newly introduced class of objects created by the *smoots*
package. They allow for a quick and detailed overview of the estimation
results. The `"smoots"` objects themselves are generally lists
consisting of different components such as the input data and the estimation
results.
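
Applied, for example, to the object `est` from above, the two methods can be used as follows; the exact console output and plot layout depend on the object at hand.

```
# Quick inspection of a "smoots" object via the provided S3 methods
print(est)   # textual summary of the estimation results
plot(est)    # graphical display of the results
```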

In this and the next two sections, the wide applicability of the above
mentioned functions will be illustrated by four real data examples.
Those datasets are built into the package so that the reader can also use
them. They are: `tempNH`, `gdpUS`, `dax` and `vix`, which contain
observations of the mean monthly temperature changes of the Northern
Hemisphere, the US GDP and daily financial data of the German stock
index (DAX) and the CBOE Volatility Index (VIX), respectively. For
further information see also the documentation of the data within the
*smoots* package. Since the package is available on CRAN, the commands
`install.packages("smoots")` and `library(smoots)` can be used to
install it and attach it within the current R environment.

To show the application of the additive Semi-ARMA model defined by
(1) and (6), the time series of the mean
monthly Northern Hemisphere temperature changes from 1880 to 2018 (NHTM)
is chosen. The data are downloaded from the website of NASA. For
this purpose, the function `tsmooth()` is applied. The argument settings
used for this function are equivalent to those of algorithm **AlgA**
with \(p=1\).

```
tempChange <- smoots::tempNH$Change
est_temp <- smoots::tsmooth(tempChange, p = 1, mu = 2, Mcf = "NP", InfR = "Opt",
  bStart = 0.1, bvc = "Y", method = "lpr")
d1_temp <- smoots::dsmooth(tempChange, d = 1, mu = 2, pp = 3, bStart.p = 0.2,
  bStart = 0.15)
d2_temp <- smoots::dsmooth(tempChange, d = 2, mu = 3, pp = 1, bStart.p = 0.1,
  bStart = 0.2)
```

Figure 1(a) shows the observations together with the estimated trend. Here, the selected bandwidth is 0.1221. We see that the estimated trend fits the data very well. In particular, the trend increases steadily after 1970, which might be a signal of possible global warming in the last decades. The residuals are displayed in Figure 1(b); they look quite stationary. This indicates that Model (1) is a suitable approach for this time series. Moreover, Figures 1(c) and 1(d) illustrate the data-driven estimates of the first and second derivatives of the trend, respectively, which fit the features of the trend function very well and provide us with further details about the global temperature changes.

```
arma1 <- stats::arima(est_temp$res, order = c(1, 0, 1), include.mean = FALSE)
```

Consequently, an ARMA(1, 1) model is fitted to the residual series
\(\tilde{\xi}_t = y_t - \hat{m}(x_t)\) using the `arima()` function in R,
which results in
\[\label{eq:resultEx1}
\tilde{\xi}_{t} = 0.7489\tilde{\xi}_{t-1} - 0.3332\varepsilon_{t-1} +\varepsilon_{t}. \tag{10}\]
We see that the dependence of the errors is dominated by a strong positive AR
parameter with a moderate negative MA coefficient.
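
The coefficients entering (10) can be read off the fitted object directly; re-estimated values may deviate marginally from the reported ones depending on the data vintage.

```
# Extract the fitted AR and MA coefficients that appear in Eq. (10)
round(stats::coef(arma1), 4)
```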

A well-known approach in development economics is the log-linear growth
model. Assuming that the log-transformation of a macroeconomic time series
follows Models (1) and (6), we obtain a
semiparametric local polynomial extension of this theory, in particular a
local-linear one. To show this application, the series of the quarterly
US-GDP from the first quarter of 1947 to the second quarter of 2019,
downloaded from the Federal Reserve Bank of St. Louis, is chosen.
Data-driven local linear regression is applied to estimate the trend
from the log-data using **AlgA**, with the selected bandwidth 0.1325. A
kernel regression estimate using the same bandwidth is also carried out
for comparison.

```
l_gdp <- log(smoots::gdpUS$GDP)
gdp_t1 <- smoots::msmooth(l_gdp, p = 1, mu = 1, bStart = 0.1, alg = "A",
  method = "lpr")
gdp_t2 <- smoots::msmooth(l_gdp, p = 1, mu = 1, bStart = 0.1, alg = "A",
  method = "kr")
gdp_d1 <- smoots::dsmooth(l_gdp, d = 1, mu = 1, pp = 1, bStart.p = 0.1,
  bStart = 0.15)
gdp_d2 <- smoots::dsmooth(l_gdp, d = 2, mu = 1, pp = 1, bStart.p = 0.1,
  bStart = 0.2)
```

The results together with the log-data are displayed in Figure 2(a). We see that the two trend estimates in the middle part coincide with each other. They differ from each other only at the boundary points and the kernel estimate is affected by a clear boundary problem. Thus, the local linear method should be used. Residuals of this estimate are shown in Figure 2(b). Again, the estimated first and second derivatives are given in Figures 2(c) and 2(d), respectively, which help us to discover more detailed features of the economic development in the US.

```
arma2 <- stats::arima(gdp_t1$res, order = c(1, 0, 1), include.mean = FALSE)
```

Furthermore, the following ARMA(1, 1) model is obtained from the residuals: \[\label{eq:resultEx2} \tilde{\xi}_{t} = 0.9079\tilde{\xi}_{t-1} + 0.2771\varepsilon_{t-1} + \varepsilon_{t}. \tag{11}\]