The smoots Package in R for Semiparametric Modeling of Trend Stationary Time Series

This paper is an introduction to the new package in R called smoots (smoothing time series), developed for data-driven local polynomial smoothing of trend-stationary time series. Functions for data-driven estimation of the first and second derivatives of the trend are also built-in. It is first applied to monthly changes of the global temperature. The quarterly US-GDP series shows that this package can also be well applied to a semiparametric multiplicative component model for non-negative time series via the log-transformation. Furthermore, we introduced a semiparametric Log-GARCH and a semiparametric Log-ACD model, which can be easily estimated by the smoots package. Of course, this package applies to suitable time series from any other research area. The smoots package also provides a useful tool for teaching time series analysis, because many practical time series follow an additive or a multiplicative component model.

Yuanhua Feng (Paderborn University) , Thomas Gries (Paderborn University) , Sebastian Letmathe (Paderborn University) , Dominik Schulz (Paderborn University)
2022-06-21

1 Introduction

This paper provides an introduction to a new package in R called smoots (version 1.0.1, Feng and Schulz 2019) for data-driven local polynomial smoothing of trend-stationary time series. This package is developed based on the R codes for the practical implementation of the proposals in Feng et al. (2020). Detailed discussion on the methodology and the development of the algorithms may be found in that work. Here, the main algorithm is a fully data-driven IPI (iterative plug-in, Gasser et al. 1991) approach for estimating the trend under stationary time series errors, where the variance factor in the asymptotically optimal bandwidth is estimated by another IPI-approach for a lag-window estimator of the spectral density following Bühlmann (1996). Numerous sub-algorithms determined by different options, such as the choice of the order of local polynomial, the choice of the kernel weighting functions and the choice of the so-called inflation factors for estimating the bias in the asymptotically optimal bandwidth, are included. Further functions for data-driven estimation of the first and second derivatives of the trend are also developed by adapting the idea of Feng (2007). A related proposal may be found in Francisco-Fernández et al. (2004). The algorithms included in the smoots package differ to their proposal in several ways. (1) The IPI-algorithm is usually superior to the DPI (see Beran et al. 2009 for detailed discussion in models with i.i.d. errors). (2) In Francisco-Fernández et al. (2004) the variance factor is estimated under an AR(1) model, which is usually a misspecification. (3) Data-driven estimation of the derivatives are also included in the current package. Moreover, the user can obtain any estimate with fixed bandwidths chosen beforehand. A function for kernel smoothing is also built-in for comparison.

The smoots package complements the Comprehensive R Archive Network (CRAN), as already existing base R functions or CRAN packages for local polynomial smoothing either do not implement an automated bandwidth selection or, if such bandwidth algorithms do exist, they are only suitable for data with i.i.d. (independently and identically distributed) errors, which is an assumption that is often violated for time series. In more detail, notwithstanding that the stats package offers the functions lowess() and loess() for computing the local polynomial estimates of the regression function given one or multiple predictor variables, a smoothing bandwidth has to be selected arbitrarily in both cases. Similar functionality is provided by the locpol (Ojeda Cabrera 2018), KernSmooth (Wand 2021) and lokern (Herrmann and Maechler 2021) packages, however, derivative estimation approaches and various bandwidth selection algorithms for the regression function, such as the leave-one-out estimator (Allen 1974) and the plug-in algorithm by Ruppert et al. (1995), are built into them. Moreover, within lokern, the bandwidth for estimating the first or second derivative of the regression function can be selected automatically. Nevertheless, these traditional data-driven bandwidth selection approaches are only suitable in case of data with i.i.d. errors (Altman 1990; Hart 1991; Opsomer 1997) and can therefore often not be considered for time series. Explicit CRAN packages for detrending a time series nonparametrically are rmaf (Qiu 2015) and mFilter (Balcilar 2019). While with rmaf, a fully data-driven refined moving average and a cubic smoothing spline, which requires a manually selected smoothness parameter, can be applied, the trend as well as the cyclical component of a time series can be extracted with the filters implemented in the mFilter package like the Baxter-King (Baxter and King 1999) or Hodrick-Prescott (Hodrick and Prescott 1997) filters among others. However, all filtering functions in the mFilter package are not fully data-driven. Instead, they make use of default values recommended within scientific literature. Thus, to our knowledge, there are not any packages on CRAN that implement a data-driven local polynomial regression for the trend of a time series or its derivatives.

The methodological and computational background for the smoots package is summarized briefly hereinafter. For further details we refer the reader to Feng et al. (2020) and references therein. Our focus is to illustrate the wide application spectrum of the smoots package for semiparametric modeling of time series. We propose to estimate the nonparametric trend using the current package in the first stage and to fit e.g. an ARMA model to the residuals with the standard arima() function of the stats package in R. This idea is first shown with monthly Northern Hemisphere temperature changes data obtained from the website of the National Aeronautics and Space Administration (NASA). Then the proposal is applied to the log-transformation of the quarterly US-GDP data, retrieved from the website of the Federal Reserve Bank of St. Louis, using a semiparametric log-local-linear growth model. Moreover, two new semiparametric models for financial time series are proposed to show further application of this package. Firstly, a Semi-Log-GARCH model is defined by introducing a scale function into the Log-GARCH (logarithmic GARCH) (Geweke 1986; Pantula 1986; Milhøj 1987). The latter is a special extension of the seminal ARCH (autoregressive conditional heteroskedasticity, Engle 1982) and GARCH (generalized ARCH, Bollerslev 1986) models and is just a slightly restricted ARMA model for the log-transformation of the squared returns (see e.g. Sucarrat 2019). The usefulness of the Log-GARCH has recently been rediscovered by Francq et al. (2013). The application of this proposal is illustrated by the DAX returns collected from Yahoo Finance. Secondly, a Semi-Log-ACD model is proposed as an extension of the Type I Log-ACD (Bauwens and Giot 2000), which is closely related to the Semi-Log-GARCH. Like the ACD (autoregressive conditional duration, Engle and Russell 1998) model, the Semi-Log-ACD can be applied to different non-negative financial data, such as realized volatilities and trading volumes. In this paper, this new model is applied to the CBOE Volatility Index (VIX) obtained from Yahoo Finance. Datasets for all of those examples are built in the proposed package. The smoots package can be applied to suitable time series from other research areas. In addition, the smoots package provides a useful tool for teaching time series analysis, which helps a lecturer to obtain automatically detrended real data examples to show the application of parametric time series models.

The methods and IPI-algorithms are summarized in the following two sections. The general usage of the R functions is then described in another section. Three further sections illustrate the applications of those functions in simple cases and with respect to the Semi-Log-GARCH and the Semi-Log-ACD.

2 Local polynomial regression for time series

The main algorithm of the smoots package is a fully automatic non-parametric procedure for estimating a deterministic trend in an additive time series model with observations \(y_{t}\), \(t=1, ..., n\). The data under consideration can for example be from environmental statistics, economics or finance. The basic nonparametric time series model is defined by \[\label{eq:Semi-MTS} y_t= m\left(x_t\right) + \xi_t, \tag{1}\] where \(x_t=t/n\) denotes the rescaled time, \(m\) is a smooth function and \(\xi_t\) is a zero mean stationary process. This model defines a nonparametric approach for trend-stationary time series. Let \(\gamma_{\xi}\left(\tau\right)\) denote the acf (autocovariances) of \(\xi_t\). Under the regularity conditions given in Bühlmann (1996), a data-driven IPI-algorithm for estimating the variance factor in the asymptotically optimal bandwidth can be developed. As indicated by Feng et al. (2020), the required conditions are fulfilled by an ARMA process with finite eighth moment. However, models with long-memory errors are excluded by those assumptions.

The local polynomial estimator of \(m^{\left(\nu\right)}\left(x\right)\), the \(\nu\)-th derivative of \(m\), is obtained by minimizing \[\label{eq:LWSformula} Q = \sum_{t=1}^{n}\left\{y_{t} - \sum_{j=0}^{p}b_{j}\left(x\right)\left(x_{t} - x\right)^{j}\right\}^{2}W\left(\frac{x_{t} - x}{h}\right), \tag{2}\] where \(h\) is the bandwidth, \(W\) is a second order kernel function with compact support \(\left[-1, 1\right]\) and is used here as the weighting function, and \(p\) is the order of polynomial. It is assumed that \(p-\nu\) is odd. We have \(\hat m^{\left(\nu\right)}\left(x\right)=\nu!\hat b_\nu\left(x\right)\), which is asymptotically equivalent to some kernel regression with a \(k\)-th order kernel \(K\left(u\right)\) and automatic boundary correction, where \(k=p+1\). In this paper, the following weighting functions are considered: \[\label{eq:kernelfunc} W\left(u\right) = C_{\mu}\left(1-u^{2}\right)^{\mu}\mathbb{I}_{\left[-1,1\right]}\left(u\right), \tag{3}\] where \(C_{\mu}\) is a standardization constant that is indeed irrelevant for calculating the weights in (3) and \(\mu=0,\) 1, ..., is a smoothness parameter. For automatic bandwidth selection using the smoots package, only \(\mu = 0, 1, 2\) and \(3\) are allowed, corresponding to the use of the uniform, Epanechnikov, bisquare and triweight kernels, respectively.

An IPI-algorithm is developed based on the asymptotically optimal bandwidth \(h_A\), which is the minimizer of the AMISE (asymptotic mean integrated squared error): \[\label{eq:AMISE} \text{AMISE}\left(h\right) = h^{2\left(k-\nu\right)}\frac{I\left[m^{\left(k\right)}\right]\beta^{2}}{\left(k!\right)^{2}} + \frac{2\pi c_{f}\left(d_{b}-c_{b}\right)R\left(K\right)}{nh^{2\nu+1}}\text{,} \tag{4}\] where \(I\left[m^{\left(k\right)}\right] = \int_{c_{b}}^{d_{b}}\left[m^{\left(k\right)}\left(x\right)\right]^{2}dx\), \(\beta=\int_{-1}^{1}u^{k}K\left(u\right)du\), \(R\left(K\right)=\int_{-1}^{1}K^{2}\left(u\right)du\), \(c_{f} = f\left(0\right)\) is the value of the spectral density at the frequency zero and \(0\le c_b < d_b\le 1\) are introduced to reduce the effect of the estimates at the boundary points. Then we have \[\label{eq:hOpt} h_{A}=n^{-\frac{1}{2k+1}}\left(\frac{2\nu + 1}{2\left(k - \nu\right)}\frac{2\pi c_{f}\left(k!\right)^{2}\left(d_{b}-c_{b}\right)R\left(K\right)}{I\left[m^{\left(k\right)}\right]\beta^{2}}\right)^{\frac{1}{2k+1}}. \tag{5}\]

After estimating and removing the nonparametric trend, any suitable parametric model can be fitted to the residuals for further econometric analysis. For instance, we can assume that \(\xi_t\) follows an ARMA model: \[\label{eq:Semi-ARMA} \xi_t= \varphi_1\xi_{t-1} + ... + \varphi_r\xi_{t-r}+ \psi_1\varepsilon_{t-1} + ... + \psi_s\varepsilon_{t-s}+\varepsilon_t, \tag{6}\] where \(\varepsilon_t\) are i.i.d. innovations. A semiparametric ARMA (Semi-ARMA) model is then defined by (1) and (6).

3 The proposed IPI-algorithms

Since both \(c_{f}\) and \(I\left[m^{\left(k\right)}\right]\) in (5) are unknown, we need to obtain and insert appropriate estimates of these two quantities into this formula to achieve a selected bandwidth \(\hat h_{A}\). However, the estimation of \(c_{f}\) and \(I\left[m^{\left(k\right)}\right]\) requires the use of two additional bandwidths. Hence, iterative bandwidth selection procedures should be employed. The estimation of \(I\left[m^{\left(k\right)}\right]\) is not influenced by the correlation structure, which can be simply obtained by means of established IPI-algorithms for models with independent errors. Consequently, solely a comprehensive review of the estimation approach for \(c_{f}\) will be given.

Data-driven estimation of \(c_f\)

Let \(r_{t, {\rm V}}=y_t-\hat m_{\rm V}\) denote the residuals obtained from a pilot estimate using a bandwidth \(h_{\rm V}\) and denote the sample acf calculated from \(r_{t, {\rm V}}\) by \(\hat \gamma\left(l\right)\). In this paper we propose to use the following lag-window estimator of \(c_f\): \[\label{eq:cfbw} \hat c_{f, {\rm M}}=\frac{1}{2\pi}\sum_{l=-M}^{M} w_l \hat \gamma\left(l\right), \tag{7}\] where \(w_l=l/\left(M+0.5\right)\) are weights calculated according to some lag-window with the window-width \(M\). And, \(M\) will be selected by the following IPI-algorithm proposed by Feng et al. (2020), which is adjusted from that of Bühlmann (1996) .

  1. Let \(M_0=\left[n/2\right]\) be the initial window-width, where \(\left[\cdot\right]\) denotes the integer part.

  2. Global steps: Estimate \(\int\left(f\left(\lambda\right)^2\right)d\lambda\) in the \(j\)-th iteration following Bühlmann (1996) . Denote by \(\int f^{\left(1\right)}\left(\lambda\right) d\lambda\) the integral of the first generalized derivative of \(f\left(\lambda\right)\). Estimate it using the window-width \(M_j'=\left[M_{j-1}/n^{2/21}\right]\), the proposal in Bühlmann (1996) and the Bartlett-window. Obtain \(M_j\) by inserting this quantity into Eq. (5) in Bühlmann (1996). Carry out this procedure iteratively until convergence is reached or until a maximum of 20 iterations. Denote the selected \(M\) by \(\hat M_G\).

  3. Local adaptation at \(\lambda=0\): Calculate \(\int f^{\left(1\right)}\left(\lambda\right) d\lambda\) again using \(M'=\left[\hat M_G/n^{2/21}\right]\). Obtain the finally selected window-width \(\hat M\) by inserting the estimates into the formula of the local optimal bandwidth at \(\lambda=0\) in (5) of Bühlmann (1996) .

It is proposed to use the Bartlett-window throughout the whole algorithm for simplicity. If \(\hat M\) converges, it is usually not affected by the starting value \(M_0\). Hence, any \(1\le M_0\le\left[n/2\right]\) can be used in the proposed algorithm.

The IPI-algorithm for estimating \(m\)

In this subsection the data-driven IPI-algorithm for selecting the bandwidth for local linear and local cubic estimators \(\hat m\), with \(k=2\) and 4, respectively, under correlated errors will be described. Note that, although the variance factor \(c_{f}\) can be estimated well from residuals of \(\hat m\), Feng et al. (2020) showed that the asymptotically optimal bandwidth for estimating \(c_{f}\) should be \({\rm CF}\cdot h_A\), not \(h_A\) itself, where \[\label{eq:CF} {\rm CF}= \left\{2k\left[\frac{2 K\left(0\right)}{R\left(K\right)}- 1\right]\right\}^{1/\left(2k+1\right)} \tag{8}\] is a correction factor to obtain the asymptotically optimal bandwidth for estimating \(\hat c_f\) from \(\hat h\), which is the same as given in Feng and Heiler (2009) and is always bigger than 1. Theoretically, \(c_{f}\) should be estimated from \(r_{t, {\rm V}}\) obtained with the bandwidth \({\rm CF}\cdot \hat h\). The proposed main IPI-algorithm for selecting the bandwidth proceeds as follows.

  1. Start with an initial bandwidth \(h_0\) given beforehand.

  2. Obtain \(r_{t, {\rm V}}\) using \(h_{j-1}\) or \({\rm CF}\cdot h_{j-1}\) and estimate \(c_f\) from \(r_{t, {\rm V}}\) as proposed above.

  3. Let \(\alpha=5/7\) or \(5/9\) for \(p=1\), and \(\alpha=9/11\) or \(9/13\) for \(p=3\), respectively. Estimate \(I\left[m^{\left(k\right)}\right]\) with \(h_{d,j}=h_{j-1}^{\alpha}\) and a local polynomial of order \(p_d=p+2\). We obtain \[\label{eq:bi} h_j=\left(\frac{\left[k!\right]^2}{2k \beta^2} \, \frac{2\pi \hat c_f \left(d_b -c_b\right) R\left(K\right)}{I\left[\hat m^{\left(k\right)}\right]}\right)^{1/\left(2k+1\right)} \cdot n^{-1/\left(2k+1\right)}. \tag{9}\]

  4. Increase \(j\) by 1. Repetitively carry out Steps \(ii)\) and \(iii)\) until convergence is reached or until for example \(J=20\) iterations are achieved. Let \(\hat h=h_{j}\) be the selected bandwidth.

In the developed package the initial bandwidth \(h_0=0.15\) for both \(p=1\) and \(p=3\) is used. Also, for both \(p=1\) and \(p=3\), the default value \(c_b=1-d_b=0.05\) is proposed to reduce the effect of the estimates at the boundary points. That is, the bandwidth is selected only using \(90\%\) of the observations in the middle part. The bandwidth \(h_{d,j}=h_{j-1}^{\alpha}\) for estimating the \(k\)-th derivative is roughly fixed using a so-called EIM (exponential inflation method). The first \(\alpha\) value is chosen so that \(I\left[\hat m^{\left(k\right)}\right]\) and hence \(\hat h\) will achieve their optimal rates of convergence. Now, the algorithm will be denoted by AlgA. If the second \(\alpha\) value is used, \(\hat m^{\left(k\right)}\), but not \(\hat h\), will achieve its optimal rate of convergence. This algorithm is called AlgB. And \(\hat h\) selected by AlgB is larger than that by AlgA. For further details on those topics we refer the reader to Feng et al. (2020).

Data-driven estimation of \(m'\) and \(m''\)

The proposed IPI-algorithm can be easily adapted to select bandwidths for estimating \(\hat m^{\left(\nu\right)}\). In the following, only the cases with \(\nu=1\) or \(2\) are considered, where \(c_f\) is estimated by means of a data-driven pilot estimate \(\hat m\) of the order \(p_p\), say. Then \(m^{\left(\nu\right)}\) will be estimated with \(p=\nu+1\) and \(k=\nu+2\). As before, \(m^{\left(k\right)}\) for calculating the bias factor will be estimated with \(p_d=p+2\) and a correspondingly inflated bandwidth. This leads to the following two-stage procedure.

  1. In the first stage \(\hat c_f\) is obtained by the main IPI-algorithm with \(p_p=1\) or \(p_p=3\).

  2. Then an IPI-procedure as proposed above to select the bandwidth for estimating \(m^{\left(\nu\right)}\) according to (5) is carried out with fixed \(\hat c_f\) obtained in i).

Note that \(\hat m^{\left(\nu\right)}\) is asymptotically equivalent to some kernel estimator with boundary correction. Explicit forms of the equivalent kernels for estimating \(m^{\left(\nu\right)}\) in the middle part may be found in Müller (1988). The corresponding inflation factors (i.e. the \(\alpha\) values) are determined by \(p\) or \(k\).

4 Practical implementation in R

The package smoots developed based on the algorithms described in the last section consists of five directly usable functions, two S3 methods and four datasets as examples. The first four functions are called msmooth(), tsmooth(), gsmooth() and knsmooth(), designed for estimating the trend in different ways. Data-driven estimation can be carried out by each of the first two functions, where msmooth() is a user-friendlier simplified version of tsmooth(). Local polynomial estimation of \(m^{\left(\nu\right)}\) and kernel smoothing of \(m\) with an arbitrary bandwidth fixed beforehand can be carried out by gsmooth() and knsmooth(), respectively. In the first case, one should choose \(p\) such that \(p-\nu\) is odd to avoid the boundary problem. Those functions allow a flexible application of this package by an experienced user. Data-driven estimation of the first or second derivative can be obtained by dsmooth().

The functions for selecting the bandwidth will be described in more detail. Moreover, for simplicity, shared arguments between functions will only be discussed once. With

tsmooth(y,  p,  mu,  Mcf,  InfR,  bStart,  bvc,  bb,  cb,  method),

the trend in equidistant and trend-stationary time series with short-memory can be estimated via an automated local polynomial. Its arguments are as follows.

A simplified version of tsmooth() for estimating a time series trend with a data-driven local polynomial is

msmooth(y,  p,  mu,  bStart,  alg,  method),

which shares most of its arguments with tsmooth(). The only unknown argument is alg.

In accordance with Feng et al. (2020), we propose the use of AlgA with the optimal inflation factor for local linear regression. For local cubic regression with a moderate \(n\), AlgB with the stronger inflation factor should be used. If \(n\) is big enough, e.g. bigger than 400, the combination of \(p = 3\) and AlgA will also lead to suitable results. In case when slight over smoothing is wished/required, the inflation factor "Nai" or even "Var" should be employed.

To estimate the first or second derivative of a time series trend with a data-driven local polynomial, the function

dsmooth(y,  d,  mu,  pp,  bStart.p,  bStart) should be employed. With this function, the first and second derivatives are always estimated using a local quadratic and local cubic regression, respectively. To obtain \(\hat{c}_f\), a data-driven pilot estimate of the trend function via tsmooth() is computed.

Note that here, bStart is the initial bandwidth considered in the bandwidth selection for the derivative series. For further details on the functions we refer the reader to the user’s guideline for this package. Beside the above functions, two S3 methods are also built in the package: a print() and a plot() method for objects of class "smoots", a newly introduced class of objects created by the smoots package. They allow for a quick and detailed overview of the estimation results. The "smoots" objects themselves are generally lists consisting of different components such as input data and estimation results.

5 Simple application of smoots

In this and the next two sections, the wide applicability of the above mentioned functions will be illustrated by four real data examples. Those datasets are built in the package so that the reader can also use them. They are: tempNH, gdpUS, dax and vix, which contain observations of the mean monthly temperature changes of the Northern Hemisphere, the US GDP and daily financial data of the German stock index (DAX) and the CBOE Volatility Index (VIX), respectively. For further information see also the documentation on the data within the smoots package. Since the package is available on CRAN, the commands install.packages("smoots") and library(smoots) can be used to install it and attach it within the current R environment.

Direct application of the Semi-ARMA

To show the application of the additive Semi-ARMA model defined by (1) and (6), the time series of the mean monthly Northern Hemisphere temperature changes from 1880 to 2018 (NHTM) is chosen. The data are downloaded from the website of the NASA. For this purpose, the function tsmooth() is applied. The used settings of the arguments for this function are equivalent to those in algorithm A with \(p=1\).

  tempChange <- smoots::tempNH$Change
  est_temp <- smoots::tsmooth(tempChange, p = 1, mu = 2, Mcf = "NP", InfR = "Opt", 
    bStart = 0.1, bvc = "Y", method = "lpr")
  d1_temp <- smoots::dsmooth(tempChange, d = 1, mu = 2, pp = 3, bStart.p = 0.2, 
    bStart = 0.15)
  d2_temp <- smoots::dsmooth(tempChange, d = 2, mu = 3, pp = 1, bStart.p = 0.1, 
    bStart = 0.2)

Figure 1(a) shows the observations together with the estimated trend. Here, the selected bandwidth is 0.1221. We see, the estimated trend fits the data very well. In particular, the trend increases steadily after 1970 that might be a signal for possible global warming in the last decades. The residuals are displayed in Figure 1(b), which look quite stationary. This indicates that Model (1) is a suitable approach for this time series. Moreover, Figures 1(c) and 1(d) illustrate the data-driven estimates of the first and second derivatives of the trend respectively, which fit the features of the trend function very well and provide us further details about the global temperature changes.

  arma1 <- stats::arima(est_temp$res, order = c(1, 0, 1), include.mean = FALSE) 

Consequently, an ARMA(1, 1) model is fitted to the residual series \(\tilde{\xi}_t = y_t - \hat{m}(x_t)\) using the arima() function in R, which results in \[\label{eq:resultEx1} \tilde{\xi}_{t} = 0.7489\tilde{\xi}_{t-1} - 0.3332\varepsilon_{t-1} +\varepsilon_{t}. \tag{10}\] We see, the dependence of errors is dominated by a strong positive AR parameter with a moderate negative MA coefficient.

graphic without alt text
Figure 1: The NHTM series and the estimated trend are displayed in (a). The residuals, as well as the estimated first and second derivatives are shown in (b), (c) and (d), respectively.

A semiparametric log-local-linear growth model

A well-known approach in developing economics is the log-linear growth model. Assume that the log-transformation of a macroeconomic time series follows Models (1) and (6), we achieve a semiparametric local polynomial, in particular a local-linear extension of this theory. To show this application, the series of the quarterly US-GDP from the first quarter of 1947 to the second quarter of 2019, downloaded from the Federal Reserve Bank of St. Louis, is chosen. Data-driven local linear regression is applied to estimate the trend from the log-data using AlgA with the selected bandwidth 0.1325. A kernel regression estimate using the same bandwidth is also carried out for comparison.

  l_gdp <- log(smoots::gdpUS$GDP)  
  gdp_t1 <- smoots::msmooth(l_gdp, p = 1, mu = 1, bStart = 0.1, alg = "A", 
    method = "lpr")
  gdp_t2 <- smoots::msmooth(l_gdp, p = 1, mu = 1, bStart = 0.1, alg = "A", 
    method = "kr")
  gdp_d1 <- smoots::dsmooth(l_gdp, d = 1, mu = 1, pp = 1, bStart.p = 0.1, 
    bStart = 0.15)
  gdp_d2 <- smoots::dsmooth(l_gdp, d = 2, mu = 1, pp = 1, bStart.p = 0.1, 
    bStart = 0.2) 

The results together with the log-data are displayed in Figure 2(a). We see that the two trend estimates in the middle part coincide with each other. They differ from each other only at the boundary points and the kernel estimate is affected by a clear boundary problem. Thus, the local linear method should be used. Residuals of this estimate are shown in Figure 2(b). Again, the estimated first and second derivatives are given in Figures 2(c) and 2(d), respectively, which help us to discover more detailed features of the economic development in the US.

  arma2 <- stats::arima(gdp_t1$res, order = c(1, 0, 1), include.mean = FALSE)

Furthermore, the following ARMA(1, 1) model is obtained from the residuals: \[\label{eq:resultEx2} \tilde{\xi}_{t} = 0.9079\tilde{\xi}_{t-1} + 0.2771\varepsilon_{t-1} + \varepsilon_{t}. \tag{11}\]