This paper introduces the R package slm, which stands for Stationary Linear Models. The package contains a set of statistical procedures for linear regression in the general context where the error process is strictly stationary with a short memory. We work in the setting of (Hannan 1973), who proved the asymptotic normality of the (normalized) least squares estimators (LSE) under very mild conditions on the error process. We propose different ways to estimate the asymptotic covariance matrix of the LSE and then to correct the type I error rates of the usual tests on the parameters (as well as confidence intervals). The procedures are evaluated through different sets of simulations.
We consider the usual linear regression model \[Y = X \beta + \varepsilon \, ,\] where \(Y\) is the \(n\)-dimensional vector of observations, \(X\) is a (possibly random) \(n \times p\) design matrix, \(\beta\) is a \(p\)-dimensional vector of parameters, and \(\varepsilon= (\varepsilon_i)_{1 \leq i \leq n}\) is the error process (with zero mean and independent of \(X\)). The standard assumptions are that the \(\varepsilon_i\)’s are independent and identically distributed (i.i.d.) with zero mean and finite variance.
In this paper, we propose to modify the standard statistical procedures (tests, confidence intervals, …) of the linear model in the more general context where the \(\varepsilon_i\)’s are obtained from a strictly stationary process \((\varepsilon_i)_{i \in {\mathbb N}}\) with a short memory. To be more precise, let \(\hat \beta\) denote the usual least squares estimator of \(\beta\). Our approach is based on two papers: the paper by (Hannan 1973), who proved the asymptotic normality of the normalized least squares estimator \(D(n)(\hat \beta - \beta)\) (\(D(n)\) being the usual normalization) under very mild conditions on the design and on the error process; and a recent paper by (Caron 2019), who showed that, under Hannan’s conditions, the asymptotic covariance matrix of \(D(n)(\hat \beta - \beta)\) can be consistently estimated.
Let us emphasize that Hannan’s conditions on the error process are very mild and are satisfied for most of the short-memory processes (see the discussion in Section \(4.4\) of (Caron and Dede 2018)). Putting together the two above results, we can develop a general methodology for tests and confidence regions on the parameter \(\beta\), which should be valid for most of the short-memory processes. This is, of course, directly useful for time-series regression, but also in the more general context where the residuals of the linear model seem to be strongly correlated. More precisely, when checking the residuals of the linear model, if the autocorrelation function of the residuals shows significant correlations, and if the residuals can be suitably modeled by an ARMA process, then our methodology is likely to apply. We shall give an example of such a situation on the "Shanghai pollution" dataset at the end of the paper.
Hence, the tools presented in the present paper can be seen from two different points of view:
as appropriate tools for time series regression with a short memory error process
as a way to robustify the usual statistical procedures when the residuals are correlated.
Let us now describe the organization of the paper. In the next section,
we recall the mathematical background, the consistent estimator of the
asymptotic covariance matrix introduced in (Caron 2019), and the
modified \(Z\)-statistics and \(\chi\)-square statistics for testing the
hypothesis on the parameter \(\beta\). Next, we present the slm package
and the different ways to estimate the asymptotic covariance matrix: by
fitting an autoregressive process on the residuals (default procedure),
by means of the kernel estimator described in (Caron 2019)
(theoretically valid) with a bootstrap method to choose the bandwidth
((Wu and Pourahmadi 2009)), by using alternative choices of the bandwidth for
the rectangular kernel ((Efromovich 1998)) and the quadratic
spectral kernel ((Andrews 1991)), and by means of an
adaptive estimator of the spectral density via Histograms
((Comte 2001)). In a section about numerical experiments, we
estimate the level of a \(\chi\)-square test for a linear model with
random design, with different kinds of error processes, and for
different estimation procedures. In the last section, we apply the
package to the "Shanghai pollution" dataset, and we compare the
summary output of slm
with the usual summary output of lm. An
extended version of this paper is available as an arXiv preprint (see
(Caron et al. 2019)).
We start this section by giving a short presentation of linear regression with stationary errors; more details can be found for instance in (Caron 2019). Let \(\hat{\beta}\) be the usual least squares estimator for the unknown vector \(\beta\). The aim is to provide hypothesis tests and confidence regions for \(\beta\) in the non-i.i.d. context.
Let \(\gamma\) be the autocovariance function of the error process \(\varepsilon\): for any integers \(k\) and \(m\), let \(\gamma(k) = \mathrm{Cov}(\varepsilon_{m}, \varepsilon_{m+k})\). We also introduce the covariance matrix: \[\label{gamma_th} \Gamma_{n} := \left[ \gamma(j-l) \right]_{1 \leq j,l \leq n}. \tag{1}\]
(Hannan 1973) proved a Central Limit Theorem for \(\hat{\beta}\) when the error process is strictly stationary, under very mild conditions on the design and the error process. Let us notice that the design can be random or deterministic. We introduce the normalization matrix \(D(n)\), which is a diagonal matrix with diagonal terms \(d_{j}(n) = \left \| X_{.,j} \right \|_{2}\) for \(j\) in \(\{1, \ldots, p\}\), where \(X_{.,j}\) is the \(j\)th column of \(X\). Roughly speaking, Hannan’s result says in particular that, given the design \(X\), the vector \(D(n)(\hat{\beta} - \beta)\) converges in distribution to a centered Gaussian distribution with covariance matrix \(C\). As usual, in practice, the covariance matrix \(C\) is unknown, and it has to be estimated. Hannan also showed the convergence of the second-order moments: \[\label{second_order_moment} \mathbb{E} \left( D(n) (\hat{\beta} - \beta) (\hat{\beta} - \beta)^{t} D(n)^{t} \Big| X \right) \xrightarrow[n \rightarrow \infty]{} C, \quad a.s. \tag{2}\] where \[\mathbb{E} \left( D(n) (\hat{\beta} - \beta) (\hat{\beta} - \beta)^{t} D(n)^{t} \Big| X \right) = D(n) (X^{t} X)^{-1} X^{t} \Gamma_{n} X (X^{t} X)^{-1} D(n) .\] In this paper, we propose a general plug-in approach: for some given estimator \(\widehat{\Gamma}_{n}\) of \(\Gamma_{n}\), we introduce the plug-in estimator: \[\label{main_estimator} \widehat C = \widehat C (\widehat{\Gamma}_{n}) := D(n) (X^{t}X)^{-1} X^{t} \widehat{\Gamma}_{n} X (X^{t}X)^{-1} D(n), \tag{3}\] and we use \(\widehat C\) to standardize the usual statistics considered for the study of linear regression.
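In plain R, the plug-in formula (3) amounts to a few matrix products. Here is a minimal sketch; the function name plugin_C, the design matrix X (with its columns as regressors) and the estimator Gamma_hat of \(\Gamma_{n}\) are illustrative and not part of the package:

# Plug-in estimator of Equation (3)
plugin_C <- function(X, Gamma_hat) {
  Dn      <- diag(sqrt(colSums(X^2)), nrow = ncol(X))   # normalization matrix D(n)
  XtX_inv <- solve(crossprod(X))                        # (X^t X)^{-1}
  Dn %*% XtX_inv %*% t(X) %*% Gamma_hat %*% X %*% XtX_inv %*% Dn
}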
Let us illustrate this plug-in approach with a kernel estimator which has been proposed in (Caron 2019). For some \(K\) and a bandwidth \(h\), the kernel estimator \(\widetilde{\Gamma}_{n,h}\) is defined by \[\label{Gamma_tapered_star} \widetilde{\Gamma}_{n,h} = \left[ K \left( \frac{j-l}{h} \right) \tilde{\gamma}_{j-l} \right]_{1 \leq j,l \leq n}, \tag{4}\] where the residual-based empirical covariance coefficients are defined for \(0 \leq | k | \leq n-1\) by \[\label{empcovtilde} \tilde{\gamma}_{k} = \frac{1}{n} \sum_{j=1}^{n-|k|} \hat{\varepsilon}_{j} \hat{\varepsilon}_{j+|k|}. \tag{5}\] For a well-chosen kernel \(K\) and under mild assumptions on the design and the error process, it has been proved in (Caron 2019) that \[\label{Slut} {\widetilde C_n}^{-1/2} D(n)(\hat{\beta} - \beta) \xrightarrow[n \rightarrow \infty]{\mathcal{L}} \mathcal{N}_p (0_{p}, I_p), \tag{6}\] for the plug-in estimator \(\widetilde C_n := \widehat C (\widetilde{\Gamma}_{n,h_n} )\), for some suitable sequence of bandwidths \((h_n)\).
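For concreteness, the residual-based covariances (5) and the tapered matrix (4) can be sketched in a few lines of R; the names gamma_tilde and Gamma_tilde are illustrative, eps_hat denotes the vector of residuals, and K must be a vectorized kernel function:

# Residual-based empirical covariances, Equation (5)
gamma_tilde <- function(eps_hat) {
  n <- length(eps_hat)
  sapply(0:(n - 1), function(k) sum(eps_hat[1:(n - k)] * eps_hat[(1 + k):n]) / n)
}
# Tapered (kernel) estimate of Gamma_n, Equation (4)
Gamma_tilde <- function(eps_hat, K, h) {
  n    <- length(eps_hat)
  g    <- gamma_tilde(eps_hat)           # g[k + 1] = gamma_tilde at lag k
  lags <- abs(outer(1:n, 1:n, "-"))      # |j - l|
  matrix(K(lags / h) * g[lags + 1], n, n)
}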
More generally, in this paper, we say that an estimator \(\widehat{\Gamma}_{n}\) of \(\Gamma_{n}\) is consistent for estimating the covariance matrix \(C\) if \(\widehat C (\widehat{\Gamma}_{n})\) is positive definite and if it converges in probability to \(C\). Note that such a property requires assumptions on the design, see (Caron 2019). If \(\widehat C (\widehat{\Gamma}_{n})\) is consistent for estimating the covariance matrix \(C\), then \({\widehat C(\widehat{\Gamma}_n)}^{-1/2} D(n)(\hat{\beta} - \beta)\) converges in distribution to a standard Gaussian vector.
To conclude this section, let us make some additional remarks. The interest of Caron’s recent paper is that the consistency of the estimator \(\widehat{C} (\widehat{\Gamma}_{n})\) is proved under Hannan’s condition on the error process, which is known to be optimal with respect to the convergence in distribution (see for instance (Dedecker 2015)), and which allows dealing with most short memory processes. However, the natural estimator of the covariance matrix of \(\hat{\beta}\) based on \(\widehat{\Gamma}_{n}\) has been studied by many other authors in various contexts. For instance, let us mention the important line of research initiated by (Newey and West 1987, 1994) and the related papers by (Andrews 1991), (Andrews and Monahan 1992), among others. In the paper by (Andrews 1991), the consistency of the estimator based on \(\widehat{\Gamma}_{n}\) is proved under general conditions on the fourth-order cumulants of the error process, and a data-driven choice of the bandwidth is proposed. Note that these authors also considered the case of heteroskedastic processes. Most of these procedures, known as HAC (Heteroskedasticity and Autocorrelation Consistent) procedures, are implemented in the package sandwich by Zeileis, Lumley, Berger and Graham, and presented in great detail in the paper by (Zeileis 2004). We shall use one of the procedures of the sandwich package, based on the data-driven bandwidth choice described by (Andrews 1991).
We now present tests and confidence regions for arbitrary estimators \(\widehat{\Gamma}_{n}\). The complete justifications are available for kernel estimators, see (Caron 2019).
We introduce the following univariate statistics: \[\label{pseudoStudent} Z_{j} = \frac{d_{j}(n) \hat{\beta}_{j}}{\sqrt{\widehat C_{(j,j)}}}, \tag{7}\] where \(\widehat C = \widehat C (\widehat{\Gamma}_{n})\). If \(\widehat{\Gamma}_{n}\) is consistent for estimating the covariance matrix \(C\) and if \(\beta_j = 0\), the distribution of \(Z_{j}\) converges to a standard normal distribution when \(n\) tends to infinity. We directly derive an asymptotic hypothesis test for testing \(\beta_j = 0\) against \(\beta_j \neq 0\) as well as an asymptotic confidence interval for \(\beta_j\).
Let \(A\) be a \(k \times p\) matrix with \(\mathop{\mathrm{rank}}(A) = k\). Under (Hannan 1973)’s conditions, \(A D(n)(\hat{\beta} - \beta)\) converges in distribution to a centered Gaussian distribution with covariance matrix \(A C A^{t}\). If \(\widehat{\Gamma}_{n}\) is consistent for estimating the covariance matrix \(C\), then \(A \widehat C (\widehat{\Gamma}_{n}) A^{t}\) converges in probability to \(ACA^{t}\). The matrix \(A \widehat C (\widehat{\Gamma}_{n}) A^{t}\) being symmetric positive definite, this yields \[W := \left( A \widehat C (\widehat{\Gamma}_{n}) A^{t} \right)^{-1/2} A D(n) (\hat{\beta} - \beta) \xrightarrow[n \rightarrow \infty]{\mathcal{L}} \mathcal{N}_k (0_{k}, I_k) .\] This last result provides asymptotic confidence regions for the vector \(A \beta\). It also provides an asymptotic test for testing the hypothesis \(H_0\) : \(A \beta = 0\) against \(H_1\) : \(A \beta \neq 0\). Indeed, under the \(H_{0}\)-hypothesis, the distribution of \(\|W\|_2^2\) converges to a \(\chi^{2}(k)\)-distribution.
The test can be used to simplify a linear model by testing that several linear combinations of the parameters \(\beta_j\) are zero, as we usually do for ANOVA and regression models. In particular, with \(A = I_p\), the test corresponds to the test of overall significance.
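As an illustration, the chi-square statistic can be computed directly from a consistent estimate of \(C\); a computation of this kind is what the summary output of slm reports as the chi2-statistic. The sketch below covers the common case where \(A\) selects \(k\) coordinates of \(\beta\) (so that \(A D(n) \beta = 0\) under \(H_0\)); the names chi2_stat, beta_hat, C_hat and A are illustrative:

# Chi-square test of H0: A beta = 0 (A a selection matrix of rank k)
chi2_stat <- function(A, beta_hat, C_hat, X) {
  Dn <- diag(sqrt(colSums(X^2)), nrow = ncol(X))
  M  <- A %*% C_hat %*% t(A)                              # A C_hat A^t
  e  <- eigen(M, symmetric = TRUE)
  M_inv_sqrt <- e$vectors %*% diag(1 / sqrt(e$values), nrow = nrow(A)) %*% t(e$vectors)
  W  <- M_inv_sqrt %*% A %*% Dn %*% beta_hat              # equals W under H0
  stat <- sum(W^2)
  c(statistic = stat, p.value = pchisq(stat, df = nrow(A), lower.tail = FALSE))
}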
Using the slm package is very intuitive because the arguments and the outputs of slm are similar to those of the standard functions lm, glm, etc. The output of the main function slm is an object of class "slm", a specific class that has been defined for linear regression with stationary processes. The "slm" class has methods plot, summary, confint, and predict; see the extended version (Caron et al. 2019) for more details. Moreover, the class "slm" inherits from the "lm" class and thus provides the output of the classical lm function.
The statistical tools available in slm
strongly depend on the choice
of the covariance plug-in estimator \(\widehat C (\widehat \Gamma_n)\) we
use for estimating \(C\). All the estimators \(\widehat \Gamma_n\) proposed
in slm
are residual-based estimators, but they rely on different
approaches. In this section, we present the main functionality of slm
together with the different covariance plug-in estimators.
For illustrating the package, we simulate synthetic data according to
the linear model:
\[Y_{i} = \beta_{1} + \beta_{2} (\log(i) + \sin(i) + Z_{i}) + \beta_{3} i + \varepsilon_{i},\]
where \(Z\) is a Gaussian autoregressive process of order \(1\) and
\(\varepsilon\) is the Nonmixing process described further in the paper.
We use the functions generative_model
and generative_process
respectively to simulate observations according to this regression
design and with this specific stationary process.
R> library(slm)
R> set.seed(42)
R> n = 500
R> eps = generative_process(n,"Nonmixing")
R> design = generative_model(n,"mod2")
R> design_sim = cbind(rep(1,n), as.matrix(design))
R> beta_vec = c(2,0.001,0.5)
R> Y = design_sim %*% beta_vec + eps
A large class of stationary processes with continuous spectral density can be well approximated by AR processes, see for instance Corollary 4.4.2 in (Brockwell and Davis 1991). Since the covariance structure of an AR process has a closed form, it is easy to derive an approximation \(\widetilde{\Gamma}_{AR(p)}\) of \(\Gamma_n\) by fitting an AR process on the residual process. The AR-based method for estimating \(C\) is the default version of slm. This method proceeds in four main steps (a plain R sketch is given after the list):
Fit an autoregressive process on the residual process \(\hat{\varepsilon}\);
Compute the theoretical covariances of the fitted AR process;
Plug the covariances into the Toeplitz matrix \(\widetilde{\Gamma}_{AR(p)}\);
Compute \(\widehat{C} = \widehat{C}(\widetilde{\Gamma}_{AR(p)})\).
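A plain R sketch of these four steps, outside of slm() (the exact scaling used internally may differ; fit_lm denotes the fitted lm object, X the design matrix with its intercept column, and plugin_C the helper sketched earlier):

res      <- residuals(fit_lm)                                  # step 1: residuals
fitar    <- ar(res, aic = FALSE, order.max = 3)                # fit an AR(3)
rho      <- ARMAacf(ar = fitar$ar, lag.max = length(res) - 1)  # step 2: theoretical ACF
gamma_ar <- as.numeric(rho) * var(res)                         # scaled by the residual variance
Gamma_AR <- toeplitz(gamma_ar)                                 # step 3: Toeplitz matrix
C_hat    <- plugin_C(X, Gamma_AR)                              # step 4: plug-in estimator of C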
The slm
function fits a linear regression of the vector \(Y\) on the
design \(X\) and then fits an AR process on the residual process using the
ar
function from the
stats package. The output
of the slm
function is an object of class "slm". The order \(p\) of
the AR process is set in the argument model_selec:
R> regslm = slm(Y ~ X1 + X2, data = design, method_cov_st = "fitAR",
+               model_selec = 3)
The estimated covariance is recorded as a vector in the attribute cov_st of regslm, which is an object of class "slm". The estimated covariance matrix can be computed by taking the Toeplitz matrix of cov_st, using the toeplitz function.
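For instance, a one-line sketch (assuming, as for the model_selec slot shown below, that the attribute is accessed with @):

R> Gamma_hat = toeplitz(regslm@cov_st)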
The order \(p\) of the AR process can be chosen by hand by setting model_selec = p, or automatically with the AIC criterion by setting model_selec = -1.
R> regslm = slm(Y ~ X1 + X2, data = design, method_cov_st = "fitAR",
+               model_selec = -1)
The order of the fitted AR process is recorded in the model_selec
attribute of regslm:
R> regslm@model_selec
[1] 2
Here, the AIC criterion suggests fitting an AR(2) process on the residuals.
The second method for estimating the covariance matrix \(C\) is the kernel estimation method (4) studied in (Caron 2019). In short, this method estimates \(C\) via a smooth approximation of the covariance matrix \(\Gamma_{n}\) of the residuals. This estimation of \(\Gamma_{n}\) corresponds to the so-called tapered covariance matrix estimator in the literature, see for instance (Xiao and Wu 2012), or also to the "lag-window estimator" defined in (Brockwell and Davis 1991), page \(330\). It applies in particular for non-negative symmetric kernels with compact support, with an integrable Fourier transform and such that \(K(0) = 1\). Table 1 gives the list of the available kernels in the package slm.
Table 1: Available kernels in the slm package (argument kernel_fonc).

| kernel_fonc | kernel definition |
|---|---|
| rectangular | \(K(x) = \mathbb{1}_{\{ \vert x \vert \leq 1 \}}\) |
| triangle | \(K(x) = (1 - \vert x \vert) \mathbb{1}_{\{ \vert x \vert \leq 1 \}}\) |
| trapeze | \(K(x) = \mathbb{1}_{\{ \vert x \vert \leq \delta \}} + \frac{1}{1-\delta}(1 - \vert x \vert) \mathbb{1}_{\{\delta \leq \vert x \vert \leq 1 \}}\) |
It is also possible for the user to define their own kernel and use it in the argument kernel_fonc of the slm function. Below we use the triangle kernel, which ensures that the covariance matrix is positive definite. Since the support of the kernel \(K\) in Equation (4) is compact, only the terms \(\tilde \gamma_{j-l}\) for small enough lags \(j-l\) are kept and weighted by the kernel in the expression of \(\widetilde{\Gamma}_{n,h}\). Rather than setting the bandwidth \(h\) directly, we select the number of \(\gamma(k)\)’s that should be kept (the lag) with the argument model_selec of the slm function. The bandwidth \(h\) is then set accordingly, namely to model_selec \(+ 1\).
R> regslm = slm(Y ~ X1 + X2, data = design, method_cov_st = "kernel",
+               model_selec = 5, kernel_fonc = triangle, plot = TRUE)
The plot output by the slm
function is given in
Figure 1.
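A user-defined kernel is simply a vectorized R function of the normalized lag that satisfies the conditions recalled above; a minimal sketch (the name my_kernel is illustrative, and the resulting matrix may require the positive definite projection described further in the paper):

R> my_kernel = function(x) (1 - x^2) * (abs(x) <= 1)
R> regslm = slm(Y ~ X1 + X2, data = design, method_cov_st = "kernel",
+               model_selec = 5, kernel_fonc = my_kernel)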
The order parameter can be chosen by hand as before, or automatically by setting model_selec = -1. The automatic order selection is based on
the bootstrap procedure proposed by (Wu and Pourahmadi 2009) for banded
covariance matrix estimation. The block_size
argument sets the size of
bootstrap blocks, and the block_n
argument sets the number of blocks.
The final order is chosen by taking the order which has the minimal
risk. Figure 2 gives the plots of the estimated risk
for the estimation of \(\Gamma_{n}\) (left) and the final estimated ACF
(right).
R> regslm = slm(Y ~ X1 + X2, data = design, method_cov_st = "kernel",
+               model_selec = -1, kernel_fonc = triangle, model_max = 30,
+               block_size = 100, block_n = 100, plot = TRUE)
Figure 2: Output of the slm function for the kernel method with bootstrap selection of the order: the estimated risk for the estimation of \(\Gamma_{n}\) (left) and the final estimated ACF (right).
The selected order is recorded in the model_selec
attribute of the
slm
object output by the slm
function:
R> regslm@model_selec
[1] 10
An alternative method for choosing the bandwidth in the case of the rectangular kernel has been proposed in (Efromovich 1998). For a large class of stationary processes with exponentially decaying autocovariance function (mainly the ARMA processes), Efromovich proved that the rectangular kernel is asymptotically minimax, and he proposed the following estimator: \[\hat{f}_{J_{nr}}(\lambda) = \frac{1}{2 \pi} \sum_{k=-J_{nr}}^{J_{nr}} \hat{\gamma}_{k} e^{i k \lambda},\] with the lag \[J_{nr} = \frac{\log(n)}{2r} \left[ 1 + (\log(n))^{-1/2} \right],\] where \(r\) is a regularity index of the autocovariance function. In practice, this index is unknown and is estimated thanks to the algorithm proposed in Section \(4\) of (Efromovich 1998). As for the other methods, we use the residual-based empirical covariances \(\tilde{\gamma}_{k}\) to compute \(\hat{f}_{J_{nr}}(\lambda)\).
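The lag \(J_{nr}\) itself is a one-line computation; a sketch, where \(r\) is assumed to have been estimated beforehand and the value is rounded to an integer in practice:

J_nr <- function(n, r) (log(n) / (2 * r)) * (1 + log(n)^(-1/2))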
R> regslm = slm(Y ~ X1 + X2, data = design, method_cov_st = "efromovich",
+               model_selec = -1)
Another method for choosing the bandwidth has been proposed by (Andrews 1991) and implemented in the package sandwich by Zeileis, Lumley, Berger and Graham (see the paper by (Zeileis 2004)). For the slm package, the automatic choice of the bandwidth proposed by Andrews can be obtained as follows:
R> regslm = slm(Y ~ X1 + X2, data = design, method_cov_st = "hac")
The procedure is based on the function kernHAC in the sandwich package. This function directly computes the covariance matrix estimator of \(\hat{\beta}\), which is recorded in the slot Cov_ST of the slm object. Here, we take the quadratic spectral kernel:
\[K \left( x \right) = \frac{25}{12 \pi^{2} x^{2}} \left( \frac{\sin \left( 6 \pi x / 5 \right) }{6 \pi x / 5} - \cos \left( 6 \pi x / 5 \right) \right),\]
as suggested by Andrews (see Section \(2\) in
(Andrews 1991), or Section \(3.2\) in
(Zeileis 2004)), but other kernels could be used, such as
Bartlett, Parzen, Tukey-Hamming, among others (see
(Zeileis 2004)).
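Written as a vectorized R function (with the convention \(K(0) = 1\)), this kernel reads as follows; the name qs_kernel is illustrative:

qs_kernel <- function(x) {
  z   <- 6 * pi * x / 5
  out <- 25 / (12 * pi^2 * x^2) * (sin(z) / z - cos(z))
  out[x == 0] <- 1                     # limit value at x = 0
  out
}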
Depending on the method used, the matrix \(\widehat{C} (\widehat{\Gamma}_{n})\) may not always be positive definite. This is the case for the kernel method with the rectangular or trapeze kernel. To overcome this problem, we project \(\widehat{C} (\widehat{\Gamma}_{n})\) onto the cone of positive definite matrices by applying a hard thresholding on the spectrum of this matrix: we replace all eigenvalues less than or equal to zero with the smallest positive eigenvalue of \(\widehat{C} (\widehat{\Gamma}_{n})\). Note that
this projection is useless for the triangle or quadratic spectral
kernels because their Fourier transform is non-negative (leading to a
positive definite matrix \(\widehat{C} (\widehat{\Gamma}_{n})\)). Of
course, it is also useless for the fitAR
and spectralproj
methods.
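This hard-thresholding step can be sketched in a few lines of R (the function name is illustrative, and at least one positive eigenvalue is assumed):

make_positive_definite <- function(C_hat) {
  e      <- eigen(C_hat, symmetric = TRUE)
  lambda <- e$values
  lambda[lambda <= 0] <- min(lambda[lambda > 0])     # hard thresholding of the spectrum
  e$vectors %*% diag(lambda, nrow = length(lambda)) %*% t(e$vectors)
}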
The projection method relies on the ideas of (Comte 2001), where
an adaptive nonparametric method has been proposed for estimating the
spectral density of a stationary Gaussian process. We use the residual
process as a proxy for the error process, and we compute the projection
coefficients with the residual-based empirical covariance coefficients
\(\tilde{\gamma}_{k}\), see Equation (5). For some
\(d \in \mathbb N^*\), the estimator of the spectral density of the error
process that we use is defined by computing the projection estimators
for the residual process on the basis of histogram functions:
\[\phi_{j}^{(d)} = \sqrt{\frac{d}{\pi}} \mathbb{1}_{[\pi j/d, \pi (j+1)/d[}, \qquad \ j = 0, 1, \ldots, d-1.\]
The estimator is defined by
\[\hat{f}_{d}(\lambda) = \sum_{j=0}^{d-1} \hat{a}_{j}^{(d)} \phi_{j}^{(d)},\]
where the projection coefficients are
\[\hat{a}_{j}^{(d)} = \sqrt{\frac{d}{\pi}} \left( \frac{\tilde{\gamma}_{0}}{2d} + \frac{1}{\pi} \sum_{r=1}^{n-1} \frac{\tilde{\gamma}_{r}}{r} \left[ \sin \left( \frac{\pi (j+1) r}{d} \right) - \sin \left( \frac{\pi j r}{d} \right) \right] \right).\]
The Fourier coefficients of the spectral density are equal to the
covariance coefficients. Thus, for \(k = 1, \ldots, n-1\) it yields
\[\begin{aligned}
\gamma_k &= & c_{k} \\
&= & \frac{2}{k} \sqrt{\frac{d}{\pi}} \sum_{j=0}^{d-1} \hat{a}_{j}^{(d)} \left[ \sin \left( \frac{k \pi (j+1)}{d} \right) - \sin \left( \frac{k \pi j}{d} \right) \right],
\end{aligned}\]
and for \(k=0\):
\[\gamma_0 = c_{0} = 2 \sqrt{\frac{\pi}{d}} \sum_{j=0}^{d-1} \hat{a}_{j}^{(d)}.\]
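A direct, unoptimized transcription of these formulas in R is given below; the function name spectralproj_gamma is illustrative, gamma_t is the vector of residual covariances \(\tilde{\gamma}_{0}, \ldots, \tilde{\gamma}_{n-1}\) of Equation (5), and d is the histogram dimension:

spectralproj_gamma <- function(gamma_t, d) {
  n <- length(gamma_t)
  j <- 0:(d - 1)
  r <- 1:(n - 1)
  # projection coefficients a_j^(d)
  a <- sapply(j, function(jj)
    sqrt(d / pi) * (gamma_t[1] / (2 * d) +
      sum(gamma_t[r + 1] / r *
            (sin(pi * (jj + 1) * r / d) - sin(pi * jj * r / d))) / pi))
  # covariances recovered from the estimated spectral density
  gamma0 <- 2 * sqrt(pi / d) * sum(a)
  gammak <- sapply(r, function(k)
    (2 / k) * sqrt(d / pi) *
      sum(a * (sin(k * pi * (j + 1) / d) - sin(k * pi * j / d))))
  c(gamma0, gammak)
}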
This method can be used in the slm function by setting method_cov_st = "spectralproj":
R> regslm = slm(Y ~ X1 + X2, data = design, method_cov_st = "spectralproj",
+               model_selec = 10, plot = TRUE)
The graph of the estimated spectral density can be plotted by setting
plot = TRUE
in the slm
function, see Figure 3.
The Gaussian model selection method proposed in (Comte 2001) follows the ideas of Birgé and Massart, see for instance (Massart 2007). It consists of minimizing the \(l_2\) penalized criterion, see Section 5 in (Comte 2001): \[crit(d) := - \sum_{j=0} ^{d-1} \left[ \hat{a}_{j}^{(d)} \right] ^2 + c \frac dn,\] where \(c\) is a multiplicative constant that in practice can be calibrated using the slope heuristic method, see (Birgé and Massart 2007), (Baudry et al. 2012) and the R package capushe.
R> regslm = slm(Y ~ X1 + X2, data = design, method_cov_st = "spectralproj",
+               model_selec = -1, model_max = 50, plot = TRUE)
The selected dimension is recorded in the model_selec
attribute of the
slm
object output by the slm
function:
R> regslm@model_selec
[1] 8
The slope heuristic algorithm here selects a Histogram on a regular partition of size \(8\) over the interval \([0, \pi]\) to estimate the spectral density.
This method is a full-manual method for estimating the covariance matrix \(C\) by only selecting covariance terms from the residual covariances \(\tilde{\gamma}_{k}\) defined by (5). Let \(I\) be a set of positive integers, then we consider \[\hat \gamma_I(k) := \tilde{\gamma}_{k} {\mathbf{\mathbb{1}}}_{k \in I\cup \{0\} }, \hskip 1cm 0 \leq |k| \leq n-1,\] and then we define the estimated covariance matrix \(\widehat{\Gamma}_{I}\) by taking the Toeplitz matrix of the vector \(\hat \gamma_I\). This estimator is a particular example of a masked sample covariance estimator, as introduced by (Chen et al. 2012), see also (Levina and Vershynin 2012). Finally, we derive from \(\widehat \Gamma_I\) an estimator \(\widehat C (\widehat \Gamma_I)\) for \(C\).
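A sketch of this masked estimator in plain R (gamma_t denotes the vector of residual covariances of Equation (5), and the function name is illustrative):

masked_gamma <- function(gamma_t, I) {
  out  <- rep(0, length(gamma_t))
  keep <- c(0, I) + 1                  # R indices of lag 0 and of the lags in I
  out[keep] <- gamma_t[keep]
  out
}
Gamma_I <- toeplitz(masked_gamma(gamma_t, I = c(1, 2, 4)))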
The next instruction selects the coefficients \(0\), \(1\), \(2\) and \(4\) from the residual covariance terms:
R> regslm = slm(Y ~ X1 + X2, data = design, method_cov_st = "select",
+               model_selec = c(1,2,4))
The positive lags of the selected covariances are recorded in the
model_selec
argument. Let us notice that the variance \(\gamma_{0}\) is
automatically selected.
As for the kernel method, the resulting covariance matrix may not be positive definite. If it is the case, the positive definite projection method described before is used.
This last method is a direct plug-in method. The user proposes their own vector estimator \(\hat{\gamma}\) of \(\gamma\), and then the Toeplitz matrix \(\widehat \Gamma_{n}\) of the vector \(\hat \gamma\) is used for estimating \(C\) with \(\widehat C (\widehat \Gamma_{n})\).
R> v = rep(0,n)
R> v[1:10] = acf(eps, type = "covariance", lag.max = 9)$acf
R> regslm = slm(Y ~ X1 + X2, data = design, cov_st = v)
The user can also propose their own covariance matrix \(\widehat \Gamma_{n}\) for estimating \(C\).
R> v = rep(0,n)
R> v[1:10] = acf(eps, type = "covariance", lag.max = 9)$acf
R> V = toeplitz(v)
R> regslm = slm(Y ~ X1 + X2, data = design, Cov_ST = V)
Let us notice that the user must verify that the resulting covariance matrix is positive definite. The positive definite projection algorithm is not used with this method.
This section summarizes an extensive study which has been carried out to compare the performances of the different approaches presented before in the context of a linear model with short range dependent stationary errors.
We first present the five generative models for the errors that we consider in the paper. We choose different kinds of processes to reflect the diversity of short-memory processes.
AR1 process. The AR1 process is a Gaussian AR(1) process defined by \[\varepsilon_{i} - 0.7 \varepsilon_{i-1} = W_{i},\] where the \(W_{i}\)’s are i.i.d. random variables with standard Gaussian distribution \(\mathcal{N}(0,1)\).
AR12 process. The AR12 process is a seasonal AR(12) process defined by \[\varepsilon_{i} - 0.5 \varepsilon_{i-1} - 0.2 \varepsilon_{i-12} = W_{i},\] where the \(W_{i}\)’s are i.i.d. random variables with standard Gaussian distribution \(\mathcal{N}(0,1)\). When studying monthly datasets, one usually observes a seasonality of order \(12\). For example, when looking at climate data, the data are often collected per month, and the same phenomenon tends to repeat every year. Even if the design integrates the deterministic part of the seasonality, a correlation of order \(12\) usually remains present in the residual process.
MA12 process. The MA12 is also a seasonal process defined by \[\varepsilon_{i} = W_{i} + 0.5 W_{i-2} + 0.3 W_{i-3} + 0.2 W_{i-12},\] where the \((W_{i})\)’s are i.i.d. random variables following Student’s distribution with \(10\) degrees of freedom.
Nonmixing process. The three processes described above are basic ARMA processes, whose innovations have absolutely continuous distributions; in particular, they are strongly mixing in the sense of (Rosenblatt 1956), with a geometric decay of the mixing coefficients (in fact, the MA12 process is even \(12\)-dependent, which means that the mixing coefficient \(\alpha(k) = 0\) if \(k > 12\)). Let us now describe a more complicated process: let \((Z_{1}, \ldots, Z_{n})\) satisfy the AR(1) equation \[Z_{i+1} = \frac{1}{2} (Z_{i} + \eta_{i+1}),\] where \(Z_{1}\) is uniformly distributed over \([0,1]\) and the \(\eta_{i}\)’s are i.i.d. random variables with distribution \(\mathcal{B}(1/2)\), independent of \(Z_{1}\). The process \((Z_{i})_{i \geq 1}\) is a strictly stationary Markov chain, but it is not \(\alpha\)-mixing in the sense of Rosenblatt (see (Bradley 1986)). Let now \(Q_{0,\sigma^{2}}\) be the inverse of the cumulative distribution function of a centered Gaussian distribution with variance \(\sigma^{2}\) (for the simulations below, we choose \(\sigma^{2} = 25\)). The Nonmixing process is then defined by \[\varepsilon_{i} = Q_{0,\sigma^{2}}(Z_{i}).\] The sequence \((\varepsilon_{i})_{i \geq 1}\) is also a stationary Markov chain (as an invertible function of a stationary Markov chain). By construction, \(\varepsilon_{i}\) is \(\mathcal{N}(0, \sigma^{2})\)-distributed, but the sequence \((\varepsilon_{i})_{i \geq 1}\) is not a Gaussian process (otherwise, it would be mixing in the sense of Rosenblatt). Although it is not obvious, one can prove that the process \((\varepsilon_{i})_{i \geq 1}\) satisfies Hannan’s condition (see (Caron 2019), Section \(4.2\)).
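The Nonmixing process can be simulated directly from this definition (this is the role played by generative_process(n, "Nonmixing") in the package); a sketch with \(\sigma^{2} = 25\):

n    <- 500
Z    <- numeric(n)
Z[1] <- runif(1)                               # Z_1 uniform on [0, 1]
eta  <- rbinom(n, size = 1, prob = 1/2)        # i.i.d. Bernoulli(1/2) innovations
for (i in 1:(n - 1)) Z[i + 1] <- (Z[i] + eta[i + 1]) / 2
eps  <- qnorm(Z, mean = 0, sd = 5)             # eps_i = Q_{0, sigma^2}(Z_i)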
Sysdyn process. The four processes described above have the property of "geometric decay of correlations", which means that the \(\gamma(k)\)’s tend to \(0\) at an exponential rate. However, as already pointed out in the introduction, Hannan’s condition is valid for most of the short memory processes, even for processes with slow decay of correlations (provided that the \(\gamma(k)\)’s are summable). Hence, our last example will be a non-mixing process (in the sense of Rosenblatt), with an arithmetic decay of the correlations. For \(\gamma \in ]0,1[\), the intermittent map \(\theta_{\gamma} : [0,1] \mapsto [0,1]\) introduced in (Liverani et al. 1999) is defined by \[\theta_{\gamma}(x) = \left\{ \begin{array}{r c l} x(1 + 2^{\gamma} x^{\gamma}) \qquad &\text{if}& \ x \in [0, 1/2[ \\ 2x - 1 \qquad &\text{if}& \ x \in [1/2, 1].\\ \end{array} \right.\]
It follows from (Liverani et al. 1999) that there exists a unique \(\theta_{\gamma}\)-invariant probability measure \(\nu_{\gamma}\). The Sysdyn process is then defined by \[\varepsilon_{i} = \theta_{\gamma}^{i}.\] From (Liverani et al. 1999), we know that on the probability space \(([0,1], \nu_{\gamma})\), the autocorrelations \(\gamma(k)\) of the stationary process \((\varepsilon_{i})_{i \geq 1}\) are exactly of order \(k^{-(1-\gamma)/\gamma}\). Hence, \((\varepsilon_{i})_{i \geq 1}\) is a short memory process provided \(\gamma \in ]0, 1/2[\). Moreover, it has been proved in Section \(4.4\) of (Caron and Dede 2018) that \((\varepsilon_{i})_{i \geq 1}\) satisfies Hannan’s condition in the whole short-memory range, that is for \(\gamma \in ]0, 1/2[\). For the simulations below, we took \(\gamma = 1/4\), which gives autocorrelations \(\gamma(k)\) of order \(k^{-3}\).
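The Sysdyn process can be simulated approximately by iterating the intermittent map from a uniform starting point and discarding a burn-in period; this only approximates the invariant measure \(\nu_{\gamma}\) (the package function generative_process also provides this process). A sketch with \(\gamma = 1/4\):

theta_gamma <- function(x, gamma = 1/4) {
  ifelse(x < 1/2, x * (1 + (2 * x)^gamma), 2 * x - 1)   # the intermittent map
}
n <- 500; burn <- 1000
x    <- runif(1)
traj <- numeric(n + burn)
for (i in seq_len(n + burn)) {
  x       <- theta_gamma(x)
  traj[i] <- x
}
eps <- traj[(burn + 1):(burn + n)]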
The linear regression models simulated in the experiments all have the
following form:
\[\label{reglinsimu}
Y_{i} = \beta_{1} + \beta_{2} (\log(i) + \sin(i) + Z_{i}) + \beta_{3} i + \varepsilon_{i}, \hskip 1cm \textrm{for all i in } \{1, \ldots, n\}, \tag{8}\]
where \(Z\) is a Gaussian autoregressive process of order \(1\) and
\(\varepsilon\) is one of the stationary processes defined above. For the
simulations, \(\beta_{1}\) is always equal to \(3\). All the error processes
presented above can be simulated with the slm package with the
generative_process
function. The design can be simulated with the
generative_model
function.
It is, of course, of first importance to provide hypothesis tests with
correct significance levels or at least with correct asymptotical
significance levels, which is possible if the estimator
\(\widehat \Gamma_{n}\) of the covariance matrix \(\Gamma_{n}\) is
consistent for estimating \(C\). For instance, the results of (Caron 2019)
show that it is possible to construct statistical tests with correct
asymptotical significance levels. However, in practice, such
asymptotical results are not sufficient since they do not indicate how
to tune the bandwidth on a given dataset. This situation makes linear regression with dependent errors much more difficult in practice than linear regression with i.i.d. errors. This problem arises for several of the methods given before: order choice for the fitAR method, bandwidth choice for the kernel method, and dimension selection for the spectralproj method.
It is a tricky issue to design a data-driven procedure for choosing test
parameters in order to have a correct Type I Error. Note that unlike
with supervised problems and density estimation, it is not possible to
calibrate hypothesis tests in practice using cross-validation
approaches. We thus propose to calibrate the tests using well-founded statistical procedures for risk minimization: the AIC criterion for the fitAR method, bootstrap procedures for the kernel method, and the slope heuristic for the spectralproj method. These procedures are
implemented in the slm
function with the model_selec = -1
argument,
as detailed in the previous section.
Let us first illustrate the calibration problem with the AR12 process.
For \(T=1000\) simulations, we generate an error process of size \(n\) under
the null hypothesis: \(H_{0}: \beta_{2} = \beta_{3} = 0\). Then we use the
fitAR
method of the slm
function with orders between \(1\) and \(50\),
and we perform the model significance test. The procedure is repeated
\(1000\) times, and we estimate the true level of the test by taking the
average of the estimated levels on the \(1000\) simulations for each
order. The results are given in Figure 4 for \(n=1000\). A
boxplot is also displayed to visualize the distribution of the order
selected by the automatic criterion (AIC).
We first study the case of non-Seasonal error processes. We simulate an
\(n\)-error process according to the AR1, the Nonmixing, or the Sysdyn
processes. We simulate realizations of the linear regression model
(8) under the null hypothesis:
\(H_{0}: \beta_{2} = \beta_{3} = 0\). We use the automatic selection
procedures for each method (model_selec = -1
). The simulations are
repeated \(1000\) times in order to estimate the true level of the model
significance for each test procedure. We simulate either small samples
(\(n=200\)) or larger samples (\(n=1000, 2000, 5000\)). The results of these
experiments are summarized in Table 2.
Table 2: Estimated levels of the model significance test for the non-seasonal error processes.

| n | Process | Fisher test | fitAR | spectralproj | efromovich | kernel | hac |
|---|---|---|---|---|---|---|---|
| 200 | AR1 process | 0.465 | 0.097 | 0.14 | 0.135 | 0.149 | 0.108 |
| | NonMixing | 0.298 | 0.082 | 0.103 | 0.096 | 0.125 | 0.064 |
| | Sysdyn process | 0.385 | 0.105 | 0.118 | 0.124 | 0.162 | 0.12 |
| 1000 | AR1 process | 0.418 | 0.043 | 0.049 | 0.049 | 0.086 | 0.049 |
| | NonMixing | 0.298 | 0.046 | 0.05 | 0.053 | 0.076 | 0.038 |
| | Sysdyn process | 0.393 | 0.073 | 0.077 | 0.079 | 0.074 | 0.078 |
| 2000 | AR1 process | 0.454 | 0.071 | 0.078 | 0.075 | 0.067 | 0.071 |
| | NonMixing | 0.313 | 0.051 | 0.053 | 0.057 | 0.067 | 0.047 |
| | Sysdyn process | 0.355 | 0.063 | 0.064 | 0.066 | 0.069 | 0.073 |
| 5000 | AR1 process | 0.439 | 0.044 | 0.047 | 0.047 | 0.047 | 0.044 |
| | NonMixing | 0.315 | 0.053 | 0.056 | 0.059 | 0.068 | 0.05 |
| | Sysdyn process | 0.381 | 0.058 | 0.061 | 0.057 | 0.064 | 0.071 |
For \(n\) large enough (\(n \geq 1000\)), all methods work well, and the
estimated level is around \(0.05\). However, for small samples
(\(n = 200\)), we observe that the fitAR
and the hac
methods show
better performances than the others. The kernel
method is slightly
less effective. With this method, we must choose the size of the
bootstrap blocks as well as the number of blocks, and the test results
are really sensitive to these parameters. In these simulations, we have
chosen \(100\) blocks with a size of \(n/2\). The results are expected to
improve with a larger number of blocks.
Let us notice that for all methods and for all sample sizes, the estimated level is much better than if no correction is made (usual Fisher tests).
We now study the case of linear regression with seasonal errors. The experiment is exactly the same as before, except that we simulate AR12 or MA12 processes. The results of these experiments are summarized in Table 3.
Table 3: Estimated levels of the model significance test for the seasonal error processes.

| n | Process | Fisher test | fitAR | spectralproj | efromovich | kernel | hac |
|---|---|---|---|---|---|---|---|
| 200 | AR12 process | 0.436 | 0.178 | 0.203 | 0.223 | 0.234 | 0.169 |
| | MA12 process | 0.228 | 0.113 | 0.113 | 0.116 | 0.15 | 0.222 |
| 1000 | AR12 process | 0.468 | 0.068 | 0.183 | 0.181 | 0.124 | 0.179 |
| | MA12 process | 0.209 | 0.064 | 0.066 | 0.069 | 0.063 | 0.18 |
| 2000 | AR12 process | 0.507 | 0.071 | 0.196 | 0.153 | 0.104 | 0.192 |
| | MA12 process | 0.237 | 0.064 | 0.064 | 0.058 | 0.068 | 0.173 |
| 5000 | AR12 process | 0.47 | 0.062 | 0.183 | 0.1 | 0.091 | 0.171 |
| | MA12 process | 0.242 | 0.044 | 0.048 | 0.043 | 0.057 | 0.147 |
We directly see that the case of seasonal processes is more complicated than that of the non-seasonal processes, especially for the AR12 process. For a small sample size, the estimated level is between \(0.17\) and \(0.24\),
which is clearly too large. It is, however, much better than the
estimated level of the usual Fisher test, which is around \(0.45\). The
fitAR
method is the best method here for the AR12 process because for
\(n \geq 1000\), the estimated level is between \(0.06\) and \(0.07\). For
efromovich
and kernel
methods, a level less than \(0.10\) is reached
but for large samples only. The spectralproj
and hac
methods do not
seem to work well for the AR12 process, although they remain much better
than the usual Fisher tests (around \(19\%\) of rejection instead of
\(45\%\)).
The case of the MA12 process seems easier to deal with. For \(n\) large
enough (\(n \geq 1000\)), the estimated level is between \(0.04\) and \(0.07\)
whatever the method, except for hac
(around \(0.15\) for \(n = 5000\)). It
is less effective for a small sample size (\(n=200\)) with an estimated
level around \(0.115\) for fitAR
, spectralproj
and efromovich
methods.
To be complete, we consider the case where the \(\varepsilon_{i}\)’s are i.i.d., to see how the five automatic methods perform in that case. We simulate \(n\) i.i.d. centered random variables according to the formula: \[\varepsilon_{i} = W_{i}^{2} - \frac{5}{4},\] where \(W_{i}\) follows Student’s distribution with \(10\) degrees of freedom. Note that the distribution of the \(\varepsilon_{i}\)’s is not symmetric and has no exponential moments.
Table 4: Estimated levels of the model significance test for the i.i.d. errors.

| n | Process | Fisher test | fitAR | spectralproj | efromovich | kernel | hac |
|---|---|---|---|---|---|---|---|
| 150 | i.i.d. process | 0.053 | 0.068 | 0.078 | 0.061 | 0.124 | 0.063 |
| 300 | i.i.d. process | 0.052 | 0.051 | 0.06 | 0.05 | 0.095 | 0.052 |
| 500 | i.i.d. process | 0.047 | 0.049 | 0.053 | 0.049 | 0.082 | 0.056 |
Except for the kernel
method, the estimated levels are close to \(5 \%\)
for \(n\) large enough (\(n \geq 300\)). It is slightly worse for small
samples, but it remains quite good for the methods fitAR
,
efromovich
, and hac
.
As a general conclusion of this section about numerical experiments and
method comparison, we see that the fitAR
method performs quite well in
a wide variety of situations and should therefore be used as soon as the
user suspects that the error process can be modeled by a stationary
short-memory process.
This dataset comes from a study about fine particle pollution in five Chinese cities. The data are available on the following website https://archive.ics.uci.edu/ml/datasets/PM2.5+Data+of+Five+Chinese+Cities#. Here we are interested in the city of Shanghai. We study the regression of PM2.5 pollution in the Xuhui District on other pollution measurements in neighboring districts and on meteorological variables. The dataset contains hourly observations between January 2010 and December 2015. More precisely, it contains \(52584\) records of \(17\) variables: date, time of measurement, pollution and meteorological variables. More information on these data is available in the paper by (Liang et al. 2016).
We remove the lines that contain NA
observations, and we then extract
the first \(5000\) observations. For simplicity, we will only consider
pollution variables and weather variables. We start the study with the
following \(10\) variables:
PM_Xuhui
: PM2.5 concentration in the Xuhui district (\(ug/m^{3}\))
PM_Jingan
: PM2.5 concentration in the Jing’an district
(\(ug/m^{3}\))
PM_US.Post
: PM2.5 concentration in the U.S. diplomatic post
(\(ug/m^{3}\))
DEWP
: Dew Point (Celsius Degree)
TEMP
: Temperature (Celsius Degree)
HUMI
: Humidity (\(\%\))
PRES
: Pressure (hPa)
Iws
: Cumulated wind speed (\(m/s\))
precipitation
: hourly precipitation (mm)
Iprec
: Cumulated precipitation (mm)
R> shan = read.csv("ShanghaiPM20100101_20151231.csv", header = TRUE,
+              sep = ",")
R> shan = na.omit(shan)
R> shan_complete = shan[1:5000,c(7,8,9,10,11,12,13,15,16,17)]
R> shan_complete[1:5,]
      PM_Jingan PM_US.Post PM_Xuhui DEWP  HUMI PRES TEMP Iws
26305        66         70       71   -5 69.00 1023    0  60
26306        67         76       72   -5 69.00 1023    0  62
26308        73         78       74   -4 74.41 1023    0  65
26309        75         77       77   -4 80.04 1023   -1  68
26310        73         78       80   -4 80.04 1023   -1  70
      precipitation Iprec
26305             0     0
26306             0     0
26308             0     0
26309             0     0
26310             0     0
The aim is to study the concentration of particles in Xuhui District
according to the other variables. We first fit a linear regression with
the lm
function.
R> reglm = lm(shan_complete$PM_Xuhui ~ . , data = shan_complete)
R> summary.lm(reglm)

Call:
lm(formula = shan_complete$PM_Xuhui ~ ., data = shan_complete)

Residuals:
     Min       1Q   Median       3Q      Max 
-132.139   -4.256   -0.195    4.279  176.450 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)   -54.859483  40.975948  -1.339 0.180690    
PM_Jingan       0.596490   0.014024  42.533  < 2e-16 ***
PM_US.Post      0.375636   0.015492  24.246  < 2e-16 ***
DEWP           -1.038941   0.170144  -6.106 1.10e-09 ***
HUMI            0.291713   0.045799   6.369 2.07e-10 ***
PRES            0.025287   0.038915   0.650 0.515852    
TEMP            1.305543   0.168754   7.736 1.23e-14 ***
Iws            -0.007650   0.002027  -3.774 0.000163 ***
precipitation   0.462885   0.132139   3.503 0.000464 ***
Iprec          -0.125456   0.039025  -3.215 0.001314 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.68 on 4990 degrees of freedom
Multiple R-squared:  0.9409,  Adjusted R-squared:  0.9408
F-statistic:  8828 on 9 and 4990 DF,  p-value: < 2.2e-16
The variable PRES
has no significant effect on the PM_Xuhui
variable. We then perform a backward selection procedure, which leads to selecting \(9\) significant variables:
R> shan_lm = shan[1:5000,c(7,8,9,10,11,13,15,16,17)]
R> reglm = lm(shan_lm$PM_Xuhui ~ . , data = shan_lm)
R> summary.lm(reglm)

Call:
lm(formula = shan_lm$PM_Xuhui ~ ., data = shan_lm)

Residuals:
     Min       1Q   Median       3Q      Max 
-132.122   -4.265   -0.168    4.283  176.560 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)   -28.365506   4.077590  -6.956 3.94e-12 ***
PM_Jingan       0.595564   0.013951  42.690  < 2e-16 ***
PM_US.Post      0.376486   0.015436  24.390  < 2e-16 ***
DEWP           -1.029188   0.169471  -6.073 1.35e-09 ***
HUMI            0.285759   0.044870   6.369 2.08e-10 ***
TEMP            1.275880   0.162453   7.854 4.90e-15 ***
Iws            -0.007734   0.002023  -3.824 0.000133 ***
precipitation   0.462137   0.132127   3.498 0.000473 ***
Iprec          -0.127162   0.038934  -3.266 0.001098 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.68 on 4991 degrees of freedom
Multiple R-squared:  0.9409,  Adjusted R-squared:  0.9408
F-statistic:  9933 on 8 and 4991 DF,  p-value: < 2.2e-16
The autocorrelation of the residual process shows that the errors are
clearly not i.i.d., see Figure 5. We thus suspect the
lm
procedure to be unreliable in this context.
The autocorrelation function decreases pretty fast, and the partial
autocorrelation function suggests that fitting an AR process on the
residuals should be an appropriate method in this case. The automatic
fitAR
method of slm
selects an AR process of order \(28\). The
residuals of this AR fitting look like white noise, as shown in
Figure 6.
Consequently, we propose to perform a linear regression with the slm function, using the fitAR method on the complete model.
R> regslm = slm(shan_complete$PM_Xuhui ~ . , data = shan_complete,
+              method_cov_st = "fitAR", model_selec = -1)
R> summary(regslm)

Call:
"slm(formula = myformula, data = data, x = x, y = y)"

Residuals:
     Min       1Q   Median       3Q      Max 
-132.139   -4.256   -0.195    4.279  176.450 

Coefficients:
                Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -54.859483 143.268399  -0.383 0.701783    
PM_Jingan       0.596490   0.028467  20.953  < 2e-16 ***
PM_US.Post      0.375636   0.030869  12.169  < 2e-16 ***
DEWP           -1.038941   0.335909  -3.093 0.001982 ** 
HUMI            0.291713   0.093122   3.133 0.001733 ** 
PRES            0.025287   0.137533   0.184 0.854123    
TEMP            1.305543   0.340999   3.829 0.000129 ***
Iws            -0.007650   0.005698  -1.343 0.179399    
precipitation   0.462885   0.125641   3.684 0.000229 ***
Iprec          -0.125456   0.064652  -1.940 0.052323 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.68
Multiple R-squared: 0.9409
chi2-statistic: 8383 on 9 DF, p-value: < 2.2e-16
Note that the variables globally show larger p-values than with the lm procedure, and more variables have no significant effect than with lm. After performing a backward selection, we obtain the following results:
R> shan_slm = shan[1:5000,c(7,8,9,10,11,13)]
R> regslm = slm(shan_slm$PM_Xuhui ~ . , data = shan_slm,
+              method_cov_st = "fitAR", model_selec = -1)
R> summary(regslm)

Call:
"slm(formula = myformula, data = data, x = x, y = y)"

Residuals:
     Min       1Q   Median       3Q      Max 
-132.263   -4.341   -0.192    4.315  176.501 

Coefficients:
             Estimate Std. Error z value Pr(>|z|)    
(Intercept) -29.44924    8.38036  -3.514 0.000441 ***
PM_Jingan     0.60063    0.02911  20.636  < 2e-16 ***
PM_US.Post    0.37552    0.03172  11.840  < 2e-16 ***
DEWP         -1.05252    0.34131  -3.084 0.002044 ** 
HUMI          0.28890    0.09191   3.143 0.001671 ** 
TEMP          1.30069    0.32435   4.010 6.07e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.71
Multiple R-squared: 0.9406
chi2-statistic: 8247 on 5 DF, p-value: < 2.2e-16
The backward selection with slm
only keeps \(5\) variables.
The authors are grateful to Anne Philippe (Nantes University) and Aymeric Stamm (CNRS - Nantes University) for valuable discussions.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Caron, et al., "Linear Regression with Stationary Errors: the R Package slm", The R Journal, 2021
BibTeX citation
@article{RJ-2021-030,
  author  = {Caron, Emmanuel and Dedecker, Jérôme and Michel, Bertrand},
  title   = {Linear Regression with Stationary Errors: the R Package slm},
  journal = {The R Journal},
  year    = {2021},
  note    = {https://rjournal.github.io/},
  volume  = {13},
  issue   = {1},
  issn    = {2073-4859},
  pages   = {146-163}
}