Nonparametric partitioning-based least squares regression is an important tool in empirical work. Common examples include regressions based on splines, wavelets, and piecewise polynomials. This article discusses the main methodological and numerical features of the R software package lspartition, which implements results for partitioning-based least squares (series) regression estimation and inference from Cattaneo and Farrell (2013) and Cattaneo et al. (2020). These results cover the multivariate regression function as well as its derivatives. First, the package provides data-driven methods to choose the number of partition knots optimally, according to integrated mean squared error, yielding optimal point estimation. Second, robust bias correction is implemented to combine this point estimator with valid inference. Third, the package provides estimates and inference for the unknown function both pointwise and uniformly in the conditioning variables. In particular, valid confidence bands are provided. Finally, an extension to two-sample analysis is developed, which can be used in treatment-control comparisons and related problems.
Nonparametric partitioning-based least squares regression estimation is an important method for estimating conditional expectation functions in statistics, economics, and other disciplines. These methods first partition the support of the covariates and then construct a set of local basis functions on top of the partition to approximate the unknown regression function or its derivatives. Empirically popular basis functions include splines, compactly supported wavelets, and piecewise polynomials. For textbook reviews of classical and modern nonparametric regression methodology see, among others, Fan and Gijbels (1996), Györfi et al. (2002), Ruppert et al. (2003), and Harezlak et al. (2018). For a review of partitioning-based approximations in nonparametrics and machine learning see Zhang and Singer (2010) and references therein.
This article gives a detailed discussion of the software package lspartition, available for R, which implements partitioning-based least squares regression estimation and inference. This package offers several features which improve on existing tools, leveraging the recent results of Cattaneo and Farrell (2013) and Cattaneo et al. (2020), and delivering data-driven methods to easily implement partitioning-based estimation and inference, including optimal tuning parameter choices and uniform inference results such as confidence bands. We cover splines, compactly supported wavelets, and piecewise polynomials, in a unified way, encompassing prior methods and routines previously unavailable without manual coding by researchers. Piecewise polynomials generally differ from splines and wavelets in that they do not enforce global smoothness over the partition, but in the special cases of zero-degree bases on a tensor-product partition, the three basis choices (i.e., zero-degree spline, Haar wavelet, and piecewise constant) are equivalent.
The first contribution offered by lspartition is a data-driven choice of the number of partitioning knots that is optimal in an integrated mean squared error (IMSE) sense. A major hurdle to practical implementation of any nonparametric estimator is tuning parameter choice, and by offering several feasible IMSE-optimal methods for splines, compactly supported wavelets, and piecewise polynomials, lspartition provides practitioners with tools to overcome this important implementation issue.
However, tuning parameter choices that are optimal for point estimation yield invalid inference in general, and the IMSE-optimal choice is no exception. The second contribution of lspartition is the inclusion of robust bias correction methods, which allow for inference based on optimal point estimators. lspartition implements the three methods studied by Cattaneo et al. (2020), which are based on novel bias expansions therein. Both the bias and variance quantities are kept in pre-asymptotic form, yielding better bias correction and standard errors robust to conditional heteroskedasticity of unknown form. This style of robust bias correction has been shown to yield improved inference in other nonparametric contexts (Calonico et al. 2018, 2020).
The third main contribution is valid inference, both pointwise and uniformly in the support of the conditioning variables. When robust bias correction is employed, this inference is valid for the IMSE-optimal point estimator, allowing the researcher to combine an optimal partition for point estimation with a "faithful" measure of uncertainty (i.e., one that uses the same nonparametric estimation choices, here captured by the partition). In particular, lspartition delivers valid confidence bands that cover the entire regression function and its derivatives. These data-driven confidence bands are constructed by approximating the distribution of the supremum of the t-statistic process, using either a plug-in Gaussian approximation or a wild bootstrap, as detailed below.
Last but not least, the package also offers a convenient function to implement estimation and inference for linear combinations of regression estimators of different groups, with all the features mentioned above. This function can be used to analyze conditional treatment effects in randomized controlled trials in particular, or for two-sample comparisons more generally. For example, a common question in applications is whether two groups have the same "trend" in a regression function, and this is often answered in a restricted way by testing a single interaction term in a (parametric) linear model. In contrast, lspartition delivers a valid measure of this difference nonparametrically and uniformly over the support of the conditioning variables, greatly increasing its flexibility in applications.
All of these contributions are fully implemented for splines, wavelets, and piecewise polynomials through the following four functions included in the package lspartition:
- lsprobust(). This function implements estimation and inference for partitioning-based least squares regression. It takes the partitioning scheme as given, and constructs point and variance estimators, bias correction, conventional and robust bias-corrected confidence intervals, and simulation-based conventional and robust bias-corrected uniform inference measures (e.g., confidence bands). Three approximation bases are provided: B-splines, Cohen-Daubechies-Vial wavelets, and piecewise polynomials. When the partitioning scheme is not specified, the companion function lspkselect() is used to select a tensor-product partition in a fully data-driven fashion.
- lspkselect(). This function implements data-driven procedures to select the number of knots for partitioning-based least squares regression. It allows for evenly-spaced and quantile-spaced knot placements, and computes the corresponding IMSE-optimal choices. Two selectors are provided: rule of thumb (ROT) and direct plug-in (DPI).
- lsplincom(). This function implements estimation and robust inference procedures for linear combinations of regression estimators of multiple groups based on lsprobust(). Given a user-specified linear combination, it offers all the estimation and inference methods available in the functions lsprobust() and lspkselect().
- lsprobust.plot(). This function builds on ggplot2 (Wickham and Chang 2016), and is used as a wrapper for plotting results. It plots regression function curves, robust bias-corrected confidence intervals, and uniform confidence bands, among other possibilities.

The paper continues as follows. The next section describes the basic setup, including a brief introduction to partitioning-based least squares regression and the empirical example used throughout to illustrate the features of lspartition. The third section discusses data-driven IMSE-optimal selection of the number of knots and gives implementation details. Estimation and inference implementation is covered in the fourth section, including bias correction methods. The last section provides concluding remarks. We defer to Cattaneo et al. (2020) for complete theoretical and technical details; statements below are sometimes specific versions of a general case therein.
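Before turning to the details, the following minimal sketch shows how these four functions fit together in a typical session (options shown are illustrative only; y denotes an outcome vector and x a covariate vector, as in the empirical example below):

# Typical workflow: select knots, estimate with robust bias correction,
# then plot. Unspecified arguments take their package defaults.
k   <- lspkselect(y, x)    # IMSE-optimal number of knots
est <- lsprobust(y, x)     # estimation, bias correction, and inference
lsprobust.plot(est)        # visualization based on ggplot2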
We assume that $\{(y_i, \mathbf{x}_i'): i = 1, \dots, n\}$ is a random sample, where $y_i$ is a scalar outcome and $\mathbf{x}_i \in \mathcal{X} \subseteq \mathbb{R}^d$ is a $d$-vector of covariates, and the object of interest is the unknown regression function $\mu(\mathbf{x}) = \mathbb{E}[y_i \mid \mathbf{x}_i = \mathbf{x}]$ or its derivatives.

Estimation and inference is based on least squares regression of $y_i$ on $\mathbf{p}(\mathbf{x}_i)$, where $\mathbf{p}(\mathbf{x}) = (p_1(\mathbf{x}), \dots, p_K(\mathbf{x}))'$ collects $K$ basis functions built on a partition of the support $\mathcal{X}$ into non-overlapping cells, so that each basis function is supported on only one or a few neighboring cells. Once the basis is constructed, the point estimator of $\mu(\mathbf{x})$ is
$$\hat{\mu}(\mathbf{x}) = \mathbf{p}(\mathbf{x})'\hat{\boldsymbol{\beta}}, \qquad \hat{\boldsymbol{\beta}} = \operatorname*{arg\,min}_{\mathbf{b}\in\mathbb{R}^K} \sum_{i=1}^n \big(y_i - \mathbf{p}(\mathbf{x}_i)'\mathbf{b}\big)^2. \quad (1)$$

The approximation power of such estimators increases with the granularity of the partition: a finer partition (and hence a larger $K$) reduces the misspecification bias at the cost of a larger variance. We denote the number of knots in each dimension of a tensor-product partition by $\kappa$, set through the option nknot for the regression command lsprobust(). For wavelet bases the relevant option is instead J, which indicates the resolution level of the wavelet basis. This gives $\kappa = 2^J$ knots.
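To fix ideas, here is a minimal sketch of the estimator in (1), using a linear B-spline basis on an evenly-spaced partition with simulated data (the data-generating process and tuning choices are illustrative only; lsprobust() automates basis construction, knot selection, and inference):

# Sketch of (1): least squares regression of y on a local spline basis.
library(splines)
set.seed(42)
n <- 500
x <- runif(n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.5)
inner <- seq(min(x), max(x), length.out = 6)[2:5]        # 4 interior knots
P <- bs(x, knots = inner, degree = 1, intercept = TRUE)  # basis p(x_i)
fit <- lm(y ~ P - 1)                                     # beta-hat
mu.hat <- as.vector(P %*% coef(fit))                     # p(x_i)' beta-hat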
We will showcase the main aspects of lspartition using a running empirical example. The package is available in R and can be installed as follows:
> install.packages("lspartition", dependencies = TRUE)
> library(lspartition)
The data we use come from Capital Bikeshare and are available at http://archive.ics.uci.edu/ml/datasets/Bike+Sharing+Dataset/. For the first 19 days of each month of 2011 and 2012 we observe the outcome count, the total number of rentals, and the covariates atemp, the "feels-like" temperature in Celsius, and workingday, a binary indicator for working days (versus weekends and holidays). The data are summarized as follows.
> data <- read.csv("bikesharing.csv", header = TRUE)
> summary(data)
count atemp workingday
Min. : 1.0 Min. :-14.997 Min. :0.0000
1st Qu.: 42.0 1st Qu.: 5.998 1st Qu.:0.0000
Median :145.0 Median : 15.997 Median :1.0000
Mean :191.6 Mean : 15.225 Mean :0.6809
3rd Qu.:284.0 3rd Qu.: 24.999 3rd Qu.:1.0000
Max. :977.0 Max. : 44.001 Max. :1.0000
We will investigate nonparametrically the relationship between temperature and the number of rentals, and compare the two groups defined by the type of day:
> y <- data$count
> x <- data$atemp
> g <- data$workingday
The sample code that follows will use this designation of y, x, and g.
We will now briefly describe the IMSE expansion and its use in tuning parameter selection. To differentiate the original point estimator of (1) from the post-bias-correction estimators introduced below, we will add a subscript and write $\hat{\mu}_0(\mathbf{x})$.

The bias expansion for the partitioning-based estimator takes the form
$$\mathbb{E}[\hat{\mu}_0(\mathbf{x}) \mid \mathbf{X}] - \mu(\mathbf{x}) \approx \mathscr{B}_m(\mathbf{x}), \quad (2)$$
where $\mathscr{B}_m(\mathbf{x})$ is the leading approximation error of the basis, which depends on the $m$-th order derivatives of $\mu$ and on the granularity of the partition.

The conditional variance is straightforward from least squares algebra and takes the familiar sandwich form. With $\hat{\mathbf{Q}} = \frac{1}{n}\sum_{i=1}^n \mathbf{p}(\mathbf{x}_i)\mathbf{p}(\mathbf{x}_i)'$,
$$\mathbb{V}[\hat{\mu}_0(\mathbf{x}) \mid \mathbf{X}] = \frac{1}{n}\,\mathbf{p}(\mathbf{x})'\,\hat{\mathbf{Q}}^{-1}\boldsymbol{\Sigma}\,\hat{\mathbf{Q}}^{-1}\mathbf{p}(\mathbf{x}), \qquad \boldsymbol{\Sigma} = \frac{1}{n}\sum_{i=1}^n \mathbf{p}(\mathbf{x}_i)\mathbf{p}(\mathbf{x}_i)'\,\sigma^2(\mathbf{x}_i), \quad (3)$$
where $\sigma^2(\mathbf{x}) = \mathbb{V}[y_i \mid \mathbf{x}_i = \mathbf{x}]$. Feasible versions replace $\sigma^2(\mathbf{x}_i)$ with weighted squared regression residuals, and the package implements the standard heteroskedasticity-consistent weighting schemes hc0, hc1, hc2, and hc3. See Long and Ervin (2000) for a review in the context of least squares regression.
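To illustrate (3), the following sketch computes the feasible sandwich variance for the toy spline fit above, using the usual HC-type residual weightings (this is a hedged sketch of the construction, not lspartition's internal code):

# Feasible sandwich variance (3) with HC-type residual weighting,
# reusing P, fit, and n from the earlier sketch.
e <- residuals(fit)
h <- hatvalues(fit)                 # leverage values
K <- ncol(P)
w <- e^2 / (1 - h)^2                # hc3; hc0: e^2, hc1: e^2*n/(n-K), hc2: e^2/(1-h)
Q.inv <- solve(crossprod(P) / n)    # inverse of Q-hat
Sigma <- crossprod(P * w, P) / n    # (1/n) sum of w_i p(x_i) p(x_i)'
V <- Q.inv %*% Sigma %*% Q.inv / n
se <- sqrt(rowSums((P %*% V) * P))  # pointwise standard errors at each x_i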
In general, for a weighting function $w(\mathbf{x})$, the IMSE is the conditional mean squared error of $\hat{\mu}_0(\mathbf{x})$ integrated against $w(\mathbf{x})$, and hence combines the integrated variance and the integrated squared bias. Under regularity conditions on the partition and basis used, Cattaneo et al. (2020) derive explicit leading constants in this expansion. lspartition implements IMSE-minimization for the common simple case where the weighting $w(\mathbf{x})$ is the density of the covariates, which yields an IMSE-optimal number of knots per dimension proportional to $n^{1/(2m+d)}$.

To select $\kappa$ in practice, the unknown constants in the IMSE expansion must be estimated. Two popular choices of partitioning schemes are evenly-spaced partitions (ktype="uni"), which sets the knots equally spaced over the support of the data, and quantile-spaced partitions (ktype="qua"), which sets the knots at empirical quantiles so that each cell contains roughly the same number of observations. For both schemes, the package offers rule-of-thumb (imse-rot) and direct plug-in (imse-dpi) choices. We close this section with a brief description of the implementation details and an illustration using real data.
Rule-of-Thumb Choice
The rule-of-thumb choice is based on the special case of the IMSE expansion in which the unknown bias and variance constants are estimated through a preliminary parametric fit, optionally adjusted using normal reference densities (option rotnorm). Though derived under strong assumptions, the resulting choice retains the correct rate of growth, $\kappa_{\mathrm{rot}} \asymp n^{1/(2m+d)}$.

The command lspkselect() implements the rule-of-thumb selection (kselect="imse-rot"). For example, focusing on the subsample of bike rentals during working days (g==1), the selected numbers of knots are reported as follows:
> summary(lspkselect(y, x, kselect = "imse-rot", subset = (g ==
+ 1)))
Call: lspkselect
Sample size (n) = 7412
Basis function (method) = B-spline
Order of basis point estimation (m) = 2
Order of derivative (deriv) = (0)
Order of basis bias correction (m.bc) = 3
Knot placement (ktype) = Uniform
Knot method (kselect) = imse-rot
=======================
IMSE-ROT
k k.bc
=======================
5 9
=======================
In this example, for the point estimator based on an evenly-spaced partition, the rule-of-thumb estimate of the IMSE-optimal number of knots is k = 5, while k.bc = 9 knots are selected for the bias correction discussed below.
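As a rough sanity check on the magnitude (using only the rate $\kappa \asymp n^{1/(2m+d)}$ discussed above; the package's ROT formula also involves estimated constants):

# With m = 2 (linear splines), d = 1, and n = 7412, the rate term alone is
7412^(1 / (2 * 2 + 1))    # = n^(1/5), about 5.94, the same order as k = 5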
Direct Plug-in Choice
Assuming the weighting $w(\mathbf{x})$ is the covariate density, the direct plug-in procedure replaces the unknown bias and variance constants in the IMSE expansion with nonparametric estimates: the variance is estimated as in (3), and the derivatives of $\mu$ entering the leading bias are estimated using a preliminary fit based on a higher-order basis. For splines and piecewise polynomials, the leading approximation error may additionally be projected onto the approximating space (option proj). The following shows the results of the direct plug-in procedure based on the real data:
> summary(lspkselect(y, x, kselect = "imse-dpi", subset = (g ==
+ 1)))
Call: lspkselect
Sample size (n) = 7412
Basis function (method) = B-spline
Order of basis point estimation (m) = 2
Order of derivative (deriv) = (0)
Order of basis bias correction (m.bc) = 3
Knot placement (ktype) = Uniform
Knot method (kselect) = imse-dpi
=======================
IMSE-DPI
k k.bc
=======================
8 10
=======================
The direct plug-in procedure gives more partitioning knots than the rule-of-thumb, leading to a finer partition: k = 8 evenly-spaced knots are selected for point estimation and k.bc = 10 for bias correction. Quantile-spaced knot placement can be requested instead by setting ktype = "qua".
This section reviews and illustrates the estimation and inference procedures implemented. A crucial ingredient is the bias correction that allows for valid inference after tuning parameter selection.
The estimator $\hat{\mu}_0(\mathbf{x})$ based on an IMSE-optimal number of knots has a bias that is first order in the distributional approximation, so conventional inference based on it is invalid; some form of bias correction is required. Our bias correction strategies are based on (2) and (3), where the only unknowns are the higher-order derivatives of $\mu$ entering the leading bias and the conditional variance function, both of which can themselves be estimated. Three methods, studied in Cattaneo et al. (2020), are implemented:

- Higher-order-basis bias correction: bc="bc1".
- Least squares bias correction: bc="bc2".
- Plug-in bias correction: bc="bc3".

The optimal (uncorrected) point estimator ($\hat{\mu}_0$) is reported in all cases; the robust bias-corrected confidence intervals are centered at the bias-corrected estimate, which is why they need not be symmetric around the reported point estimate.
Pointwise inference relies on a Gaussian approximation for the t-statistic formed from the (possibly bias-corrected) point estimator and a consistent standard error. For conventional confidence intervals (without bias correction), validity requires undersmoothing relative to the IMSE-optimal partition; the robust bias-corrected intervals, in contrast, remain valid when the IMSE-optimal number of knots is used.
We now illustrate the pointwise inference features of lsprobust() using the bike rental data, employing the knot selection obtained by the DPI procedure above. Specifically, we set nknot=8 for point estimation. For higher-order-basis bias correction (bc="bc1"), the same number of knots is used to correct the bias by default, while for plug-in bias correction (bc="bc3"), we use 10 knots (bnknot=10) to estimate the higher-order derivatives in the leading bias. One may leave these options unspecified, in which case the command lsprobust() automatically implements knot selection using the command lspkselect().
> est_workday_bc1 <- lsprobust(y, x, neval = 20, bc = "bc1", nknot = 8,
+ subset = (g == 1))
> est_workday_bc3 <- lsprobust(y, x, neval = 20, bc = "bc3", nknot = 8,
+ bnknot = 10, subset = (g == 1))
> summary(est_workday_bc1)
Call: lprobust
Sample size (n) = 7412
Num. covariates (d) = 1
Basis function (method) = B-spline
Order of basis point estimation (m) = 2
Order of derivative (deriv) = (0)
Order of basis bias correction (m.bc) = 3
Smoothness point estimation (smooth) = 0
Smoothness bias correction (bsmooth) = 1
Knot placement (ktype) = Uniform
Knots method (kselect) = User-specified
Uniform inference method (uni.method) = NA
Num. knots point estimation (nknot) = (8)
Num. knots bias correction (bnknot) = (8)
=================================================================
Eval Point Std. Robust B.C.
X1 n Est. Error [ 95% C.I. ]
=================================================================
1 -2.998 7412 90.667 5.316 [77.610 , 96.347]
2 -0.002 7412 110.509 3.909 [100.736 , 119.604]
3 1.998 7412 123.937 3.580 [115.071 , 133.583]
4 3.998 7412 137.364 5.183 [129.929 , 144.504]
5 5.998 7412 148.437 3.627 [139.724 , 158.148]
-----------------------------------------------------------------
6 7.001 7412 153.989 3.571 [144.494 , 164.327]
7 11.001 7412 173.306 5.690 [164.945 , 181.894]
8 11.997 7412 174.599 4.600 [167.492 , 186.141]
9 13.997 7412 177.194 3.771 [171.250 , 190.769]
10 15.997 7412 179.789 5.300 [173.561 , 189.839]
-----------------------------------------------------------------
11 17.000 7412 182.743 5.708 [172.595 , 189.229]
12 18.003 7412 189.044 4.662 [172.267 , 191.494]
13 19.000 7412 195.303 4.070 [174.665 , 196.009]
14 22.003 7412 214.165 5.899 [201.197 , 220.363]
15 24.003 7412 231.911 5.770 [228.211 , 248.431]
-----------------------------------------------------------------
16 24.999 7412 243.335 4.760 [239.920 , 262.104]
17 26.002 7412 254.833 4.486 [251.063 , 273.840]
18 28.002 7412 277.755 6.284 [270.701 , 291.816]
19 30.002 7412 298.199 7.278 [280.463 , 309.527]
20 32.002 7412 313.696 6.596 [289.109 , 324.772]
-----------------------------------------------------------------
=================================================================
The above table summarizes the results for pointwise estimation and inference, including point estimates, conventional standard errors, and robust confidence intervals based on higher-order-basis bias correction. Next, we employ the companion plotting command lsprobust.plot() to visualize the results:
> lsprobust.plot(est_workday_bc1, xlabel = "Temperature", ylabel = "Number of Rentals",
+ legendGroups = "Working Days") + theme(text = element_text(size = 17),
+ legend.position = c(0.15, 0.9))
> ggsave("output/pointwise1.pdf", width = 6.8, height = 5.5)
> lsprobust.plot(est_workday_bc3, xlabel = "Temperature", ylabel = "Number of Rentals") +
+ theme(text = element_text(size = 17), legend.position = "none")
> ggsave("output/pointwise2.pdf", width = 6.8, height = 5.5)
Figure 1: Relationship between the number of rentals (y-axis) and temperature (x-axis) during working days. The solid curves are the point estimates, and the shaded regions are robust confidence intervals. Panel (a) shows the results based on higher-order-basis bias correction, and panel (b) shows the results based on plug-in bias correction. As the temperature increases, so does the number of rentals, and lspartition gives a valid visualization of this trend.
The result is displayed in Figure 1. As the temperature gets higher, the number of rentals increases as expected. Both panels show the same point estimator; they differ only in the robust confidence intervals, which are based on the two different bias correction strategies.
To obtain uniform inference (over the support of the conditioning variables, rather than at a single evaluation point), the distribution of the whole t-statistic process must be approximated. The Gaussian stochastic process that approximates this t-statistic process is not feasible in practice because it depends on unknown population quantities, so lspartition offers two feasible approximations:

- Plug-in method: uni.method="pl".
- Wild bootstrap method: uni.method="wb".

Importantly, these strong approximations apply to the whole process, not merely to its supremum, and hence can be used for a variety of uniform inference procedures. lsprobust() will output the following quantities for uniform analyses upon setting uni.out=TRUE:

- t.num.pl, t.num.wb1, t.num.wb2. The numerators of the approximation processes, excluding the "simulated components", evaluated at a set of pre-specified grid points. t.num.pl is the numerator matrix for the plug-in method; t.num.wb1 and t.num.wb2 are the corresponding matrices for the wild bootstrap method.
- t.denom. The denominator of the approximation processes, i.e., the standard errors evaluated at the grid points.
- res. Residuals from the specified bias-corrected regression (needed for the bootstrap-based approximation).

For example, the following command requests the necessary quantities for uniform inference based on the plug-in method:
> est_workday_bc1 <- lsprobust(y, x, bc = "bc1", nknot = 4, uni.method = "pl",
+ uni.ngrid = 100, uni.out = T, subset = (g == 1))
> round(est_workday_bc1$uni.output$t.num.pl[1:5, ], 3)
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 30.549 -4.923 2.311 -1.470 0.779 -0.451 0.121
[2,] 27.104 -3.553 1.746 -1.162 0.620 -0.354 0.090
[3,] 23.856 -2.285 1.236 -0.880 0.474 -0.266 0.062
[4,] 20.803 -1.117 0.780 -0.624 0.341 -0.185 0.037
[5,] 17.946 -0.052 0.379 -0.395 0.221 -0.113 0.014
We list the first five rows of the numerator matrix above. Since nknot=4, the higher-order-basis bias correction is equivalent to quadratic spline fitting. Thus the numerator matrix has 7 columns, one for each function in the quadratic spline basis (4 interior knots plus the spline order of 3).
As a special application, these results can be used to construct uniform confidence bands, which build on the suprema of the feasible approximation processes. When the option band=T is specified, lsprobust() computes the critical value needed to construct confidence bands. Specifically, it generates many simulated realizations of the feasible process and takes the desired empirical quantile of its supremum:
> est_workday_bc1 <- lsprobust(y, x, neval = 20, bc = "bc1", uni.method = "pl",
+ nknot = 8, subset = (g == 1), band = T)
> est_workday_bc1$sup.cval
95%
2.993436
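The simulation behind this critical value can be sketched directly from the uniform-inference output described above (assuming a fit obtained with uni.out = TRUE and uni.method = "pl", as in the earlier example; the exact scaling inside the package may differ):

# Plug-in sup-statistic simulation: draw standard normal vectors, form the
# approximating t-statistic process on the grid, and take the 95% quantile
# of its supremum.
num   <- est_workday_bc1$uni.output$t.num.pl   # grid points x basis functions
denom <- est_workday_bc1$uni.output$t.denom    # standard errors on the grid
sup.stat <- replicate(2000, {
  z <- rnorm(ncol(num))
  max(abs(num %*% z / denom))
})
quantile(sup.stat, 0.95)   # should be close to est_workday_bc1$sup.cval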
Once the critical value is available, the command lsprobust.plot()
is
able to visualize confidence bands:
> lsprobust.plot(est_workday_bc1, CS = "all", xlabel = "Temperature",
+ ylabel = "Number of Rentals", legendGroups = "Working Days") +
+ theme(text = element_text(size = 17), legend.position = c(0.15, 0.9))
> ggsave("output/uniform1.pdf", width = 6.8, height = 5.5)
The result is displayed in Figure 2. Since we set
CS="all"
, the command simultaneously plots pointwise confidence
intervals (error bars) and a uniform confidence band (shaded region).
It is also possible to specify other bias correction approaches or uniform methods:
> est_workday_bc3 <- lsprobust(y, x, neval = 20, bc = "bc3", nknot = 8,
+ bnknot = 10, uni.method = "wb", subset = (g == 1), band = T)
> est_workday_bc3$sup.cval
95%
3.009244
> lsprobust.plot(est_workday_bc3, CS = "all", xlabel = "Temperature",
+ ylabel = "Number of Rentals", legendGroups = "Working Days") +
+ theme(text = element_text(size = 17), legend.position = c(0.15, 0.9))
> ggsave("output/uniform2.pdf", width = 6.8, height = 5.5)
The result is displayed in Figure 3. In this example, the critical values based on the two methods are quite close, but in general their difference can be more pronounced in finite samples. See Cattaneo et al. (2020) for simulation evidence.
The package lspartition also includes the function lsplincom(), which implements estimation and inference for a linear combination of regression functions of different subgroups. To be concrete, consider a randomized controlled trial with treatment and control groups, and suppose interest lies in the difference between the two group-specific conditional mean functions, or more generally in a linear combination $\theta(\mathbf{x}) = \sum_g r_g \mu_g(\mathbf{x})$ with user-chosen weights $r_g$.

To implement estimation and inference for $\theta(\mathbf{x})$, lsplincom() first calls lsprobust() to obtain a point estimate of each group-specific regression function and then forms the requested linear combination. The standard error of the combination follows from the group-wise standard errors, since the subsamples are independent.

lsplincom() also allows users to construct confidence bands for $\theta(\mathbf{x})$. To do so, it requests lsprobust() to output the numerators (t.num.pl for "plug-in", or t.num.wb1 and t.num.wb2 for "bootstrap") and denominators (t.denom) of the feasible approximation processes for each group. Given these processes, inference is implemented by simulating the corresponding linear combination of the group-wise processes and taking quantiles of its supremum, exactly as in the one-sample case.
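Because the groups are independent subsamples, the pointwise variance of the combination is simply the weighted sum of the group variances. A toy illustration (se0 and se1 below are hypothetical group-wise standard errors at three common evaluation points, not package output):

# Standard error of R[1]*mu0(x) + R[2]*mu1(x) under independent subsamples.
R   <- c(-1, 1)            # weight -1 on group 0, +1 on group 1
se0 <- c(5.2, 4.8, 6.1)    # hypothetical group-0 standard errors
se1 <- c(6.1, 5.5, 7.0)    # hypothetical group-1 standard errors
se.diff <- sqrt(R[1]^2 * se0^2 + R[2]^2 * se1^2)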
As an illustration, we compare the number of rentals during working days
and other time periods (weekends and holidays) based on linear splines
and plug-in bias correction. We first estimate the conditional mean function for each group using the command lsprobust():
> est_workday <- lsprobust(y, x, neval = 20, bc = "bc3", nknot = 8,
+ subset = (g == 1))
> est_nworkday <- lsprobust(y, x, neval = 20, bc = "bc3", nknot = 8,
+ subset = (g == 0))
> lsprobust.plot(est_workday, est_nworkday, legendGroups = c("Working Days",
+ "Nonworking Days"), xlabel = "Temperature", ylabel = "Number of Rentals",
+ lty = c(1, 2)) + theme(text = element_text(size = 17), legend.position = c(0.2,
+ 0.85))
> ggsave("output/diff1.pdf", width = 6.8, height = 5.5)
The pointwise results for each group are displayed in Figure 4. The shaded regions represent confidence intervals. Clearly, when the temperature is low, the two regions are well separated, implying that people may rent bikes more during working days than on weekends or holidays when the weather is cold.
Next, we employ the command lsplincom() to formally test this result. We specify R = c(-1, 1), denoting that the estimate for the nonworking-day group (workingday==0) receives weight -1 and that for the working-day group (workingday==1) receives weight 1, so the estimand is the difference between the two conditional mean functions.
> diff <- lsplincom(y, x, data$workingday, R = c(-1, 1), band = T,
+ cb.method = "pl")
> summary(diff)
Call: lprobust
Sample size (n) = 10886
Num. covariates (d) = 1
Num. groups (G) = 2
Basis function (method) = B-spline
Order of basis point estimation (m) = 2
Order of derivative (deriv) = (0)
Order of basis bias correction (m.bc) = 3
Smoothness point estimation (smooth) = 0
Smoothness bias correction (bsmooth) = 1
Knot placement (ktype) = Uniform
Knots method (kselect) = imse-dpi
Confidence band method (cb.method) = Plug-in
=========================================================
Eval Point Std. Robust B.C.
X1 Est. Error [ 95% C.I. ]
=========================================================
1 -2.998 32.170 6.077 [24.120 , 47.837]
2 -0.002 49.661 5.552 [37.497 , 61.394]
3 1.998 39.749 4.553 [30.882 , 51.186]
4 3.998 29.838 6.463 [17.013 , 42.425]
5 5.998 17.571 7.049 [3.137 , 30.514]
---------------------------------------------------------
6 7.001 16.300 6.121 [4.717 , 29.559]
7 9.997 12.569 7.733 [-4.275 , 26.973]
8 11.997 3.039 8.339 [-12.379 , 19.761]
9 13.000 1.653 7.540 [-9.502 , 21.073]
10 15.000 3.060 6.664 [-13.960 , 14.078]
---------------------------------------------------------
11 17.000 6.118 8.836 [-6.110 , 27.954]
12 18.003 11.823 9.513 [-2.996 , 33.270]
13 19.000 12.311 9.746 [-23.007 , 15.243]
14 22.003 -17.533 8.520 [-20.891 , 15.791]
15 24.003 -32.221 10.024 [-49.905 , -11.277]
---------------------------------------------------------
16 24.999 -36.962 11.016 [-67.843 , -25.825]
17 26.002 -31.760 9.171 [-37.713 , -1.062]
18 28.002 -21.347 8.789 [-46.161 , -9.332]
19 30.002 -13.412 11.053 [-34.039 , 8.122]
20 32.002 -15.438 11.606 [-44.170 , 1.813]
---------------------------------------------------------
=========================================================
The pointwise results are summarized in the above table. Clearly, when the temperature is low, the point estimate of the rental difference is significantly positive, since the robust confidence intervals do not cover zero. Next, we use lsprobust.plot() to plot the point estimates, confidence intervals, and uniform band simultaneously:
> lsprobust.plot(diff, CS = "all", xlabel = "Temperature", ylabel = "Number of Rentals",
+ legendGroups = "Difference between Working and Other Days") +
+ theme(text = element_text(size = 17), legend.position = c(0.36, 0.2))
> ggsave("output/diff2.pdf", width = 6.8, height = 5.5)
In addition, some basic options for the command lsprobust()
may be
passed on to the command lsplincom()
. For example, the following code
generates a smoother fit of the rental difference by setting m=3
:
> diff <- lsplincom(y, x, data$workingday, R = c(-1, 1), band = T,
+ cb.method = "pl", m = 3)
> lsprobust.plot(diff, CS = "all", xlabel = "Temperature", ylabel = "Number of Rentals") +
+ theme(text = element_text(size = 17), legend.position = "none")
> ggsave("output/diff3.pdf", width = 6.8, height = 5.5)
Figure 5: The y-axis plots the difference in the number of rentals, and the x-axis plots the temperature. The solid curves show the point estimates, error bars show the robust confidence intervals, and the shaded regions show the robust confidence bands. Results in panel (a) are based on a linear basis, and those in panel (b) are based on a quadratic basis. The uniformly valid confidence bands used here provide an assessment of the difference between the groups overall, compared to the pointwise results in Figure 4.
The results are shown in Figure 5. The confidence band for the difference is constructed based on the plug-in distributional approximation computed previously. It leads to an even stronger conclusion: the difference as a function of temperature is significantly positive uniformly over a range of low temperatures, since the confidence band lies above zero throughout that region.
We gave an introduction to the software package lspartition, which offers estimation and robust inference procedures (both pointwise and uniform) for partitioning-based least squares regression. In particular, splines, wavelets, and piecewise polynomials are implemented. The main underlying methodologies were illustrated empirically using real data. Finally, installation details, scripts replicating the numerical results reported herein, links to software repositories, and other companion information can be found on the package's website.