This paper describes an R package LeArEst that can be used for
estimating object dimensions from a noisy image. The package is based
on a simple parametric model for data that are drawn from uniform
distribution contaminated by an additive error. Our package is able to
estimate the length of the object of interest on a given straight line
that intersects it, as well as to estimate the object area when it is
elliptically shaped. The input data may be a numerical vector or an
image in JPEG format. In this paper, background statistical models and
methods for the package are summarized, and the algorithms and key
functions implemented are described. Also, examples that demonstrate
its usage are provided.
Availability: LeArEst is available on CRAN.
Image noise may arise by the physical processes of imaging, or it can be caused by the presence of some unwanted structures (e.g., soft tissue captured in X-ray images of bones). Such problems can occur, for example, when the object is observed with a fluorescent microscope (Ruzin et al. 1999), ground penetrating radar, medical equipment (X-ray, ultrasound), etc. With the presence of additive noise, the detection of the object edge as well as determining length or area of the object becomes a non-trivial problem. The well known edge detection methods (Canny 1986; Qiu 2005) generally do not perform well.
Our approach does not use the mentioned edge detection methods, but looks at the problem in a different way. We start with a simple univariate model where the data represent independent realizations of a random variable \(X\), \(X = U + \varepsilon\). In the aforementioned equation \(U\) is supposed to be uniformly distributed over the object image and describes the object image without an error, while \(\varepsilon\) represents measurement error. It is shown that such a simple model can also be very useful in applications itself, not only related to the image analysis. For instance, in Tolić et al. (2017) the sum of uniform and normal distributions is confirmed to be the most representative distribution for modelling transmission loss data.
Different aspects of this model are developed in Benšić and Sabo (2007b), Benšić and Sabo (2007a), Benšić and Sabo (2010), Benšić and Sabo (2016), Sabo and Benšić (2009), and Schneeweiss (2004). The basic one-dimensional model is described in Section 2 together with the results that are used for statistical inference incorporated in the package. Although this model is not universal in all applications, we find it useful in some cases.
With the assumption that the observed object has a circular or elliptical shape, a two-dimensional approach has been developed, dealing with an object area estimation problem (Benšić and Sabo 2007a; Sabo and Benšić 2009). This approach utilizes many border estimations and performs parametric curve fitting on its results.
The package LeArEst (Bensic et al. 2017) uses these methods for length and area estimation of an object captured with noise. It supports numerical inputs, which is useful if a machine that records an object stores numerical data (coordinates of recorded points). However, if an object is captured in a picture file, the package includes a web interface with which one can load a picture, specify a line that intersects the object, adjust the parameters, and perform an edge detection on the drawn line. Another web interface allows the user to draw a rectangle around the object and perform area estimation of the marked object. Description of functions dealing with numerical and graphical estimations and examples of their use are given in Section 3.
The basic model we deal with in this package is an additive error model \[X=U+\varepsilon.\] Here we suppose that the random variable \(U\) is uniformly distributed on the interval \(\left[-a,a \right]\), \(a>0\), i.e., has a density \[\label{duniform} f_{U}(x;a)=\left\{\begin{array}{ll}\frac{1}{2a},& x\in [-a,a].\\ 0,&\mbox{else}\end{array}\right. \tag{1}\] and \(\varepsilon\) is an absolutely continuous random variable describing measurement error. Further, we assume that \(\varepsilon\) is independent of \(U\) with zero mean. Instead of the sample \(U_{1},U_{2},\ldots,U_{n}\) from \(U\), one can only observe the contaminated i.i.d. sample \(X_{1},X_{2},\ldots,X_{n}\) from \(X\). We are to estimate the unknown parameter \(a\).
Our model is the special case of a general additive error model \(X=Y+\varepsilon\), where \(Y\) and \(\varepsilon\) are assumed to be independent continuous random variables, but only \(X\) is observable. Estimating the unknown density \(f_{Y}\) from an i.i.d. sample \(X_{1},X_{2},\ldots,X_{n},X_{1}\sim X\), is known as the deconvolution problem. Usually, the error part \(\varepsilon\) is assumed to have a known density \(f_{\varepsilon}\). Several nonparametric methods have been developed to estimate \(f_{Y}\) (Meister 2009); the most popular and studied is the deconvolution kernel density estimator (Carroll and Hall 1988; Stefanski and Carroll 1990). The packages decon (Wang and Wang 2013) and deamer (Stirnemann et al. 2012) provide functions for estimating density \(f_{Y}\) in a nonparametric way. Two different approaches in estimating the support of a density from a contaminated sample can be seen in (Meister 2006) and (Delaigle and Gijbels 2006). One is based on the deconvolving kernel density estimator (Delaigle and Gijbels 2006) and the other on the moment estimation (Meister 2006). To our knowledge, none of them is implemented in some R package submitted to the CRAN repository.
For our purpose (e.g., estimating borders of some object from a noisy image), we find that the model with \(Y\sim\mathcal{U}[-a,a]\) is useful in some instances. Namely, in many cases we have a relatively high contrast between an object and its background as it is the case in Figure 1. It seems reasonable to assume that the data extracted from the green line in Figure 1 come from a uniform distribution, but contaminated by an additive error.
Let \(f_{\varepsilon}\) and \(F_{\varepsilon}\) be the density and distribution function of the error part \(\varepsilon\). Then the density of \(X=U+\varepsilon\) is \[\label{fX1} f_{X}(x;a)=\int_{-\infty}^{\infty}f_{U}(t)f_{\varepsilon}(x-t)dt=\frac{1}{2a}\left(F_\varepsilon(x+a)-F_\varepsilon(x-a)\right). \tag{2}\] If we suppose that the distribution of \(\varepsilon\) belongs to a scale family, with scale parameter \(\sigma\), then ((2)) may be rewritten as
\[\label{fX2} f_{X}(x;a,\sigma)=\frac{1}{2a}\left(F\left( \frac{x+a}{\sigma}\right) -F\left( \frac{x-a}{\sigma}\right) \right), \tag{3}\] with \(F(x)\) being the standard (\(\sigma=1\) and zero mean) distribution function.
Let \(\mathbf{x}=\left(x_{1},\ldots,x_{n}\right)\) denote the realization of the i.i.d. sample \(\mathbf{X}=\left(X_{1},\ldots,X_{n}\right)\). The likelihood function has the form \[\label{lh} L\left(a,\sigma;\mathbf{x} \right)=\prod_{i=1}^{n}f_{X}(x_{i};a,\sigma)=\frac{1}{(2a)^{n}}\prod_{i=1}^{n}\left(F\left(\frac{x_{i}+a}{\sigma}\right)-F\left(\frac{x_{i}-a}{\sigma}\right)\right), \tag{4}\] and the log-likelihood function is given by \[\label{llh} l(a,\sigma)=-n\log(2a)+\sum_{i=1}^{n}\log \left(F\left(\frac{x_{i}+a}{\sigma}\right)-F\left(\frac{x_{i}-a}{\sigma}\right)\right). \tag{5}\] If the distribution of \(\varepsilon\) is symmetric around zero, then the Fisher information is \[I(a)=\frac{-1}{a^{2}}+\frac{1}{a\sigma^{2}}\int_{0}^{\infty}\frac{\left( f\left(\frac{x+a}{\sigma}\right)+f\left(\frac{x-a}{\sigma}\right)\right)^{2} }{F\left(\frac{x+a}{\sigma}\right)-F\left(\frac{x-a}{\sigma}\right)}\,dx.\] Supposing that parameter \(\sigma\) is known or consistently estimated then, under regularity, we have \[\label{ad} \sqrt{n}\left(\hat{a}_{ML}-a_{0} \right)\stackrel{D}\longrightarrow\mathcal{N}\left(0,\frac{1}{I(a_{0})}\right), \tag{6}\] where \(a_{0}\) is the true value of \(a\).
Some flexibility of our model is achieved by changing the error distribution. For now, three types of error distributions are available in the package. The normal distribution \(\varepsilon\sim\mathcal{N}\left(0,\sigma^{2}\right)\) is sometimes a natural choice. Properties of maximum likelihood (ML) and method of moments (MM) estimators with known \(\sigma^{2}\) are given in Benšić and Sabo (2007b). The model with \(Y\sim\mathcal{U}[0,a]\) is treated in Schneeweiss (2004). Benšić and Sabo (2010) considered the unknown \(\sigma^{2}\) case. The possibility of this (one-dimensional) model in two-dimensional problems is given in Benšić and Sabo (2007a) and Sabo and Benšić (2009). For the sake of robustness Laplace and scaled Student (with \(5\) degrees of freedom) distributions are also incorporated in the package as a choice of the error distribution. Estimating issues with Laplacian error can be seen in Benšić and Sabo (2016), as well as a discussion connected to robustness.
Two procedures for deriving confidence intervals for \(a\) are described in (Hamedović et al. 2017). The first one is based on the asymptotic distribution of ML estimator in ((6)). For a specified \(0<\alpha<1\), an asymptotic \((1-\alpha)100\%\) confidence interval for \(a\) is1
\[\left(\hat{a}_{ML}-\frac{z_{\alpha /2}}{\sqrt{nI(\hat{a}_{ML}) } },\hat{a}_{ML}+\frac{z_{\alpha /2}}{\sqrt{nI(\hat{a}_{ML}) }} \right).\] The second method is based on the likelihood ratio statistic \[\lambda(\mathbf{X})=\frac{\sup\limits_{H_{0}}L\left(a;\mathbf{X} \right)}{\displaystyle\sup_{(0,\infty)}L\left(a;\mathbf{X} \right)}=\frac{L\left(a_{0};\mathbf{X} \right)}{L\left(\hat{a}_{ML};\mathbf{X} \right)}.\] From the asymptotic distribution of the log-likelihood ratio statistic \[-2\log \lambda(\mathbf{X}) \stackrel{D}\longrightarrow \chi_{1}^{2}\] an approximate \((1-\alpha)100\%\) confidence interval for \(a\) is \[\left\lbrace a\vert\, l(\hat{a}_{ML})-l(a)\leq 0.5\chi^{2}_{1}(1-\alpha)\right\rbrace,\] where \(\chi^{2}_{1}(1-\alpha)\) is the \(1-\alpha\) quantile of \(\chi_{1}^{2}\) distribution.
These two approaches can be used to test hypotheses regarding the parameter \(a\). For example, in the case of a two-sided hypotheses \(H_{0}:a=a_{0}\) versus \(H_{1}:a\neq a_{0}\), the critical regions of asymptotic size \(\alpha\) are \[\left\lbrace \mathbf{x}\vert\, \sqrt{nI(a_{0})}|\hat{a}_{ML}-a_{0}|\geq z_{\alpha /2} \right\rbrace, \text{ and}\]
\[\left\lbrace \mathbf{x}\vert\, -2\log \lambda(\mathbf{x})\geq \chi_{1}^{2}(1-\alpha) \right\rbrace,\] respectively. Note that both methods are asymptotically equivalent.
The package LeArEst depends on the following packages that should installed in addition to the LeArEst package: conicfit (Gama and Chernov 2015), jpeg (Urbanek 2014), and opencpu (Ooms 2014). The stable version of the package is available on the Comprehensive R Archive Network repository (CRAN; https://CRAN.R-project.org/) and can be downloaded and installed by issuing the following command at the R console:
> install.packages("LeArEst")
The package is loaded using the following command:
> library(LeArEst)
An overview of the package’s functions is given in Table 1.
Function | Description |
---|---|
lengthest() |
Performs length estimation from a numerical data set. |
lengthtest() |
Performs one-sided and two-sided tests for uniform |
distribution half-length. | |
areaest() |
Performs area estimation of a numerically described object |
in plane. | |
startweb.esttest() |
Opens default web browser and loads a web page for length |
estimation and testing (the object of interest is shown in | |
an image). | |
startweb.area() |
Opens default web browser and loads a web page for area |
estimation of the object shown in an image. |
The function lengthest
computes the length of an interval which is the
domain of a uniform distribution from data contaminated by an additive
error according to the model described in the previous section. The
function’s arguments and results are given in Table
2.
Arguments | Description |
---|---|
x |
Vector of input data. |
error |
Error distribution. |
var |
Error variance. |
var.est |
Method of error variance estimation. |
conf.level |
Confidence level of the confidence interval. Defaults to \(0.95\). |
Results | Description |
radius |
Estimated half-length of the uniform support. |
var.error |
Error variance, estimated or explicitly given by argument var . |
conf.int |
Confidence interval for half-length of the uniform support. |
method |
Method used for computing a confidence interval |
(asymptotic distribution of ML or likelihood ratio statistic). |
In order to perform length estimation, a type of the error distribution
must be chosen through the argument error
with three possibilities:
laplace
(Benšić and Sabo 2016), gauss
(Benšić and Sabo 2007b, 2010), or student
(scaled
Student distribution with 5 degrees of freedom).
The variance of the additive error may or may not be known. If the
variance is known, argument var
should be used and the variance should
be assigned to it. In the case of unknown variance, function lengthest
implements two methods for its estimation: Method of Moments and
Maximum Likelihood. Value MM
of the argument var.est
instructs the
functions to use Method of Moments, while the corresponding value ML
triggers Maximum Likelihood Method. There is the possibility, depending
on the input data, that the Method of Moments estimate of error variance
does not exist. When that is the case, the function stops and outputs
the message instructing the user to use Maximum Likelihood estimator or
to give an explicit variance. It is important to mention that arguments
var
and var.est
may not be used simultaneously.
The last argument, conf.level
, specifies the confidence level of the
confidence interval calculated by the function.
The results of this function are the estimated half-length of uniform distribution (i.e., of an object), estimated or explicitly given error variance, confidence interval for half-length (with regard to the given confidence level) and the statistical method for computing a confidence interval.
Usage example. Let us generate a sample of size \(1000\) from \(X=U+\varepsilon\), where \(U\sim\mathcal{U}\left[-1, 1\right]\) and \(\varepsilon\sim\mathcal{N}\left(0,(0.1)^{2}\right)\):
set.seed(12)
<- runif(1000, -1, 1)
sample_1 <- rnorm(1000, 0, 0.1)
sample_2 <- sample_1 + sample_2 sample
Figure 2 shows density estimation from the
generated data obtained with the R function density
. A half-length
estimation of the uniform support for these data can be done with the
following command:
lengthest(x = sample, error = "gauss", var.est = "MM", conf.level = 0.90)
The most important part of its output is:
$radius
for radius (a) of uniform distr.: 0.9916513
MLE $var.error
for error variance: 0.01279636
MM estimate $method
1] "Asymptotic distribution of LR statistic"
[$conf.int
1] 0.9724316 1.0116479 [
Function lengthtest
performs one-sided and two-sided tests against
hypothesized half-length of the uniform support as it is described in
Section 2. Since the actual calculations
inside this function are based on the ML approach most input arguments
are similar to those in the function lengthest
(see Table
3). Argument null.a
is a positive number
representing hypothesized half-length of the uniform support, while
argument alternative
defines the usual forms of alternatives
(two.sided
, greater
, or less
).
Arguments | Description |
---|---|
x |
Vector of input data. |
error |
Error distribution. |
null.a |
Specified null value being tested. |
alternative |
The form of the alternative hypothesis. |
var |
Error variance. |
var.est |
Method of error variance estimation. |
conf.level |
Confidence level of the confidence interval. Defaults to \(0.95\). |
Results | Description |
p.value |
p-value of the test. |
tstat |
The value of the test statistic. |
radius |
Estimated half-length of the uniform support. |
var.error |
Error variance, estimated or explicitly given by argument var . |
conf.int |
Confidence interval for half-length. |
method |
Method used for computing a confidence interval |
(asymptotic distribution of ML or likelihood ratio statistic). |
Function lengthtest
also performs length estimation, so all values
from its output, except p.value
and the calculated value of the test
statistic (tstat
), are the same as that of the function lengthest
.
Usage example. Generate the data in a similar manner as in the
lengthest
example:
set.seed(12)
<- runif(1000, -1, 1)
sample_1 <- rnorm(1000, 0, 0.1)
sample_2 <- sample_1 + sample_2 sample
To test that the uniform support half-length equals \(1\) against that it
is less than \(1\) the function lengthest
can be used in the following
way:
lengthtest(x = sample, error = "gauss", alternative = "less", var.est = "MM",
null.a = 1, conf.level = 0.95)
Part of the output dealing with a testing procedure is:
$p.value
1] 0.2418929
[$tstat
1] -0.7002265 [
The input for the function areaest
is supposed to be a data set of
points in the plane representing independent realizations of a
two-dimensional random vector
\[X=U+\varepsilon.\]
It is assumed that \(U\) has a uniform distribution on an ellipsoid and
\(\varepsilon\) is a two-dimensional error term independent of \(U\).
Arguments and results of this function are listed in Table
4.
Arguments | Description |
---|---|
data |
Two-column data matrix containing the points that describe the |
observed object. The first column represents the \(x\) coordinate of the point, | |
while the second column represents its \(y\) coordinate. | |
nrSlices |
Number of slices applied for plain data cutting. Defaults to 10. |
error |
Error distribution. |
var |
Error variance. |
var.est |
Method of error variance estimation. |
plot |
Logical parameter that determines whether to plot data set, calculated |
edge points, and the resulting ellipse. Defaults to FALSE . |
|
Results | Description |
area |
Estimated area of the object. |
points |
Set of estimated object’s edge points. |
semiaxes |
Resulting ellipse’s semi-axes. |
The algorithm implemented in the function areaest
is explained in
detail in (Benšić and Sabo 2007a). The main task in area estimation is to
estimate edge points of the uniform support. In order to achieve this,
the original problem is reduced to several corresponding one-dimensional
problems, which can in turn be solved by function lengthest
.
Let us denote the data set with
\(D = \left\{\left(x_i, y_i \right), i = 1, \dots, n \right\}\). The
function areast
transforms this data set in two different ways:
through the \(y\)-axis and through the \(x\)-axis. The algorithm for
transformation through the \(y\)-axis is presented below, while the
transformation through the \(x\)-axis is done analogously.
(Transformation through the \(y\)-axis (Benšić and Sabo 2007a))
Separating through the \(y\)-axis.
Choose an integer \(m<n\) and real numbers \(\eta_1 < \eta_2 < \dots < \eta_m\) such that
\(\eta_{1} \le \min \{ y_i : i=1,\dots n \}, \ \max \{ y_i : i=1,\dots n\} \le \eta_{m}\) and
\(C_k := \{(x_i,y_i) \in D : y_i \in [ \eta_k, \eta_{k+1}] \},\ k=1, \ldots, m-1\) is a nonempty set.
Centering through the \(y\)-axis.
Let us denote \[c_k := \frac{1}{\left|C_k\right| } \sum_{\left(x_i,y_i\right) \in C_k } x_i, \ d_k := \frac{1}{\left|C_k\right|} \sum_{\left(x_i,y_i\right) \in C_k } y_i,\]
\[k=1,\dots,m-1,\] for \(k=1,\dots,m-1\) define \(\overline C_k := \left\{ x_i-c_k : \left( x_i, y_i \right) \in C_k \right\}\).
Using this algorithm the data are transformed in the way that we have
sets \(\overline C_k,\ k=1,\dots,m-1\) that represent centered tiny
strips. Argument nrSlices
corresponds to \(m-1\) and specifies the
number of strips. The lengths of these strips (in \(x\)-direction) can be
estimated using the function lengthest
(the parameters error
, var
,
and var.est
are used in a lengthest
call in the way described
earlier). After doing so, the algorithm needs to be repeated through the
\(x\)-axis. Finally, at this point of the procedure, the data that is a
noisy version of points from the curve is created – it represents
estimated points from the border of the object.
The next task is to choose one of the well-known curve fitting procedures for parameter estimation. Here we are dealing with a nonlinear parameter estimation problem.
Let us suppose that we have an elliptical domain, i.e.,
\[\mathcal{D}(\mathbf{p}) = \left\{ (x,y) \in \mathbb{R}^2 : \frac{(x-p)^2}{\alpha^2} + \frac{(y-q)^2}{\beta^2} \le 1 \right\},\]
\[\mathbf{p} = \left(p, q, \alpha, \beta\right)^T.\]
On the basis of data obtained so far, the vector of unknown parameters
\(p\) needs to be estimated, and by doing so, the optimal ellipse that
fits into our points is to be defined. For this purpose
EllipseDirectFit
function from the conicfit package is used.
This function implemets the algebraic ellipse fit method by
Fitzgibbon-Pilu-Fisher (Fitzgibbon et al. 1999). Having parameters \(p\),
it is a trivial task to calculate the area of the ellipse that
approximates the observed object.
Usage example. Two internal files are provided with the package:
ellipse_3_4_0.1_gauss.txt
and ellipse_3_4_0.1_laplace.txt
. Both of
them represent an ellipsoidal object with center in point \((1,1)\),
half-axes \(1.5\) and \(2\), with added measurement error. In the first
file, the error distribution is a two-dimensional normal with
independent margins and variance 0.01, while in the other it is
Laplacian (\(\lambda=0.1\)) in both directions, again with independent
margins.
In order to use one of these files, the data needs to be read into a data frame:
<- system.file("extdata", "ellipse_3_4_0.1_laplace.txt", package = "LeArEst")
inputfile <- read.table(inputfile) inputdata
Area estimation of the uniform support can be done with the command:
areaest(inputdata, error = "laplace", var.est = "ML", nrSlices = 5, plot = TRUE)
In the previous example, the parameter plot
is set to TRUE
, so the
function plots the given input data (black dots), estimated border
points (red dots), and the resulting ellipse (cyan ellipse); see Figure
3.
The most important parts of the numerical output are:
$area
1] 9.938305
[$semiaxes
1] 2.048028 1.544638 [
In order to apply the described methods to a picture of an object, two web interfaces have been built and embedded into the package.
As far as we know, shiny (Chang et al. 2017) provides the simplest way of building web applications using R. However, limitations of its free version discouraged us from using it, so we decided to use the opencpu package. This package provides a reliable and interoperable HTTP API for data analysis based on R. Basically, it provides an interface between functions in R package and a custom-made web page bundled with the package, using JavaScript and AJAX. Building web interfaces using opencpu is more complex than using shiny, but at the same time, it provides more flexibility in application design. It is assumed that developers are familiar with HTTP protocol, HTML, and the JavaScript language, in order to develop such web applications.
Function startweb.esttest
will be described in this section. This
function takes no arguments and returns no results, its task is to start
a web interface for length estimation and hypothesis testing (Figure
4).
To start the analysis using the web interface the picture in JPG format should be loaded (Load Picture button). Then, a line should be drawn that intersects the object of interest by clicking on two points in the picture - length estimation will be performed on that line. Finally, parameters for a data set preparation should be set.
The Levels of gray
parameter determines how many levels of grey the
algorithm should take into account. It is important to mention that,
although color images can be loaded, they are internally converted to
grey-scales prior to any calculations. Since JPG format supports
\(2^{24}\) different colors, the number of possible colors should be
reduced in order to optimize estimation speed and memory consumption.
Line thickness
specifies how many picture pixels around the drawn line
are taken into account in length estimation. For instance, if
Line thickness
is set to \(3\), the algorithm takes pixels which are
direct left neighbours of the line, pixels on the line itself, and the
ones which are direct right neighbours of the line. By doing so, the
matrix of (length of the line) \(\times\) Line thickness
pixels –
PixelMatrix is obtained.
By doing so, we have obtained the matrix of (length of the line)
\(\times\) Line thickness
pixels – PixelMatrix.
By clicking on the Prepare data button the data set will be prepared for the inference.
The following step deals with data preparation and is a crucial step of
the algorithm. Each pixel of PixelMatrix is mapped to a new matrix of
Box size
\(\times\) Box size
booleans – DotMatrix (note that
Box size
is a parameter). Further, every DotMatrix is filled with
uniformly distributed dots (i.e., TRUE
values) in a way that total
number of dots in each DotMatrix corresponds to the brightness of the
pixel it represents. Then, DotMatrices are tiled up with respect to
the position of corresponding pixel and, by doing so, a new matrix of
(length of the line \(\cdot\) Box size
) \(\times\) (Line thickness
\(\cdot\) Box size
) booleans is obtained – FinalDotMatrix. The last
step is to summarize rows of the FinalDotMatrix to obtain a vector of
(length of the line) \(\cdot\) Box size
integers. The vector’s histogram
is shown on a web interface (Figure 4, at the
bottom) and the vector itself serves as an input to the functions
lengthest
or lengthtest
.
Parameters in the Estimation section of the web interface are
transferred to lengthest
, as well. After the user clicks on Estimate
button, lengthest
is executed, and its results are displayed below the
picture. Additionally, the estimated uniform support is marked red on
the intersecting line.
Estimated length is expressed in width of a pixel and in percentage of whole image’s width as well. As stated in the info box, it is important to use a proportional screen resolution on user’s display, so the pixels on the screen are square-shaped.
The Testing section of this web interface serves for hypothesis
testing. Procedures related to image loading, choosing an intersecting
line, and data preparation are the same as described above. For the
purpose of testing, values for H0
, unit
, and alternative
(greater
, less
, or two-sided
) need to be specified. The part of
the web interface dealing with output of hypothesis testing procedure is
shown in Figure 5.
The function startweb.area
starts a web interface for area estimation
(Figure 6). Again, the first step is to load an image.
To select an object whose surface needs to be evaluated, a rectangle
should be drawn around it. It is done by clicking on its upper-left and
lower-right corners, after which a green rectangle is drawn on the
picture.
Data parameters are similar to ones in the length estimation web
interface, with the exception of number of slices
.
The first step in the area estimation algorithm for this function is to
roughly isolate the object in the selected rectangle. In order to do
that, pixels from the selected rectangle are divided into two clusters
by using the kmeans
function from base-R stats package (the
criterion for clustering is pixel brightness). Further, only pixels from
the ‘object cluster’ are observed and divided into horizontal and
vertical stripes, as described earlier in Algorithm
1. The number of stripes is dictated by the
number of slices
parameter. A length estimation procedure is conducted
on each stripe, obtaining two estimated edge points of the object for
each stripe (red dots in Figure 5). Two parameters in
the Estimation section of the web interface are related to the length
estimation procedure of the stripes.
Finally, an optimal ellipse that fits into edge points is found using
EllipseDirectFit
function from the conicfit package, as well as
in the areaest
function described earlier. The resulting ellipse is
drawn in red in Figure 5. Its area is printed below,
this is measured in pixels and the percentage of the whole image area.
The R package LeArEst provides routines for estimating the support of the random variable \(U,\,U\sim\mathcal{U}[-a,a],\) based on a sample from \(X=U+\varepsilon\). The random variable \(\varepsilon\) represents additive measurement error and is supposed to have either a normal, Laplace, or scaled Student distribution with 5 degrees of freedom. The package also includes functions for estimating either the borders or the area of some object from a noisy image. The package may be useful for this purpose mainly in the case of images with reasonable contrast between the object of interest and background. For greater robustness, we find it convenient to use some error distributions with heavier tails. Sometimes we have different amounts of noise in the tails, so it would be useful to include some asymmetric error distributions as well. These are some features we are going to add in the package in order to improve flexibility of our models.
This work was supported by the Croatian Science Foundation through research grant IP-2016-06-6545. We would like to thank Krunoslav Buljan from Osijek Clinical Hospital Center for providing an image of the carotid artery.
LeArEst, decon, deamer, conicfit, jpeg, opencpu, shiny
ModelDeployment, NumericalMathematics, WebTechnologies
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Benšić, et al., "LeArEst: Length and Area Estimation from Data Measured with Additive Error", The R Journal, 2017
BibTeX citation
@article{RJ-2017-043, author = {Benšić, Mirta and Taler, Petar and Hamedović, Safet and Nyarko, Emmanuel Karlo and Sabo, Kristian}, title = {LeArEst: Length and Area Estimation from Data Measured with Additive Error}, journal = {The R Journal}, year = {2017}, note = {https://rjournal.github.io/}, volume = {9}, issue = {2}, issn = {2073-4859}, pages = {461-473} }