In this article we present tmvtnorm, an R package implementation for the truncated multivariate normal distribution. We consider random number generation with rejection and Gibbs sampling, computation of marginal densities as well as computation of the mean and covariance of the truncated variables. This contribution brings together latest research in this field and provides useful methods for both scholars and practitioners when working with truncated normal variables.
The package mvtnorm is
the first choice in R for dealing with the Multivariate Normal
Distribution (Genz et al. 2009). However, frequently one or more variates in a
multivariate normal setting \(\mathbb{\mathbf{x}}=(x_1,\ldots,x_m)^T\) are
subject to one-sided or two-sided truncation (\(a_i \le x_i \le b_i\)).
The package tmvtnorm
((Wilhelm and Manjunath 2010)) is an extension of the
mvtnorm package and
provides methods necessary to work with the truncated multivariate
normal case defined by
\(TN(\mathbb{\mu},\mathbb{\Sigma},\mathbb{\mathbf{a}},\mathbb{\mathbf{b}})\).
The probability density function for
\(TN(\mathbb{\mu},\mathbb{\Sigma},\mathbb{\mathbf{a}}, \mathbb{\mathbf{b}})\)
can be expressed as:
\[f(\mathbb{\mathbf{x}},\mathbb{\mu},\mathbb{\Sigma},\mathbb{\mathbf{a}},\mathbb{\mathbf{b}}) =
\frac{
\exp{\left\{ -\frac{1}{2} (\mathbb{\mathbf{x}}-\mathbb{\mu})^T \mathbb{\Sigma}^{-1} (\mathbb{\mathbf{x}}-\mathbb{\mu}) \right\}}
}
{ \int_{\mathbb{\mathbf{a}}}^{\mathbb{\mathbf{b}}}{\exp{\left\{ -\frac{1}{2} (\mathbb{\mathbf{x}}-\mathbb{\mu})^T \mathbb{\Sigma}^{-1} (\mathbb{\mathbf{x}}-\mathbb{\mu}) \right\} } d\mathbb{\mathbf{x}}}
}\] for
\(\mathbb{\mathbf{a}}\le \mathbb{\mathbf{x}}\le \mathbb{\mathbf{b}}\) and
\(0\) otherwise. We will use the following bivariate example with
\(\mathbb{\mu}= (0.5, 0.5)^T\) and covariance matrix \(\mathbb{\Sigma}\)
\[\mathbb{\Sigma}= \left(
\begin{array}{cc}
1 & 0.8 \\
0.8 & 2
\end{array}
\right)\] as well as lower and upper truncation points
\(\mathbb{\mathbf{a}}=(-1, -\infty)^T , \mathbb{\mathbf{b}}= (0.5, 4)^T\), i.e. \(x_1\) is doubly, while \(x_2\) is
singly truncated. The setup in R code is:
> library(tmvtnorm)
> mu <- c(0.5, 0.5)
> sigma <- matrix(c(1, 0.8, 0.8, 2), 2, 2)
> a <- c(-1, -Inf)
> b <- c(0.5, 4)
The following tasks for dealing with the truncated multinormal distribution are described in this article:
generation of random numbers
computation of marginal densities
computation of moments (mean vector, covariance matrix)
estimation of parameters
The first idea to generate variates from a truncated multivariate normal
distribution is to draw from the untruncated distribution using
rmvnorm()
in the
mvtnorm package and to
accept only those samples inside the support region (i.e., rejection
sampling). The different algorithms used to generate samples from the
multivariate normal distribution have been presented for instance in
(Mi et al. 2009) and in (Genz and Bretz 2009).
The following R code can be used to generate \(N=10000\) samples using rejection sampling:
> X <- rtmvnorm(n=10000, mean=mu,
> sigma=sigma, lower=a, upper=b,
> algorithm="rejection")
The parameter algorithm="rejection"
is the default value and may be
omitted.
This approach works well, if the acceptance rate \(\alpha\) is reasonably
high. The method rtmvnorm()
first calculates the acceptance rate
\(\alpha\) and then iteratively generates \(N/\alpha\) draws in order to get
the desired number of remaining samples \(N\). This typically needs just a
few iterations and is fast. Figure 1 shows random samples
obtained by rejection sampling. Accepted samples are marked as black,
discarded samples are shown in gray. In the example, the acceptance rate
is \(\alpha = 0.432\).
> alpha <- pmvnorm(lower=a, upper=b, mean=mu,
> sigma=sigma)
However, if only a small fraction of samples is accepted, as is the case in higher dimensions, the number of draws per sample can be quite substantial. Then rejection sampling becomes very inefficient and finally breaks down.
The second approach to generating random samples is to use the Gibbs
sampler, a Markov Chain Monte Carlo (MCMC) technique. The Gibbs sampler
samples from conditional univariate distributions
\(f(x_i | \mathbb{\mathbf{x}}_{-i})=f(x_i | x_1,\ldots,x_{i-1},x_{i+1},\ldots,x_m)\)
which are themselves truncated univariate normal distributions (see for
instance (Horrace 2005) and (Kotecha and Djuric 1999)). We implemented the
algorithm presented in the paper of (Kotecha and Djuric 1999) in the
tmvtnorm package with
minor improvements. Using algorithm="gibbs"
, users can sample with the
Gibbs sampler.
> X <- rtmvnorm(n=10000, mean=mu,
> sigma=sigma, lower=a, upper=b,
> algorithm="gibbs")
Gibbs sampling produces a Markov chain which finally converges to a
stationary target distribution. Generally, the convergence of MCMC has
to be checked using diagnostic tools (see for instance the
coda package from
(Plummer et al. 2009)). The starting value of the chain can be set as an
optional parameter start.value
. Like all MCMC methods, the first
iterates of the chain will not have the exact target distribution and
are strongly dependent on the start value. To reduce this kind of
problem, the first iterations are considered as a burn-in period which a
user might want to discard. Using burn.in.samples=100
, one can drop
the first 100 samples of a chain as burn-in samples.
The major advantage of Gibbs sampling is that it accepts all proposals
and is therefore not affected by a poor acceptance rate \(\alpha\). As a
major drawback, random samples produced by Gibbs sampling are not
independent, but correlated. The degree of correlation depends on the
covariance matrix \(\mathbb{\Sigma}\) as well on the dimensionality and
should be checked for using autocorrelation or cross-correlation plots
(e.g. acf()
or ccf()
in the
stats package).
> acf(X)
Taking only a nonsuccessive subsequence of the Markov chain, say every
\(k\)-th sample, can greatly reduce the autocorrelation among random
points as shown in the next code example. This technique called
"thinning" is available via the thinning=k
argument:
> X2 <- rtmvnorm(n=10000, mean=mu,
> sigma=sigma, lower=a, upper=b,
> algorithm="gibbs", burn.in.samples=100,
> thinning = 5)
> acf(X2)
While the samples X
generated by Gibbs sampling exhibit
cross-correlation for both variates up to lag 1, this correlation
vanished in X2
.
Table 1 shows a comparison of rejection sampling and Gibbs
sampling in terms of speed. Both algorithms scale with the number of
samples \(N\) and the number of dimensions \(m\), but Gibbs sampling is
independent from the acceptance rate \(\alpha\) and even works for very
small \(\alpha\). On the other hand, the Gibbs sampler scales linearly
with thinning
and requires extensive loops which are slow in an
interpreted language like R. From package version 0.8, we reimplemented
the Gibbs sampler in Fortran. This compiled code is now competitive with
rejection sampling. The deprecated R code is still accessible in the
package via algorithm="gibbsR"
.
Algorithm | Samples | \(m=2\) | \(m=10\) | ||||
---|---|---|---|---|---|---|---|
\(R_s=\prod_{i=1}^{2}{(-1, 1]}\) | \(R_s=\prod_{i=1}^{10}{(-1, 1]}\) | \(R_s=\prod_{i=1}^{10}{(-4, -3]}\) | |||||
\(\alpha=0.56\) | \(\alpha=0.267\) | \(\alpha=5.6 \cdot 10^{-6}\) | |||||
Rejection Sampling | \(N=10000\) | 5600 | 0.19 sec | 2670 | 0.66 sec | 0 | \(\infty\) |
\(N=100000\) | 56000 | 1.89 sec | 26700 | 5.56 sec | 1 | \(\infty\) | |
Gibbs Sampling (R code, no thinning) | \(N=10000\) | 10000 | 0.75 sec | 10000 | 3.47 sec | 10000 | 3.44 sec |
\(N=100000\) | 100000 | 7.18 sec | 100000 | 34.58 sec | 100000 | 34.58 sec | |
Gibbs Sampling (Fortran code, no thinning) | \(N=10000\) | 10000 | 0.03 sec | 10000 | 0.13 sec | 10000 | 0.12 sec |
\(N=100000\) | 100000 | 0.22 sec | 100000 | 1.25 sec | 100000 | 1.21 sec | |
Gibbs Sampling (Fortran code and thinning=10 ) |
\(N=10000\) | 10000 | 0.22 sec | 10000 | 1.22 sec | 10000 | 1.21 sec |
\(N=100000\) | 100000 | 2.19 sec | 100000 | 12.24 sec | 100000 | 12.19 sec |
The joint density function
\(f(\mathbb{\mathbf{x}},\mathbb{\mu},\mathbb{\Sigma},\mathbb{\mathbf{a}},\mathbb{\mathbf{b}})\)
for the truncated variables can be computed using dtmvnorm()
. Figure
2 shows a contour plot of the bivariate
density in our example.
However, more often marginal densities of one or two variables will be
of interest to users. For multinormal variates the marginal
distributions are again normal. This property does not hold for the
truncated case. The marginals of truncated multinormal variates are not
truncated normal in general. An explicit formula for calculating the
one-dimensional marginal density \(f(x_i)\) is given in the paper of
(Cartinhour 1990). We have implemented this algorithm in the method
dtmvnorm.marginal()
which, for convenience, is wrapped in dtmvnorm()
via a margin=i
argument. Figure 3 shows the
Kernel density estimate and the one-dimensional marginal for both \(x_1\)
and \(x_2\) calculated using the dtmvnorm(..., margin=i)
\(i=1,2\) call:
> x <- seq(-1, 0.5, by=0.1)
> fx <- dtmvnorm(x, mu, sigma,
> lower=a, upper=b, margin=1)
The bivariate marginal density \(f(x_q, x_r)\) for \(x_q\) and \(x_r\)
(\(m > 2, q \ne r\)) is implemented based on the works of (Leppard and Tallis 1989)
and (Manjunath and Wilhelm 2009) in method dtmvnorm.marginal2()
which, again for
convenience, is just as good as dtmvnorm(..., margin=c(q,r))
:
> fx <- dtmvnorm(x=c(0.5, 1),
> mu, sigma,
> lower=a, upper=b, margin=c(1,2))
The help page for this function in the package contains an example of how to produce a density contour plot for a trivariate setting similar to the one shown in figure 2.
The computation of the first and second moments (mean vector
\(\mathbb{\mu}^{*}=E[\mathbb{\mathbf{x}}]\) and covariance matrix
\(\mathbb{\Sigma}^{*}\) respectively) is not trivial for the truncated
case, since they are obviously not the same as \(\mathbb{\mu}\) and
\(\mathbb{\Sigma}\) from the parametrization of
\(TN(\mathbb{\mu},\mathbb{\Sigma},\mathbb{\mathbf{a}},\mathbb{\mathbf{b}})\).
The moment-generating function has been given first by (Tallis 1961),
but his formulas treated only the special case of a singly-truncated
distribution with a correlation matrix \(\boldsymbol{R}\). Later works,
especially (Lee 1979) generalized the computation of moments for a
covariance matrix \(\mathbb{\Sigma}\) and gave recurrence relations
between moments, but did not give formulas for the moments of the
double-truncated case. We presented the computation of moments for the
general double-truncated case in a working paper (Manjunath and Wilhelm 2009) and
implemented the algorithm in the method mtmvnorm()
, which can be used
as
> moments <- mtmvnorm(mean=mu, sigma=sigma,
> lower=a, upper=b)
The moment calculation for our example results in \(\mathbb{\mu}^{*} = (-0.122, 0)^T\) and covariance matrix \[\mathbb{\Sigma}^{*} = \left( \begin{array}{cc} 0.165 & 0.131 \\ 0.131 & 1.458 \end{array} \right)\] To compare these results with the sample moments, use
> colMeans(X)
> cov(X)
It can be seen in this example that truncation can significantly reduce the variance and change the covariance between variables.
Unlike our example, in many settings \(\mathbb{\mu}\) and \(\mathbb{\Sigma}\) are unknown and must be estimated from the data. Estimation of \((\mathbb{\mu},\mathbb{\Sigma})\) when truncation points \(\mathbb{\mathbf{a}}\) and \(\mathbb{\mathbf{b}}\) are known can typically be done by either Maximum-Likelihood, Instrumental Variables ((Amemiya 1973), (Amemiya 1974), (Lee 1979), (Lee 1983)) or in a Bayesian context ((Griffiths 2002)). We are planning to include these estimation approaches in the package in future releases.
The package tmvtnorm provides methods for simulating from truncated multinormal distributions (rejection and Gibbs sampling), calculations from marginal densities and also calculation of mean and covariance of the truncated variables. We hope that many useRs will find this package useful when dealing with truncated normal variables.
tmvtnorm, mvtnorm, coda, stats
Bayesian, Distributions, Finance, GraphicalModels
This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Wilhelm & Manjunath, "tmvtnorm: A Package for the Truncated Multivariate Normal Distribution", The R Journal, 2010
BibTeX citation
@article{RJ-2010-005, author = {Wilhelm, Stefan and Manjunath, B. G.}, title = {tmvtnorm: A Package for the Truncated Multivariate Normal Distribution}, journal = {The R Journal}, year = {2010}, note = {https://rjournal.github.io/}, volume = {2}, issue = {1}, issn = {2073-4859}, pages = {25-29} }