shinybrms: Fitting Bayesian Regression Models Using a Graphical User Interface for the R Package brms

Frank Weber; Katja Ickstadt; Änne Glass

1 Introduction

The relevance of regression models in applied research has already been well pointed out, for example, by Karabatsos (2015):

“Regression modeling is ubiquitous in empirical areas of scientific research. This is because most research questions can be asked in terms of how a dependent variable changes as a function of one or more covariates (predictors).”

Conducting regression analyses in a Bayesian framework has a lot of advantages. Introductory texts on Bayesian statistics (in general) are, e.g., McElreath (2020), Albert and Hu (2019), Reich and Ghosh (2019), StataCorp (2019a), Gelman et al. (2020a), and Johnson et al. (2022). For readers with little background in Bayesian statistics, we recommend reading one of these textbooks first. A more detailed introduction may be found, e.g., in Gelman et al. (2014). In particular, Bayesian statistics has the following advantages (for further advantages, see, e.g., Gelman et al. 2014; StataCorp 2019a):

Bayesian methods allow for incorporation of prior knowledge. Generally, inclusion of prior knowledge is desirable: The flat prior implied by the frequentist maximum likelihood (ML) method may lead to nonsensical inferences (Gelman et al. 2017). Even if the inclusion of informative prior knowledge is not desired, weakly informative priors have the advantage (compared to so-called “noninformative” priors¹) to downweight unreasonable parameter values and to introduce a certain regularization or penalization, helping against overfitting (Gelman 2006; Gelman et al. 2008, 2014, 2017). Hereafter, we follow conventional notation in Bayesian statistics and denote the prior for the parameter vector \(\boldsymbol{\theta}\) by \(p(\boldsymbol{\theta})\).
Similarly, the prior distribution may be used to impose parameter constraints in an easy and natural way. There is no need for ad-hoc solutions to cut off parameter estimates. For example, many frequentist between-study variance estimators in the random-effects meta-analysis model are cut off at zero.
It is usually possible to infer the posterior exactly (apart from minor approximations such as those arising from the Monte Carlo error). In that case, Bayesian statistics does not need to resort to large-sample approximations such as the asymptotic normal distribution of the ML estimator often used in frequentist statistics.
For most practical cases, Markov chain Monte Carlo (MCMC) sampling (see section 2.1) constitutes a generic Bayesian inference method. In frequentist statistics, generic methods such as the asymptotic normal distribution of the ML estimator can be unsatisfactory, e.g., for small sample sizes. This is why in frequentist statistics, different inferential methods have often evolved for the same task or model. This complicates frequentist analyses for users, especially for those with little background in statistics.
The quantities derived from the posterior have a more intuitive interpretation than their frequentist counterparts which are based on the sampling distribution of the estimator. In particular, Bayesian posterior intervals (credible intervals, CrIs) have the interpretation that is often incorrectly attributed to frequentist confidence intervals (CIs) (McElreath 2020) and posterior tail-area probabilities have the interpretation that is often incorrectly attributed to frequentist \(p\)-values.
In Bayesian statistics, uncertainty in nuisance parameters is easily—and naturally—taken into account by integrating them out from the posterior: \[p(\boldsymbol{\psi} | \boldsymbol{\mathcal{D}}) = \int p(\boldsymbol{\psi}, \boldsymbol{\phi} | \boldsymbol{\mathcal{D}}) \; \text{d}\boldsymbol{\phi} \label{eqn:nuisance} \tag{1}\] with \(\boldsymbol{\mathcal{D}}\) denoting the data, \(\boldsymbol{\psi}\) the parameter vector of interest, and \(\boldsymbol{\phi}\) the nuisance parameter vector (so that \((\boldsymbol{\psi}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{\phi}^{\scriptscriptstyle \mathsf{T}})^{\scriptscriptstyle \mathsf{T}}= \boldsymbol{\theta}\))². Taking uncertainty in nuisance parameters into account helps against overfitting (like the penalization mentioned in enumeration point number 1 above), but in general, this is not that easy in frequentist statistics, as can be seen in random-effects meta-analyses (Weber et al. 2021).
When combined with probabilistic programming (as done, e.g., by the various sampling methods introduced in section 2), a Bayesian analysis naturally propagates the posterior uncertainty into derived quantities (Gelman et al. 2020a).
Often, frequentist analyses result in the typical null-hypothesis significance testing which is being criticized to an increasing degree (Amrhein and Greenland 2018; Amrhein et al. 2019; McShane et al. 2019). Null-hypothesis significance testing is especially problematic for null hypotheses consisting of only a point in parameter space. Of course, Bayesian analyses may also result in null-hypothesis significance testing or similar hypothesis-testing procedures, but in our experience, this is not as common as in frequentist analyses. We designed our software presented here in a way that does not encourage null-hypothesis significance testing.
Posterior predictive checks (PPCs)—which should be part of a Bayesian workflow (Gelman et al. 2020b)—are an easy and intuitive way of performing model diagnostics in a Bayesian framework, even though the choice and interpretation of the PPCs require some experience (Gelman et al. 2020b). In a frequentist framework, model diagnostics are often not that easy to perform, at least if the uncertainty from parameter estimation should be taken into account.

These advantages will be illustrated in the context of the example from section 5, by comparing our Bayesian analysis to a frequentist one (see section “Frequentist analysis of the example” in the online Supplement file Supplement_sections.pdf).

Despite the aforementioned advantages, Bayesian methods—and Bayesian regression models (BRMs) in particular—are still not as common as their frequentist counterparts. In 2005, Woodward (2005) supposed one reason to be the lack of a “good user interface” which would allow applied researchers to fit BRMs as conveniently as other statistical methods for which a graphical user interface (GUI) already exists. In the meantime, several GUIs have emerged (see section 3), but to our knowledge, until the first release of our R package shinybrms (Weber 2022), there was no GUI which used Stan (Carpenter et al. 2017; Stan Development Team 2022c) for inferring the posterior in BRMs. Stan has several advantages compared to other methods for inferring the posterior. In particular, it is highly flexible with respect to modeling choices and very efficient. Details will be given in section 2.

Our shinybrms package is noncommercial and available at the Comprehensive R Archive Network (CRAN). While shinybrms’s frontend is a shiny (Chang et al. 2022) app, shinybrms’s backend completely relies on brms (Bürkner 2017, 2018) which itself relies on Stan. Both of brms’s backends (i.e., interfaces to Stan), namely rstan (Stan Development Team 2022a) and cmdstanr (Gabry and Češnovar 2022), are supported by shinybrms. For the inspection of the Stan output, the shinystan (Gabry 2022) app may be launched from within shinybrms.

To explain the particular advantages of Stan in detail, we have to take a closer look at different ways for inferring the posterior. This is the purpose of section 2. In section 3, we summarize existing GUIs for BRMs. That section is partly influenced by Ramírez-Hassan and Graciano-Londoño (2021). In section 4, we present the features of shinybrms. The usage of the shinybrms app is illustrated by the help of a real-world example in section 5. Finally, we discuss our work in section 6.

2 Algorithms for inferring the posterior

As mentioned above, in Bayesian statistics, uncertainty arising from the estimation of nuisance parameters is taken into account by integrating them out from the posterior. This is not the only integration occurring in posterior inference: Basically every quantity derived from the posterior is somehow connected to an integration over the posterior. However, it is the integration which also causes a lot of complications. While it is most desirable to perform posterior inference by exact calculation of the desired integrals (using analytic expressions), this approach is often infeasible and even if it is feasible, it has the downside of being not as flexible as other approaches since it needs to be tailored to the statistical model at hand. Numerical integration (e.g., by quadrature) may seem like a remedy, but is often only feasible up to a limited dimensionality of the parameter space. Depending on the algorithm, numerical integration may also introduce tuning quantities, hindering its “out-of-the-box” usage. Integration by simple Monte Carlo (MC) sampling may seem like an alternative, but this is only possible for distributions one may directly sample from (e.g., a Gaussian distribution).

Markov chain Monte Carlo

With the advent of Markov chain Monte Carlo (MCMC) methods, Bayesian inference has changed a lot (Woodward 2005; Lunn et al. 2009). The first MCMC algorithm was the Metropolis algorithm (Metropolis et al. 1953) which starts from an initial point in parameter space and iteratively samples a proposal³ from a symmetric⁴ jumping distribution and accepts the proposal with a certain acceptance probability which depends on the ratio of the target (here, the posterior) density at the current position and at the proposal. The Metropolis-Hastings (MH) algorithm (Metropolis et al. 1953; Hastings 1970) generalizes the Metropolis algorithm to asymmetric jumping distributions. Gibbs sampling (Geman and Geman 1984; Gelfand and Smith 1990) consists of alternately sampling from the full conditional posterior distributions and is a special MH algorithm in which the proposal is always accepted (Gelman et al. 2014). Combinations of the aforementioned algorithms are also widely used, e.g., MH-within-Gibbs. All MCMC algorithms (including those mentioned hereafter) require a careful examination of the convergence of the Markov chains. The MCMC diagnostics used for this purpose in shinybrms are outlined in section 4.4.2.

Hamiltonian Monte Carlo (HMC) (initial work and major contributions by Duane et al. 1987; Neal 1993, 2011; MacKay 2003) is a special MCMC algorithm which is often more efficient than other MCMC algorithms, especially in case of a high-dimensional posterior distribution and correlated parameters (Hoffman and Gelman 2014; Betancourt 2018). The efficiency of HMC is due to the fact that it takes advantage of the gradient of the (log) posterior density (Stan Development Team 2022d), making it a combination of stochastic and deterministic procedures (which explains why HMC is also known as hybrid Monte Carlo) (Gelman et al. 2014). HMC provides helpful diagnostics, such as divergent transitions which can (but must not necessarily) indicate areas of the posterior which are hard to explore by the HMC sampler (Betancourt 2018; Gabry et al. 2019). Compared to Gibbs sampling, HMC also has the advantage that nonconjugate priors may be used easily. For the original HMC algorithm, three tuning quantities need to be specified by hand in advance: the mass matrix \(\boldsymbol{M}\) (which is the covariance matrix of the auxiliary momentum vector), the number \(L\) of leapfrog steps, and the size \(\epsilon\) of the leapfrog steps (Gelman et al. 2014; Stan Development Team 2022d).

Because of the fixed choice of \(L\), the original HMC algorithm is a static HMC algorithm (Betancourt 2018). In contrast, the no-U-turn sampler (NUTS) (Hoffman and Gelman 2014) is a dynamic HMC algorithm since it automatically chooses a (possibly) new value of \(L\) in each iteration of each Markov chain. Hoffman and Gelman (2014) also proposed a new dual averaging technique for determining \(\epsilon\) automatically, too. Apart from these automations, the NUTS has the advantage that in terms of efficiency, it was shown to perform as well as—or even better than—a well-tuned static HMC algorithm (Hoffman and Gelman 2014). A modified (Betancourt 2018) NUTS is implemented in Stan. Stan’s NUTS also includes an automatic adaptation of the mass matrix \(\boldsymbol{M}\) during the warmup phase (Stan Development Team 2022d). A complete presentation of Stan’s NUTS is out of the scope of this article. A good starting point for a detailed description is Stan Development Team (2022d) as well as Betancourt (2018). Note that Stan also includes other algorithms for inferring or approximating the posterior. In this paper however, we only refer to Stan’s NUTS when referring to Stan.

3 Existing GUIs

Table 1 summarizes existing GUIs for BRMs. Details are provided in Supplement section “Existing GUIs”. Table 1 makes it clear that none of the existing GUIs relies entirely on Stan or the NUTS.

Table 1: Existing GUIs for BRMs. Algorithmic details for the GUIs may be found in Supplement section “Existing GUIs”. “BB” stands for “Bayesian bootstrap”. Here, the term “NUTS” includes Stan’s NUTS. The column “Algorithm choice” specifies if given a model, the user may choose an algorithm (at least for some types of models). Notes: (i) The TEET package is noncommercial, but MATLAB is commercial; (ii) JASP uses (Stan’s) NUTS for Bayesian meta-analyses and mixed BRMs; (iii) For linear regression models, BEsmarter offers a choice between MCMC and the Bayesian bootstrap.
		Algorithm (for inferring the posterior)
		Non-MCMC				MCMC
GUI name	Commercial	Analytic	Numerical	MC	BB	Non-HMC	Static HMC	NUTS	Algorithm choice
WinBUGS (Lunn et al. 2000)	no	no	no	no	no	yes	no	no	no
OpenBUGS (Spiegelhalter et al. 2014)	no	no	no	no	no	yes	yes	no	no
BugsXLA (Woodward 2011)	no	no	no	no	no	yes	yes	no	no
IBM SPSS Amos (Arbuckle 2020)	yes	no	no	no	no	yes	yes	no	yes
TEET (Qian 2011)	no⁽ⁱ⁾	yes	yes	yes	no	yes	no	no	no
JASP (JASP Team 2022)	no	yes	yes	yes	no	yes	no	yes⁽ⁱⁱ⁾	no
BRNPM (Karabatsos 2015; Karabatsos 2017)	no	no	no	no	no	yes	no	no	no
Stata (StataCorp 2019b)	yes	no	no	no	no	yes	no	no	yes
BayES (Emvalomatis 2020)	no	no	no	no	no	yes	no	no	no
IBM SPSS (IBM Corp. 2020)	yes	yes	yes	yes	no	no	no	no	no
BEsmarter ((BEsmarter Team 2020a,b);	no	no	no	no	yes	yes	no	no	yes⁽ⁱⁱⁱ⁾
(Ramírez-Hassan and Graciano-Londoño 2021))

JASP does use Stan for some analyses, but JASP’s concept is quite different from shinybrms’s concept: While JASP offers a plenty of different statistical methods (including non-regression analyses), shinybrms is designed to be as concise as possible. While JASP’s approach of using Stan for only some analyses certainly has a few advantages (especially in terms of runtime), shinybrms’s approach of completely relying on Stan (and brms in particular) has the advantage of a better maintainability: shinybrms only provides a lightweight GUI and only needs to perform few computations on its own. This is due to brms which is very flexible by allowing to fit a variety of regression models within a single R package. This division of work between shinybrms, shiny, brms, rstan, Stan, and shinystan reduces the amount of maintenance necessary for shinybrms, resulting in a faster integration of new features, a faster elimination of bugs, and a longer life cycle. Furthermore, it allows the authors of each component to focus on their strengths.

4 Features of shinybrms

The following general presentation of shinybrms’s features will be in written form, but with links to the corresponding screenshots from the example in section 5. In this article, not all aspects of the shinybrms app are shown in screenshots. For more screenshots, see the shinybrms website (Weber 2022).

Note that the mathematical formulation of the models which may be fit with shinybrms has already been given elsewhere (Bürkner 2017, 2018), so we will keep it short here.

Overview

The shinybrms app has three main pages which are accessible from a navigation bar at the top (Figure 1): “Likelihood”, “Prior”, and “Posterior”.

This structure follows Bayes’ theorem, simplified to the proportionality of the posterior density to the product of prior density and likelihood: \[p(\boldsymbol{\theta} | \boldsymbol{\mathcal{D}}) \propto p(\boldsymbol{\theta}) \cdot p(\boldsymbol{\dot{\mathcal{D}}} | \boldsymbol{\theta}, \boldsymbol{\ddot{\mathcal{D}}}) \label{eqn:bayes} \tag{2}\] where we have split up the data \(\boldsymbol{\mathcal{D}}\) into \(\boldsymbol{\mathcal{D}} = \begin{bmatrix}\boldsymbol{\dot{\mathcal{D}}} \;\; \boldsymbol{\ddot{\mathcal{D}}}\end{bmatrix}\) because in BRMs, the distribution in the likelihood typically conditions on the predictor part \(\boldsymbol{\ddot{\mathcal{D}}}\) of the data (see section 4.2.2 below). In the following sections, these three main pages will be described in detail.

There are also some auxiliary pages, the first two having direct links in the navigation bar, the last three being accessible from the drop-down menu “More” at the end of the navigation bar:

The starting page “Home” gives a short overview of shinybrms’s objective and structure. Thus, it only contains informational text and no interactive elements.
On page “Data”, the user uploads his or her custom dataset which shall be used for the regression analysis. For testing purposes, page “Data” also offers example datasets. The chosen dataset (no matter if it was uploaded or chosen from the list of example datasets) is automatically shown in a preview consisting of the dataset’s first six rows (there is an option to show the full dataset, though). It is also possible to show the output of R’s str() function applied to the chosen dataset, which gives some basic information about the dataset and its variables for users familiar with R.
Page “About” contains basic information about shinybrms (e.g., version and corresponding date) as well as some legal information.
Page “Links” gives links to software relevant for the shinybrms app.
Page “References” contains the references for literature cited throughout the app.

Page “Likelihood”

Page “Likelihood” has three tabs: “Outcome”, “Predictors”, and “Formula preview”. These tabs will now be described in turn.

Tab “Outcome”

On tab “Outcome” (Figure 2), the user specifies the outcome variable \(\boldsymbol{y} = (y_1, \dotsc, y_N)^{\scriptscriptstyle \mathsf{T}}\in \mathbb{R}^N\) (by choosing it from a drop-down list of the variables present in the dataset) as well as its distributional family, i.e., the basic form of the likelihood \(p(\boldsymbol{\dot{\mathcal{D}}} | \boldsymbol{\theta}, \boldsymbol{\ddot{\mathcal{D}}})\), now with \(\boldsymbol{\dot{\mathcal{D}}} = \boldsymbol{y}\). For the distributional family, there is a drop-down menu and a checkbox called “Show advanced distributional families”. By default, this checkbox is unchecked which means that the drop-down menu offers three general distributional families of broad practical relevance: the Gaussian family (with the identity link function), the Bernoulli family with the logit link function, and the negative binomial family with the log link function. This is intended to be a limited selection: By reducing the choices as much as possible, we want to avoid overwhelming the user with a variety of special distributions. For example, the Poisson family is intentionally left out, in favor of the more general negative binomial distribution. However, by checking the “Show advanced distributional families” checkbox, the drop-down menu is extended so that a variety of other distributional families can be selected as well (see Supplement section “Advanced distributional families”).

Tab “Predictors”

If desired, the user may specify predictors on tab “Predictors” (Figure 3). We use the term “predictor” for a column in the model matrix. In contrast, we use the term “predictor variable” for a column in the input dataset. Thus, a predictor may also denote an interaction and a predictor variable with \(K\) categories leads to \(K - 1\) predictors (due to dummy coding).

The shinybrms app supports population-level effects as well as group-level effects⁵. The inclusion of group-level effects yields a multilevel model (also known as hierarchical or mixed-effects model). Here, we denote the vector of population-level effects by \(\boldsymbol{\beta}\) and the corresponding model matrix by \(\boldsymbol{X}\). Likewise, we denote the vector of group-level effects by \(\boldsymbol{u}\) and the corresponding model matrix by \(\boldsymbol{Z}\). The hyperparameters for the group-level effects (i.e., their standard deviations and correlations) will be collected in a vector \(\boldsymbol{\tau}\). If the model does not contain group-level effects, we define here (for the mathematical description, not for the software) \(\boldsymbol{u} = 0\), \(\boldsymbol{Z} = (0, \dotsc, 0)^{\scriptscriptstyle \mathsf{T}}\in \mathbb{R}^N\) (for example; the exact values in \(\boldsymbol{Z}\) do not matter if \(\boldsymbol{u} = 0\)), and \(\boldsymbol{\tau} = 0\) (for example). With \(\boldsymbol{X}\) and \(\boldsymbol{Z}\), we now have \(\boldsymbol{\ddot{\mathcal{D}}} = \begin{bmatrix}\boldsymbol{X} \;\; \boldsymbol{Z}\end{bmatrix}\). We denote the vector of linear predictors by \(\boldsymbol{\eta} = \boldsymbol{X} \boldsymbol{\beta} + \boldsymbol{Z} \boldsymbol{u} \in \mathbb{R}^N\) (written out as \(\boldsymbol{\eta} = (\eta_1, \dotsc, \eta_N)^{\scriptscriptstyle \mathsf{T}}\)).

Note that here as well as in shinybrms, the term “interaction” is also used for interactions involving predictor variables with group-level main effects (yielding group-level interaction effects). This broad definition of “interaction” simplifies the GUI and emphasizes the key concept of interactions, namely that an effect depends on another predictor (or on other predictors).

To avoid common mistakes, shinybrms imposes some restrictions: Firstly, an overall (population-level) intercept is always included. Secondly, including an interaction causes all corresponding lower-order interactions to be automatically included, too. The latter restriction also implies that interactions may only involve predictor variables for which main effects have already been added.

Tab “Formula preview”

The tab “Formula preview” simply combines the chosen outcome and the chosen predictors into brms’s formula syntax. This is mainly intended for checking the correct specification of the model formula (for users familiar with the syntax). However, it also provides a concise and standardized way to communicate this central part of the model because together with the distributional family, this model formula determines the likelihood \(p(\boldsymbol{\dot{\mathcal{D}}} | \boldsymbol{\theta}, \boldsymbol{\ddot{\mathcal{D}}}) = p(\boldsymbol{y} | \boldsymbol{\theta}, \boldsymbol{X}, \boldsymbol{Z})\): In case of the Gaussian family (with the identity link function), we have \[p(\boldsymbol{y} | \boldsymbol{\theta}, \boldsymbol{X}, \boldsymbol{Z}) = \prod_{i = 1}^{N} \frac{1}{\sqrt{2 \pi} \cdot \sigma} \,\exp\!\left(-\frac{1}{2} \cdot \left(\frac{y_i - \mu_i}{\sigma}\right)^2\right) \label{eqn:gauss} \tag{3}\] with \(\mu_i = \eta_i\) (see \(\boldsymbol{\eta}\) from section 4.2.2), \(\sigma \in (0, \infty)\), and \(\boldsymbol{\theta} = (\boldsymbol{\beta}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{u}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{\tau}^{\scriptscriptstyle \mathsf{T}}, \sigma)^{\scriptscriptstyle \mathsf{T}}\). Note that the dependence on \(\boldsymbol{X}\) and \(\boldsymbol{Z}\) as well as on most elements of \(\boldsymbol{\theta}\) is an indirect dependence via \(\mu_1, \dotsc, \mu_N\). In case of the Bernoulli family with the logit link function, we have \(\boldsymbol{y} \in \{0, 1\}^N\) and \[p(\boldsymbol{y} | \boldsymbol{\theta}, \boldsymbol{X}, \boldsymbol{Z}) = \prod_{i = 1}^{N} \mu_i^{y_i} (1 - \mu_i)^{(1 - y_i)} \label{eqn:bernoulli} \tag{4}\] with \(\mu_i = \frac{1}{1 + \exp(-\eta_i)}\) and \(\boldsymbol{\theta} = (\boldsymbol{\beta}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{u}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{\tau}^{\scriptscriptstyle \mathsf{T}})^{\scriptscriptstyle \mathsf{T}}\). In case of the negative binomial family with the log link function, we have \(\boldsymbol{y} \in \left(\{0\} \cup \mathbb{N}\right)^N\) and \[p(\boldsymbol{y} | \boldsymbol{\theta}, \boldsymbol{X}, \boldsymbol{Z}) = \prod_{i = 1}^{N} \binom{y_i + \zeta - 1}{y_i} \left(\frac{\mu_i}{\mu_i + \zeta}\right)^{y_i} \left(\frac{\zeta}{\mu_i + \zeta}\right)^{\zeta} \label{eqn:negbinom} \tag{5}\] with \(\mu_i = \exp(\eta_i)\), \(\zeta \in (0, \infty)\), and \(\boldsymbol{\theta} = (\boldsymbol{\beta}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{u}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{\tau}^{\scriptscriptstyle \mathsf{T}}, \zeta)^{\scriptscriptstyle \mathsf{T}}\). The mathematical details of the “advanced” distributional families (see section 4.2.1 above) may be found in the brms vignette “Parameterization of Response Distributions in brms”.

Page “Prior”

At the top of page “Prior”, the user obtains a preview of the default priors taken from brms (Figure 4). At the bottom, the user may specify custom priors (Figure 5). Both, the default and the custom priors, refer to the parameters of the currently specified likelihood. Custom priors may be specified as follows:

via a Stan function,
via one of the special brms (pseudo-)functions designed for this purpose, e.g., for the Lewandowski-Kurowicka-Joe (LKJ) prior (Lewandowski et al. 2009),
via an empty input field to specify a flat prior over the whole support of the corresponding parameter(s).

The first two possibilities always lead to a proper prior. The flat prior is only proper if the support is bounded on both sides. Otherwise (which is the more common case), the flat prior is improper.

The user’s selections on page “Prior” ultimately lead to the specification of \(p(\boldsymbol{\theta})\). Together with the likelihood, this completes the model specification. Note that for multilevel models, \(p(\boldsymbol{\theta})\) here⁶ includes the distributions of the group-level effects \(\boldsymbol{u}\) as well, even though they do not need to be specified on page “Prior”.

Page “Posterior”

Page “Posterior” has six tabs: “Run Stan”, “MCMC diagnostics”, “Default summary”, “Custom summary”, “Conditional effects”, and “Launch shinystan”. These will now be described in turn. The output shown on these tabs may also be downloaded (with the file format depending on the specific type of output).

Tab “Run Stan”

At the top of tab “Run Stan”, the user may inspect and download the Stan code and the so-called Stan data. The Stan data basically consists of the pre-processed part of the chosen dataset which is needed for the Stan model, extended by some internal objects. Apart from checking or documentation purposes, the Stan code and the Stan data are needed if the user wants to customize the Stan code and then run Stan outside of shinybrms.

Further down on tab “Run Stan”, the user may set advanced options for the Stan run (Figure 6). These options have sensible defaults, but sometimes they need to be changed. Probably the most important option is the seed for the pseudorandom number generator.

The final panel on tab “Run Stan” is the central one (Figure 7): By a click on the “Run Stan” button, Stan translates the Stan code written by brms to C++ code, compiles this C++ code, and then starts sampling. By default, Stan writes its sampling progress to an HTML file which is automatically opened up by shinybrms. The user then only needs to refresh this HTML file to see the current sampling progress. An example for Stan’s runtime will be given in section 5.

When Stan has finished sampling, the panel “Run Stan” automatically refreshes, in particular to show the result from an overall check of the MCMC diagnostics (see section 4.4.2 for details). The user also has the possibility to download different output objects which can be analyzed outside of shinybrms or—in case of the fitted model object of class "brmsfit"—uploaded in a new shinybrms session (to avoid re-running Stan).

Tab “MCMC diagnostics”

On tab “MCMC diagnostics” (Figure 8), the user obtains detailed information concerning the following MCMC diagnostics:

HMC-specific diagnostics:
- the number of iterations ending with a divergence,
- the number of iterations hitting the maximum tree depth,
- the Bayesian fraction of missing information for the energy transitions (E-BFMI);
some general MCMC diagnostics (which are computed for each parameter as well as for the accumulated log posterior density):
- the modified potential scale reduction factor \(\widehat{R}\) proposed by Vehtari et al. (2021) (here simply called the \(\widehat{R}\) instead of the modified \(\widehat{R}\))⁷,
- the effective sample size (ESS) in the bulk of the corresponding marginal posterior (Vehtari et al. 2021),
- the ESS in the tails of the corresponding marginal posterior (Vehtari et al. 2021).

As a full description of these MCMC diagnostics is out of the scope of this article, we refer the interested reader to Stan Development Team (2022b), Betancourt (2018), and Vehtari et al. (2021). The most important basic guidelines for deciding whether these MCMC diagnostics are worrying are explained in the shinybrms GUI and also checked automatically by shinybrms. For the general MCMC diagnostics, it is also possible to show a detailed table with the diagnostics for each parameter (as well as for the accumulated log posterior density).

Tab “Default summary”

Tab “Default summary” (Figure 9) shows brms’s standard robust summary of the posterior inference, e.g., the medians and the central \(95\,\%\) intervals of the marginal posteriors (the \(95\,\%\) CrIs) of the most important parameters. This tab is only intended for a quick inspection. A much more comprehensive analysis of the Stan output is offered by the shinystan app (see section 4.4.6).

Tab “Custom summary”

On tab “Custom summary” (Figure 10), the user may calculate posterior summary quantities for a custom mathematical (or logical) expression involving at least one parameter. Such an expression may be, e.g., a sum of two parameters (as shown in Figure 10) or the event that a parameter exceeds a certain threshold.

Tab “Conditional effects”

On tab “Conditional effects” (Figure 11), shinybrms offers conditional-effects plots (created by brms::conditional_effects()). A conditional-effects plot shows the estimated effect of a predictor variable on the outcome. An interaction effect involving at most two predictor variables may also be visualized by showing the estimated effect of the first predictor variable separately for appropriate values of the second predictor variable.

As described in more detail in the shinybrms GUI, a conditional-effects plot conditions on specific values of those predictor variables which are not involved in the plot. Likewise, group-level effects which are not involved in the plot are (usually) set to zero.

Tab “Launch shinystan”

The shinystan app (Gabry 2022) offers an interactive inspection of Stan (and other MCMC-generated) results, in particular with respect to:

MCMC diagnostics (including several additional diagnostics not covered by shinybrms’s tab “MCMC diagnostics”),
PPCs,
summary quantities of univariate marginal posteriors,
plots of univariate, bivariate, and trivariate marginal posteriors.

Before launching the shinystan app from within the shinybrms app by clicking the corresponding button on tab “Launch shinystan”, the user may set a seed to ensure the reproducibility of the PPCs.

At this point, the shinybrms workflow ends and passes over to the shinystan workflow. We will illustrate the shinystan workflow in section 5.

5 Example

We illustrate shinybrms’s features following the workflow implied by Figure 1 and using a real-world dermatological dataset from Van Welzen et al. (2021). This dataset is available in the Supplement (file CAP.csv). In Supplement section “Frequentist analysis of the example”, we compare the Bayesian analysis presented here with a frequentist one, referring to the list of advantages of Bayesian statistics from section 1.

Van Welzen et al. (2021) conducted a prospective pilot study investigating the efficacy and safety of a novel cold atmospheric plasma (CAP) wound dressing for the healing of split-skin graft donor sites. The only outcome we focus on here is the tissue water index (TWI), measured by a hyperspectral imaging camera and having values between 0 and 100, with lower TWI values being associated with an improved wound healing. Briefly, the study design was as follows: For each of \(P = 10\) patients, the TWI was measured under \(T = 3\) different treatment conditions (standard-treated wound, CAP-treated wound, and healthy skin). Each treatment condition was investigated in its own skin area with \(R = 3\) measurements across that area. This procedure of measuring was repeated on each of \(D = 4\) days (day 1, 3, 5, and 7, with day 1 being the day of the split-skin graft donation where the TWI was measured after the split-skin graft donation but before the first wound dressing). Thus, the dataset consists of \(N = P \cdot T \cdot R \cdot D = 360\) observations (rows). The dataset’s columns are:

patID (for “patient ID”; coded as "pat1", …, "pat10"),
age (in years),
anticoagulation (indicating whether the patient received an anticoagulation therapy before and during the study; coded as "no" and "yes"),
diabetes (indicating whether the patient is diabetic; coded as "no" and "yes"),
day (coded as "d1", "d3", "d5", and "d7"),
trt (coded as "0_standard", "CAP", and "healthy" to make "0_standard" the reference level),
TWI (integers in the interval \([0, 100]\)).

The primary research question is whether the CAP treatment leads to a decreased⁸ TWI compared to the standard treatment (polyhexanide wound gel with fatty gauze). The healthy skin area serves as an experimental control (albeit not as a control treatment since this is the role of the standard-treated wound area) and is not part of the primary research question.

shinybrms

After launching the shinybrms app in R via

> library("shinybrms")
> launch_shinybrms(launch.browser = TRUE)

(with launch.browser set to TRUE to ensure that the app is opened up in the default web browser), we switch to page “Data” where we upload the dataset (not shown here).

Next, we head over to page “Likelihood”. On tab “Outcome” (Figure 2), we choose the outcome variable TWI and the Gaussian family as the distributional family for this outcome. Clearly, the TWI values cannot follow an unmodified Gaussian distribution since they are bounded by \(0\) and \(100\), with the minimum of the observed TWI values being indeed as low as \(9\) (the maximum being \(67\)). Thus, a truncated Gaussian distribution might be more appropriate here. We will come back to this later in section 6.

graphic without alt text — Figure 2: Tab “Outcome” on page “Likelihood”. In the example presented here, we select `TWI` as the outcome variable and the Gaussian family as the distributional family for this outcome. Tab “Outcome” and tab “Predictors” (Figure ) are the two main components of page “Likelihood”.

On tab “Predictors” (Figure 3), we choose age, anticoagulation, diabetes, day, and trt to have population-level main effects and patID to have group-level main effects (“random intercepts”). Further down on tab “Predictors”, we add an interaction between day and trt (not shown here). This interaction is included because the TWI is supposed to show a stronger time-dependence in the two wound areas than in the healthy skin area. Additionally, the TWI difference (in means) between the standard and the CAP treatment might change over time.

Now the likelihood is set up, so we can proceed with the prior. The default priors (Figure 4) are reasonable, but suppose we wanted a weakly informative Student-\(t\) prior with \(3\) degrees of freedom, a location parameter of \(0\), and a scale parameter of \(30\) for all regression coefficients. To add this custom prior, we choose parameter class b from the corresponding drop-down list shown in Figure 5, enter student_t(3, 0, 30) into the input field entitled “Prior distribution”, and click the “Add prior” button. After doing so, our Student-\(t\) prior is added to the preview table (Figure 5, right-hand side). For all remaining parameters for which we do not specify a custom prior, the corresponding default prior will be used.

Now the model is fully set up, so we can start inferring the posterior. To do this, we switch to page “Posterior” where we scroll down to the advanced options on tab “Run Stan” (Figure 6). There, we set a seed for reproducibility. Afterwards, we scroll further down to panel “Run Stan” where we click the button for starting the Stan run (Figure 7). For this example, the Stan run as a whole (including the compilation of the C++ code) takes about 50 seconds on a standard desktop machine.

After Stan has finished sampling, we receive a pop-up notification (not shown here) whether all MCMC diagnostics have passed their checks. Here, this is the case as we may also see on tab “MCMC diagnostics” (Figure 8).

Since shinybrms reports all MCMC diagnostics as being OK, we may start interpreting the posterior. On tab “Default summary” (Figure 9), it is mainly the summary of the population-level effects which is of interest here: With each additional year of age, the TWI is estimated to increase by ca. \(0.10\) with a \(95\,\%\) CrI of ca. \((-0.28, 0.48)\). An anticoagulation therapy is estimated to increase the TWI by ca. \(-1.12\) with a \(95\,\%\) CrI of ca. \((-7.53, 5.69)\). A diabetes disease is estimated to increase the TWI by ca. \(-0.56\) with a \(95\,\%\) CrI of ca. \((-6.51, 5.66)\). As may be seen from these three CrIs, the statistical uncertainty is quite big which is probably due to the small \(P = 10\).

Since we included an interaction between day and trt, the coefficients for these two variables are most conveniently interpreted by the help of a custom summary (Figure 10) and a conditional-effects plot (Figure 11). On tab “Custom summary” (Figure 10), we may calculate the estimated TWI difference (in means) between the CAP and the standard treatment separately for each day by entering the corresponding sum expressions (and the expression ‘b_trtCAP‘ for day 1) in turn. The resulting table is included in Figure 10: On day 1 (where the two wound areas had not been treated yet), the standard treatment and the CAP treatment lead to a quite similar TWI (the posterior median of their TWI difference being ca. \(-0.74\) with a \(95\,\%\) CrI of ca. \((-3.69, 2.13)\)). In contrast, on days 3, 5, and 7, the CAP treatment clearly leads to a lower TWI than the standard treatment (posterior medians of ca. \(-10.45\), \(-7.66\), and \(-7.10\), respectively, and \(95\,\%\) CrIs of ca. \((-13.43, -7.36)\), \((-10.63, -4.84)\), and \((-10.11, -4.03)\), respectively). This answers the primary research question: The CAP treatment indeed leads to a decreased TWI and therefore an improved wound healing compared to the standard treatment. This is also well illustrated by the conditional-effects plot (Figure 11). The conditional-effects plot also confirms that the TWI in the healthy skin area does not change as heavily over time as in the two wound areas.

Finally, we switch to tab “Launch shinystan”, enter a seed for the reproducibility of the PPCs (here, 63438), and click on the button for launching shinystan.

shinystan

Within shinystan, we may inspect some PPC plots, e.g., a kernel density estimate for the observed TWI values, overlaid by kernel density estimates for replicated TWI values (Figure 12). This overlaid density plot suggests that the model is appropriate, being able to generate outcome values similar to the observed ones after having estimated the unknown parameters by the help of the observed dataset (as well as the prior). Nevertheless, the model may still be improved, as illustrated by the PPC plots shown in Figure 13 (lower two histograms): The minimum of the observed TWI values is systematically smaller than the replicated minimums, the opposite holding—even if not that extremely—for the maximum. However, we consider the current model to be appropriate for the primary research question.

With respect to the parameter estimates, shinystan offers, e.g., a visualization of the posterior medians, together with \(50\,\%\) and \(95\,\%\) CrIs (Figure 14). The shinystan app also offers kernel density estimates for the univariate marginal posteriors (not shown).

6 Discussion

We have presented our shiny app called shinybrms, distributed as an R package. With the shinybrms GUI, we hope to make Bayesian regression modeling more accessible for people without any knowledge of R’s syntax. Currently, the user still needs to execute some R code for setting up shinybrms’s backend and for launching the shinybrms app, even if he or she is using a GUI for installing R packages. We tried to make this as easy as possible by providing step-by-step instructions in the README file of the shinybrms package. More importantly however, the shinybrms app may be hosted on a server and accessed through a web browser, just like any other shiny app. In that case, the user does not need to install the shinybrms package or any other additional software. With the server-sided hosting, the shinybrms app may even be accessed from a mobile device where it is usually impossible to install any software designed for personal computers. Of course, setting up the server-sided hosting is a lot more complex than following the instructions from our README file for running shinybrms on a local computer, but the idea is that IT departments of bigger institutions could establish the server-sided hosting (potentially adding an access control on top) and then members of that institution could access the shinybrms app through their web browsers.

Note that JASP offers an alternative host-client service by relying on rollApp (rollApp, Inc. 2020; rollApp, Inc. and JASP Team 2020).

Apart from application in practice, shinybrms may also be valuable for teaching Bayesian regression models, e.g., to undergraduate students.

Of course, shinybrms may still be extended. As may be seen from our real-world example in section 5, truncated outcome families would be a useful feature. Apart from this, our future plans also include further outcome families supported by brms (e.g., ordinal and time-to-event regression), model selection features (e.g., using the package projpred by (Piironen et al. 2022)), and support for special brms features such as smoothed effects and known measurement error in the outcome variable (needed for meta-analyses). When implementing new features, the challenge will be to keep the GUI as simple as possible: In our opinion, a GUI such as shinybrms should support the user by automizing steps wherever this is appropriate and thus focus the attention to steps which may not be automized (in particular those related to the original research question).

7 Supplementary Material

This article comes with an online Supplement which consists of the following files:

file Supplement_sections.pdf which is a document with the following sections:
- “Existing GUIs”,
- “Advanced distributional families”,
- “Frequentist analysis of the example”;
file CAP.csv which contains the dataset for section 5;
file weber_shinybrms.R which contains the R code for section 5;
file weber_shinybrms_sessionInfo.txt which contains the original computing environment information for section 5. Note that the reproducibility of Stan results depends on the machine’s hardware, so in general, our results from section 5 will not be perfectly reproducible on other machines.

This article is converted from a Legacy LaTeX article using the texor package. The pdf version is the official version. To report a problem with the html, refer to CONTRIBUTE on the R Journal homepage.

We added quotation marks here since noninformative priors might be more informative than intended (Gelman et al. 2017).↩︎
Here, we are slightly abusing the notation by employing a single integral symbol for a possibly multiple integral.↩︎
Here, the term “proposal” refers to a proposed parameter vector.↩︎
Here, the term “symmetric” refers to the preservation of the distribution when reverting the jump, not to the symmetry in the shape of a distribution.↩︎
Population-level effects are also known as fixed effects (Bürkner 2017, 2018). Group-level effects are also known as random or partially pooled effects (Bürkner 2017, 2018; Goodrich et al. 2022). The terms “fixed” and “random” effects are not really appropriate in a Bayesian context: In a Bayesian model, all parameters have a prior distribution and may therefore be considered as random (Marchenko and Balov 2015).↩︎
In multilevel models, drawing the line between prior and likelihood can be done in multiple ways, so our mathematical formulation is just one of several possibilities.↩︎
The term “potential scale reduction factor” is not always appropriate (Vehtari et al. 2021 2), but because of its widespread use, we employ it here nonetheless.↩︎
For this demonstration here, we won’t discuss whether this decrease is clinically relevant.
↩︎