Despite their advantages, the application of Bayesian regression models is still the exception compared to frequentist regression models. Here, we present our R package *shinybrms* which provides a graphical user interface for fitting Bayesian regression models, with the frontend consisting of a *shiny* app and the backend relying on the R package *brms* which in turn relies on Stan. With *shinybrms*, we hope that Bayesian regression models (and regression models in general) will become more popular in applied research, data analyses, and teaching. Here, we illustrate our graphical user interface by the help of an example from medical research.

The relevance of regression models in applied research has already been well pointed out, for example, by Karabatsos (2015):

“Regression modeling is ubiquitous in empirical areas of scientific research. This is because most research questions can be asked in terms of how a dependent variable changes as a function of one or more covariates (predictors).”

Conducting regression analyses in a Bayesian framework has a lot of advantages. Introductory texts on Bayesian statistics (in general) are, e.g., McElreath (2020), Albert and Hu (2019), Reich and Ghosh (2019), StataCorp (2019a), Gelman et al. (2020a), and Johnson et al. (2022). For readers with little background in Bayesian statistics, we recommend reading one of these textbooks first. A more detailed introduction may be found, e.g., in Gelman et al. (2014). In particular, Bayesian statistics has the following advantages (for further advantages, see, e.g., Gelman et al. 2014; StataCorp 2019a):

Bayesian methods allow for incorporation of prior knowledge. Generally, inclusion of prior knowledge is desirable: The flat prior implied by the frequentist maximum likelihood (ML) method may lead to nonsensical inferences (Gelman et al. 2017). Even if the inclusion of informative prior knowledge is not desired, weakly informative priors have the advantage (compared to so-called “noninformative” priors

^{1}) to downweight unreasonable parameter values and to introduce a certain regularization or penalization, helping against overfitting (Gelman 2006; Gelman et al. 2008, 2014, 2017). Hereafter, we follow conventional notation in Bayesian statistics and denote the prior for the parameter vector \(\boldsymbol{\theta}\) by \(p(\boldsymbol{\theta})\).Similarly, the prior distribution may be used to impose parameter constraints in an easy and natural way. There is no need for

*ad-hoc*solutions to cut off parameter estimates. For example, many frequentist between-study variance estimators in the random-effects meta-analysis model are cut off at zero.It is usually possible to infer the posterior exactly (apart from minor approximations such as those arising from the Monte Carlo error). In that case, Bayesian statistics does not need to resort to large-sample approximations such as the asymptotic normal distribution of the ML estimator often used in frequentist statistics.

For most practical cases, Markov chain Monte Carlo (MCMC) sampling (see section 2.1) constitutes a generic Bayesian inference method. In frequentist statistics, generic methods such as the asymptotic normal distribution of the ML estimator can be unsatisfactory, e.g., for small sample sizes. This is why in frequentist statistics, different inferential methods have often evolved for the same task or model. This complicates frequentist analyses for users, especially for those with little background in statistics.

The quantities derived from the posterior have a more intuitive interpretation than their frequentist counterparts which are based on the sampling distribution of the estimator. In particular, Bayesian posterior intervals (credible intervals, CrIs) have the interpretation that is often incorrectly attributed to frequentist confidence intervals (CIs) (McElreath 2020) and posterior tail-area probabilities have the interpretation that is often incorrectly attributed to frequentist \(p\)-values.

In Bayesian statistics, uncertainty in nuisance parameters is easily—and naturally—taken into account by integrating them out from the posterior: \[p(\boldsymbol{\psi} | \boldsymbol{\mathcal{D}}) = \int p(\boldsymbol{\psi}, \boldsymbol{\phi} | \boldsymbol{\mathcal{D}}) \; \text{d}\boldsymbol{\phi} \label{eqn:nuisance} \tag{1}\] with \(\boldsymbol{\mathcal{D}}\) denoting the data, \(\boldsymbol{\psi}\) the parameter vector of interest, and \(\boldsymbol{\phi}\) the nuisance parameter vector (so that \((\boldsymbol{\psi}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{\phi}^{\scriptscriptstyle \mathsf{T}})^{\scriptscriptstyle \mathsf{T}}= \boldsymbol{\theta}\))

^{2}. Taking uncertainty in nuisance parameters into account helps against overfitting (like the penalization mentioned in enumeration point number 1 above), but in general, this is not that easy in frequentist statistics, as can be seen in random-effects meta-analyses (Weber et al. 2021).When combined with probabilistic programming (as done, e.g., by the various sampling methods introduced in section 2), a Bayesian analysis naturally propagates the posterior uncertainty into derived quantities (Gelman et al. 2020a).

Often, frequentist analyses result in the typical null-hypothesis significance testing which is being criticized to an increasing degree (Amrhein and Greenland 2018; Amrhein et al. 2019; McShane et al. 2019). Null-hypothesis significance testing is especially problematic for null hypotheses consisting of only a point in parameter space. Of course, Bayesian analyses may also result in null-hypothesis significance testing or similar hypothesis-testing procedures, but in our experience, this is not as common as in frequentist analyses. We designed our software presented here in a way that does not encourage null-hypothesis significance testing.

Posterior predictive checks (PPCs)—which should be part of a Bayesian workflow (Gelman et al. 2020b)—are an easy and intuitive way of performing model diagnostics in a Bayesian framework, even though the choice and interpretation of the PPCs require some experience (Gelman et al. 2020b). In a frequentist framework, model diagnostics are often not that easy to perform, at least if the uncertainty from parameter estimation should be taken into account.

These advantages will be illustrated in the context of the example from
section 5, by comparing our Bayesian analysis to a frequentist
one (see section “Frequentist analysis of the example” in the online
Supplement file `Supplement_sections.pdf`

).

Despite the aforementioned advantages, Bayesian methods—and Bayesian
regression models (BRMs) in particular—are still not as common as
their frequentist counterparts. In 2005, Woodward (2005) supposed
one reason to be the lack of a “good user interface” which would allow
applied researchers to fit BRMs as conveniently as other statistical
methods for which a graphical user interface (GUI) already exists. In
the meantime, several GUIs have emerged (see section 3), but to
our knowledge, until the first release of our R package *shinybrms*
(Weber 2022), there was no GUI which used Stan
(Carpenter et al. 2017; Stan Development Team 2022c) for inferring
the posterior in BRMs. Stan has several advantages compared to other
methods for inferring the posterior. In particular, it is highly
flexible with respect to modeling choices and very efficient. Details
will be given in section 2.

Our *shinybrms* package is noncommercial and available at the
Comprehensive R Archive Network (CRAN). While *shinybrms*’s frontend is
a *shiny* (Chang et al. 2022) app, *shinybrms*’s backend completely
relies on *brms* (Bürkner 2017, 2018) which
itself relies on Stan. Both of *brms*’s backends (i.e., interfaces to
Stan), namely *rstan*
(Stan Development Team 2022a) and *cmdstanr*
(Gabry and Češnovar 2022), are supported by *shinybrms*. For the inspection
of the Stan output, the
*shinystan*
(Gabry 2022) app may be launched from within *shinybrms*.

To explain the particular advantages of Stan in detail, we have to take
a closer look at different ways for inferring the posterior. This is the
purpose of section 2. In section 3, we summarize
existing GUIs for BRMs. That section is partly influenced by
Ramírez-Hassan and Graciano-Londoño (2021). In section 4, we present the
features of *shinybrms*. The usage of the *shinybrms* app is illustrated
by the help of a real-world example in section 5. Finally, we
discuss our work in section 6.

As mentioned above, in Bayesian statistics, uncertainty arising from the estimation of nuisance parameters is taken into account by integrating them out from the posterior. This is not the only integration occurring in posterior inference: Basically every quantity derived from the posterior is somehow connected to an integration over the posterior. However, it is the integration which also causes a lot of complications. While it is most desirable to perform posterior inference by exact calculation of the desired integrals (using analytic expressions), this approach is often infeasible and even if it is feasible, it has the downside of being not as flexible as other approaches since it needs to be tailored to the statistical model at hand. Numerical integration (e.g., by quadrature) may seem like a remedy, but is often only feasible up to a limited dimensionality of the parameter space. Depending on the algorithm, numerical integration may also introduce tuning quantities, hindering its “out-of-the-box” usage. Integration by simple Monte Carlo (MC) sampling may seem like an alternative, but this is only possible for distributions one may directly sample from (e.g., a Gaussian distribution).

With the advent of Markov chain Monte Carlo (MCMC) methods, Bayesian
inference has changed a lot (Woodward 2005; Lunn et al. 2009).
The first MCMC algorithm was the Metropolis algorithm
(Metropolis et al. 1953) which starts from an initial point in
parameter space and iteratively samples a proposal^{3} from a
symmetric^{4} jumping distribution and accepts the proposal with a
certain acceptance probability which depends on the ratio of the target
(here, the posterior) density at the current position and at the
proposal. The Metropolis-Hastings (MH) algorithm
(Metropolis et al. 1953; Hastings 1970) generalizes the
Metropolis algorithm to asymmetric jumping distributions. Gibbs sampling
(Geman and Geman 1984; Gelfand and Smith 1990) consists of
alternately sampling from the full conditional posterior distributions
and is a special MH algorithm in which the proposal is always accepted
(Gelman et al. 2014). Combinations of the aforementioned algorithms
are also widely used, e.g., MH-within-Gibbs. All MCMC algorithms
(including those mentioned hereafter) require a careful examination of
the convergence of the Markov chains. The MCMC diagnostics used for this
purpose in *shinybrms* are outlined in section 4.4.2.

Hamiltonian Monte Carlo (HMC) (initial work and major contributions by Duane et al. 1987; Neal 1993, 2011; MacKay 2003) is a special MCMC algorithm which is often more efficient than other MCMC algorithms, especially in case of a high-dimensional posterior distribution and correlated parameters (Hoffman and Gelman 2014; Betancourt 2018). The efficiency of HMC is due to the fact that it takes advantage of the gradient of the (log) posterior density (Stan Development Team 2022d), making it a combination of stochastic and deterministic procedures (which explains why HMC is also known as hybrid Monte Carlo) (Gelman et al. 2014). HMC provides helpful diagnostics, such as divergent transitions which can (but must not necessarily) indicate areas of the posterior which are hard to explore by the HMC sampler (Betancourt 2018; Gabry et al. 2019). Compared to Gibbs sampling, HMC also has the advantage that nonconjugate priors may be used easily. For the original HMC algorithm, three tuning quantities need to be specified by hand in advance: the mass matrix \(\boldsymbol{M}\) (which is the covariance matrix of the auxiliary momentum vector), the number \(L\) of leapfrog steps, and the size \(\epsilon\) of the leapfrog steps (Gelman et al. 2014; Stan Development Team 2022d).

Because of the fixed choice of \(L\), the original HMC algorithm is a static HMC algorithm (Betancourt 2018). In contrast, the no-U-turn sampler (NUTS) (Hoffman and Gelman 2014) is a dynamic HMC algorithm since it automatically chooses a (possibly) new value of \(L\) in each iteration of each Markov chain. Hoffman and Gelman (2014) also proposed a new dual averaging technique for determining \(\epsilon\) automatically, too. Apart from these automations, the NUTS has the advantage that in terms of efficiency, it was shown to perform as well as—or even better than—a well-tuned static HMC algorithm (Hoffman and Gelman 2014). A modified (Betancourt 2018) NUTS is implemented in Stan. Stan’s NUTS also includes an automatic adaptation of the mass matrix \(\boldsymbol{M}\) during the warmup phase (Stan Development Team 2022d). A complete presentation of Stan’s NUTS is out of the scope of this article. A good starting point for a detailed description is Stan Development Team (2022d) as well as Betancourt (2018). Note that Stan also includes other algorithms for inferring or approximating the posterior. In this paper however, we only refer to Stan’s NUTS when referring to Stan.

Table 1 summarizes existing GUIs for BRMs. Details are
provided in Supplement section “Existing GUIs”. Table 1
makes it clear that none of the existing GUIs relies *entirely* on Stan
or the NUTS.

Algorithm (for inferring the posterior) | |||||||||

Non-MCMC | MCMC | ||||||||

GUI name | Commercial | Analytic | Numerical | MC | BB | Non-HMC | Static HMC | NUTS | Algorithm choice |

WinBUGS (Lunn et al. 2000) | no | no | no | no | no | yes | no | no | no |

OpenBUGS (Spiegelhalter et al. 2014) | no | no | no | no | no | yes | yes | no | no |

BugsXLA (Woodward 2011) | no | no | no | no | no | yes | yes | no | no |

IBM SPSS Amos (Arbuckle 2020) | yes | no | no | no | no | yes | yes | no | yes |

TEET (Qian 2011) | no^{(i)} |
yes | yes | yes | no | yes | no | no | no |

JASP (JASP Team 2022) | no | yes | yes | yes | no | yes | no | yes^{(ii)} |
no |

BRNPM (Karabatsos 2015; Karabatsos 2017) | no | no | no | no | no | yes | no | no | no |

Stata (StataCorp 2019b) | yes | no | no | no | no | yes | no | no | yes |

BayES (Emvalomatis 2020) | no | no | no | no | no | yes | no | no | no |

IBM SPSS (IBM Corp. 2020) | yes | yes | yes | yes | no | no | no | no | no |

BEsmarter ((BEsmarter Team 2020a,b); | no | no | no | no | yes | yes | no | no | yes^{(iii)} |

(Ramírez-Hassan and Graciano-Londoño 2021)) |

JASP does use Stan for some analyses, but JASP’s concept is quite
different from *shinybrms*’s concept: While JASP offers a plenty of
different statistical methods (including non-regression analyses),
*shinybrms* is designed to be as concise as possible. While JASP’s
approach of using Stan for only some analyses certainly has a few
advantages (especially in terms of runtime), *shinybrms*’s approach of
completely relying on Stan (and *brms* in particular) has the advantage
of a better maintainability: *shinybrms* only provides a lightweight GUI
and only needs to perform few computations on its own. This is due to
*brms* which is very flexible by allowing to fit a variety of regression
models within a single R package. This division of work between
*shinybrms*, *shiny*, *brms*, *rstan*, Stan, and *shinystan* reduces the
amount of maintenance necessary for *shinybrms*, resulting in a faster
integration of new features, a faster elimination of bugs, and a longer
life cycle. Furthermore, it allows the authors of each component to
focus on their strengths.

The following general presentation of *shinybrms*’s features will be in
written form, but with links to the corresponding screenshots from the
example in section 5. In this article, not all aspects of the
*shinybrms* app are shown in screenshots. For more screenshots, see the
*shinybrms* website (Weber 2022).

Note that the mathematical formulation of the models which may be fit
with *shinybrms* has already been given elsewhere
(Bürkner 2017, 2018), so we will keep it short
here.

The *shinybrms* app has three main pages which are accessible from a
navigation bar at the top (Figure 1): “Likelihood”,
“Prior”, and “Posterior”.

This structure follows Bayes’ theorem, simplified to the proportionality of the posterior density to the product of prior density and likelihood: \[p(\boldsymbol{\theta} | \boldsymbol{\mathcal{D}}) \propto p(\boldsymbol{\theta}) \cdot p(\boldsymbol{\dot{\mathcal{D}}} | \boldsymbol{\theta}, \boldsymbol{\ddot{\mathcal{D}}}) \label{eqn:bayes} \tag{2}\] where we have split up the data \(\boldsymbol{\mathcal{D}}\) into \(\boldsymbol{\mathcal{D}} = \begin{bmatrix}\boldsymbol{\dot{\mathcal{D}}} \;\; \boldsymbol{\ddot{\mathcal{D}}}\end{bmatrix}\) because in BRMs, the distribution in the likelihood typically conditions on the predictor part \(\boldsymbol{\ddot{\mathcal{D}}}\) of the data (see section 4.2.2 below). In the following sections, these three main pages will be described in detail.

There are also some auxiliary pages, the first two having direct links in the navigation bar, the last three being accessible from the drop-down menu “More” at the end of the navigation bar:

- The starting page “Home” gives a short overview of
*shinybrms*’s objective and structure. Thus, it only contains informational text and no interactive elements. - On page “Data”, the user uploads his or her custom dataset which
shall be used for the regression analysis. For testing purposes,
page “Data” also offers example datasets. The chosen dataset (no
matter if it was uploaded or chosen from the list of example
datasets) is automatically shown in a preview consisting of the
dataset’s first six rows (there is an option to show the full
dataset, though). It is also possible to show the output of R’s
`str()`

function applied to the chosen dataset, which gives some basic information about the dataset and its variables for users familiar with R. - Page “About” contains basic information about
*shinybrms*(e.g., version and corresponding date) as well as some legal information. - Page “Links” gives links to software relevant for the
*shinybrms*app. - Page “References” contains the references for literature cited throughout the app.

Page “Likelihood” has three tabs: “Outcome”, “Predictors”, and “Formula preview”. These tabs will now be described in turn.

On tab “Outcome” (Figure 2), the user specifies the outcome variable \(\boldsymbol{y} = (y_1, \dotsc, y_N)^{\scriptscriptstyle \mathsf{T}}\in \mathbb{R}^N\) (by choosing it from a drop-down list of the variables present in the dataset) as well as its distributional family, i.e., the basic form of the likelihood \(p(\boldsymbol{\dot{\mathcal{D}}} | \boldsymbol{\theta}, \boldsymbol{\ddot{\mathcal{D}}})\), now with \(\boldsymbol{\dot{\mathcal{D}}} = \boldsymbol{y}\). For the distributional family, there is a drop-down menu and a checkbox called “Show advanced distributional families”. By default, this checkbox is unchecked which means that the drop-down menu offers three general distributional families of broad practical relevance: the Gaussian family (with the identity link function), the Bernoulli family with the logit link function, and the negative binomial family with the log link function. This is intended to be a limited selection: By reducing the choices as much as possible, we want to avoid overwhelming the user with a variety of special distributions. For example, the Poisson family is intentionally left out, in favor of the more general negative binomial distribution. However, by checking the “Show advanced distributional families” checkbox, the drop-down menu is extended so that a variety of other distributional families can be selected as well (see Supplement section “Advanced distributional families”).

If desired, the user may specify predictors on tab “Predictors” (Figure 3). We use the term “predictor” for a column in the model matrix. In contrast, we use the term “predictor variable” for a column in the input dataset. Thus, a predictor may also denote an interaction and a predictor variable with \(K\) categories leads to \(K - 1\) predictors (due to dummy coding).

The *shinybrms* app supports population-level effects as well as
group-level effects^{5}. The inclusion of group-level effects yields a
multilevel model (also known as hierarchical or mixed-effects model).
Here, we denote the vector of population-level effects
by \(\boldsymbol{\beta}\) and the corresponding model matrix
by \(\boldsymbol{X}\). Likewise, we denote the vector of group-level
effects by \(\boldsymbol{u}\) and the corresponding model matrix
by \(\boldsymbol{Z}\). The hyperparameters for the group-level effects
(i.e., their standard deviations and correlations) will be collected in
a vector \(\boldsymbol{\tau}\). If the model does not contain group-level
effects, we define here (for the mathematical description, not for the
software) \(\boldsymbol{u} = 0\),
\(\boldsymbol{Z} = (0, \dotsc, 0)^{\scriptscriptstyle \mathsf{T}}\in \mathbb{R}^N\)
(for example; the exact values in \(\boldsymbol{Z}\) do not matter if
\(\boldsymbol{u} = 0\)), and \(\boldsymbol{\tau} = 0\) (for example). With
\(\boldsymbol{X}\) and \(\boldsymbol{Z}\), we now have
\(\boldsymbol{\ddot{\mathcal{D}}} = \begin{bmatrix}\boldsymbol{X} \;\; \boldsymbol{Z}\end{bmatrix}\).
We denote the vector of linear predictors by \(\boldsymbol{\eta} = \boldsymbol{X} \boldsymbol{\beta} + \boldsymbol{Z} \boldsymbol{u} \in \mathbb{R}^N\)
(written out as
\(\boldsymbol{\eta} = (\eta_1, \dotsc, \eta_N)^{\scriptscriptstyle \mathsf{T}}\)).

Note that here as well as in *shinybrms*, the term “interaction” is also
used for interactions involving predictor variables with group-level
main effects (yielding group-level interaction effects). This broad
definition of “interaction” simplifies the GUI and emphasizes the key
concept of interactions, namely that an effect depends on another
predictor (or on other predictors).

To avoid common mistakes, *shinybrms* imposes some restrictions:
Firstly, an overall (population-level) intercept is always included.
Secondly, including an interaction causes all corresponding lower-order
interactions to be automatically included, too. The latter restriction
also implies that interactions may only involve predictor variables for
which main effects have already been added.

The tab “Formula preview” simply combines the chosen outcome and the
chosen predictors into *brms*’s formula syntax. This is mainly intended
for checking the correct specification of the model formula (for users
familiar with the syntax). However, it also provides a concise and
standardized way to communicate this central part of the model because
together with the distributional family, this model formula determines
the likelihood
\(p(\boldsymbol{\dot{\mathcal{D}}} | \boldsymbol{\theta}, \boldsymbol{\ddot{\mathcal{D}}}) = p(\boldsymbol{y} | \boldsymbol{\theta}, \boldsymbol{X}, \boldsymbol{Z})\):
In case of the Gaussian family (with the identity link function), we
have
\[p(\boldsymbol{y} | \boldsymbol{\theta}, \boldsymbol{X}, \boldsymbol{Z}) =
\prod_{i = 1}^{N} \frac{1}{\sqrt{2 \pi} \cdot \sigma}
\,\exp\!\left(-\frac{1}{2} \cdot \left(\frac{y_i - \mu_i}{\sigma}\right)^2\right)
\label{eqn:gauss} \tag{3}\]
with \(\mu_i = \eta_i\) (see \(\boldsymbol{\eta}\) from
section 4.2.2), \(\sigma \in (0, \infty)\), and
\(\boldsymbol{\theta} = (\boldsymbol{\beta}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{u}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{\tau}^{\scriptscriptstyle \mathsf{T}}, \sigma)^{\scriptscriptstyle \mathsf{T}}\).
Note that the dependence on \(\boldsymbol{X}\) and \(\boldsymbol{Z}\) as
well as on most elements of \(\boldsymbol{\theta}\) is an indirect
dependence *via* \(\mu_1, \dotsc, \mu_N\). In case of the Bernoulli family
with the logit link function, we have \(\boldsymbol{y} \in \{0, 1\}^N\)
and
\[p(\boldsymbol{y} | \boldsymbol{\theta}, \boldsymbol{X}, \boldsymbol{Z}) =
\prod_{i = 1}^{N} \mu_i^{y_i} (1 - \mu_i)^{(1 - y_i)}
\label{eqn:bernoulli} \tag{4}\]
with \(\mu_i = \frac{1}{1 + \exp(-\eta_i)}\) and \(\boldsymbol{\theta} = (\boldsymbol{\beta}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{u}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{\tau}^{\scriptscriptstyle \mathsf{T}})^{\scriptscriptstyle \mathsf{T}}\).
In case of the negative binomial family with the log link function, we
have \(\boldsymbol{y} \in \left(\{0\} \cup \mathbb{N}\right)^N\) and
\[p(\boldsymbol{y} | \boldsymbol{\theta}, \boldsymbol{X}, \boldsymbol{Z}) =
\prod_{i = 1}^{N} \binom{y_i + \zeta - 1}{y_i}
\left(\frac{\mu_i}{\mu_i + \zeta}\right)^{y_i}
\left(\frac{\zeta}{\mu_i + \zeta}\right)^{\zeta}
\label{eqn:negbinom} \tag{5}\]
with \(\mu_i = \exp(\eta_i)\), \(\zeta \in (0, \infty)\), and
\(\boldsymbol{\theta} = (\boldsymbol{\beta}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{u}^{\scriptscriptstyle \mathsf{T}}, \boldsymbol{\tau}^{\scriptscriptstyle \mathsf{T}}, \zeta)^{\scriptscriptstyle \mathsf{T}}\).
The mathematical details of the “advanced” distributional families (see
section 4.2.1 above) may be found in the *brms* vignette
“Parameterization of Response Distributions in brms”.

At the top of page “Prior”, the user obtains a preview of the default
priors taken from *brms* (Figure 4). At the bottom,
the user may specify custom priors (Figure 5). Both,
the default and the custom priors, refer to the parameters of the
currently specified likelihood. Custom priors may be specified as
follows:

*via*a Stan function,*via*one of the special*brms*(pseudo-)functions designed for this purpose, e.g., for the Lewandowski-Kurowicka-Joe (LKJ) prior (Lewandowski et al. 2009),*via*an empty input field to specify a flat prior over the whole support of the corresponding parameter(s).

The first two possibilities always lead to a proper prior. The flat prior is only proper if the support is bounded on both sides. Otherwise (which is the more common case), the flat prior is improper.

The user’s selections on page “Prior” ultimately lead to the
specification of \(p(\boldsymbol{\theta})\). Together with the likelihood,
this completes the model specification. Note that for multilevel models,
\(p(\boldsymbol{\theta})\) here^{6} includes the distributions of the
group-level effects \(\boldsymbol{u}\) as well, even though they do not
need to be specified on page “Prior”.

Page “Posterior” has six tabs: “Run Stan”, “MCMC diagnostics”, “Default
summary”, “Custom summary”, “Conditional effects”, and “Launch
*shinystan*”. These will now be described in turn. The output shown on
these tabs may also be downloaded (with the file format depending on the
specific type of output).

At the top of tab “Run Stan”, the user may inspect and download the Stan
code and the so-called Stan data. The Stan data basically consists of
the pre-processed part of the chosen dataset which is needed for the
Stan model, extended by some internal objects. Apart from checking or
documentation purposes, the Stan code and the Stan data are needed if
the user wants to customize the Stan code and then run Stan outside of
*shinybrms*.

Further down on tab “Run Stan”, the user may set advanced options for the Stan run (Figure 6). These options have sensible defaults, but sometimes they need to be changed. Probably the most important option is the seed for the pseudorandom number generator.

The final panel on tab “Run Stan” is the central one
(Figure 7): By a click on the “Run Stan” button,
Stan translates the Stan code written by *brms* to C++ code, compiles
this C++ code, and then starts sampling. By default, Stan writes its
sampling progress to an HTML file which is automatically opened up by
*shinybrms*. The user then only needs to refresh this HTML file to see
the current sampling progress. An example for Stan’s runtime will be
given in section 5.

When Stan has finished sampling, the panel “Run Stan” automatically
refreshes, in particular to show the result from an overall check of the
MCMC diagnostics (see section 4.4.2 for details). The user
also has the possibility to download different output objects which can
be analyzed outside of *shinybrms* or—in case of the fitted model
object of class `"brmsfit"`

—uploaded in a new *shinybrms* session (to
avoid re-running Stan).

On tab “MCMC diagnostics” (Figure 8), the user obtains detailed information concerning the following MCMC diagnostics:

- HMC-specific diagnostics:
- the number of iterations ending with a divergence,
- the number of iterations hitting the maximum tree depth,
- the Bayesian fraction of missing information for the energy transitions (E-BFMI);

- some general MCMC diagnostics (which are computed for each parameter
as well as for the accumulated log posterior density):
- the modified potential scale reduction factor \(\widehat{R}\)
proposed by Vehtari et al. (2021) (here simply called
*the*\(\widehat{R}\) instead of*the modified*\(\widehat{R}\))^{7}, - the effective sample size (ESS) in the bulk of the corresponding marginal posterior (Vehtari et al. 2021),
- the ESS in the tails of the corresponding marginal posterior (Vehtari et al. 2021).

- the modified potential scale reduction factor \(\widehat{R}\)
proposed by Vehtari et al. (2021) (here simply called

As a full description of these MCMC diagnostics is out of the scope of
this article, we refer the interested reader to
Stan Development Team (2022b), Betancourt (2018), and
Vehtari et al. (2021). The most important basic guidelines
for deciding whether these MCMC diagnostics are worrying are explained
in the *shinybrms* GUI and also checked automatically by *shinybrms*.
For the general MCMC diagnostics, it is also possible to show a detailed
table with the diagnostics for each parameter (as well as for the
accumulated log posterior density).

Tab “Default summary” (Figure 9) shows *brms*’s
standard robust summary of the posterior inference, e.g., the medians
and the central \(95\,\%\) intervals of the marginal posteriors (the
\(95\,\%\) CrIs) of the most important parameters. This tab is only
intended for a quick inspection. A much more comprehensive analysis of
the Stan output is offered by the *shinystan* app (see
section 4.4.6).

On tab “Custom summary” (Figure 10), the user may calculate posterior summary quantities for a custom mathematical (or logical) expression involving at least one parameter. Such an expression may be, e.g., a sum of two parameters (as shown in Figure 10) or the event that a parameter exceeds a certain threshold.

On tab “Conditional effects” (Figure 11), *shinybrms*
offers conditional-effects plots (created by
`brms::conditional_effects()`

). A conditional-effects plot shows the
estimated effect of a predictor variable on the outcome. An interaction
effect involving at most two predictor variables may also be visualized
by showing the estimated effect of the first predictor variable
separately for appropriate values of the second predictor variable.

As described in more detail in the *shinybrms* GUI, a
conditional-effects plot *conditions* on specific values of those
predictor variables which are not involved in the plot. Likewise,
group-level effects which are not involved in the plot are (usually) set
to zero.

The *shinystan* app (Gabry 2022) offers an interactive
inspection of Stan (and other MCMC-generated) results, in particular
with respect to:

- MCMC diagnostics (including several additional diagnostics not
covered by
*shinybrms*’s tab “MCMC diagnostics”), - PPCs,
- summary quantities of univariate marginal posteriors,
- plots of univariate, bivariate, and trivariate marginal posteriors.

Before launching the *shinystan* app from within the *shinybrms* app by
clicking the corresponding button on tab “Launch *shinystan*”, the user
may set a seed to ensure the reproducibility of the PPCs.

At this point, the *shinybrms* workflow ends and passes over to the
*shinystan* workflow. We will illustrate the *shinystan* workflow in
section 5.

We illustrate *shinybrms*’s features following the workflow implied by
Figure 1 and using a real-world dermatological
dataset from Van Welzen et al. (2021). This dataset is available in the
Supplement (file `CAP.csv`

). In Supplement section “Frequentist analysis
of the example”, we compare the Bayesian analysis presented here with a
frequentist one, referring to the list of advantages of Bayesian
statistics from section 1.

Van Welzen et al. (2021) conducted a prospective pilot study investigating
the efficacy and safety of a novel cold atmospheric plasma (CAP) wound
dressing for the healing of split-skin graft donor sites. The only
outcome we focus on here is the tissue water index (TWI), measured by a
hyperspectral imaging camera and having values between 0 and 100, with
lower TWI values being associated with an improved wound healing.
Briefly, the study design was as follows: For each of \(P = 10\) patients,
the TWI was measured under \(T = 3\) different treatment conditions
(standard-treated wound, CAP-treated wound, and healthy skin). Each
treatment condition was investigated in its own skin area with \(R = 3\)
measurements across that area. This procedure of measuring was repeated
on each of \(D = 4\) days (day 1, 3, 5, and 7, with day 1 being the day of
the split-skin graft donation where the TWI was measured *after* the
split-skin graft donation but *before* the first wound dressing). Thus,
the dataset consists of \(N = P \cdot T \cdot R \cdot D = 360\) observations (rows). The dataset’s columns are:

`patID`

(for “patient ID”; coded as`"pat1"`

, …,`"pat10"`

),`age`

(in years),`anticoagulation`

(indicating whether the patient received an anticoagulation therapy before and during the study; coded as`"no"`

and`"yes"`

),`diabetes`

(indicating whether the patient is diabetic; coded as`"no"`

and`"yes"`

),`day`

(coded as`"d1"`

,`"d3"`

,`"d5"`

, and`"d7"`

),`trt`

(coded as`"0_standard"`

,`"CAP"`

, and`"healthy"`

to make`"0_standard"`

the reference level),`TWI`

(integers in the interval \([0, 100]\)).

The primary research question is whether the CAP treatment leads to a
decreased^{8} TWI compared to the standard treatment (polyhexanide wound
gel with fatty gauze). The healthy skin area serves as an *experimental*
control (albeit not as a control *treatment* since this is the role of
the standard-treated wound area) and is not part of the primary research
question.

After launching the *shinybrms* app in R *via*

```
> library("shinybrms")
> launch_shinybrms(launch.browser = TRUE)
```

(with `launch.browser`

set to `TRUE`

to ensure that the app is opened up
in the default web browser), we switch to page “Data” where we upload
the dataset (not shown here).

Next, we head over to page “Likelihood”. On tab “Outcome”
(Figure 2), we choose the outcome variable `TWI`

and the
Gaussian family as the distributional family for this outcome. Clearly,
the TWI values cannot follow an unmodified Gaussian distribution since
they are bounded by \(0\) and \(100\), with the minimum of the observed TWI
values being indeed as low as \(9\) (the maximum being \(67\)). Thus, a
truncated Gaussian distribution might be more appropriate here. We will
come back to this later in section 6.

On tab “Predictors” (Figure 3), we choose `age`

,
`anticoagulation`

, `diabetes`

, `day`

, and `trt`

to have population-level
main effects and `patID`

to have group-level main effects (“random
intercepts”). Further down on tab “Predictors”, we add an interaction
between `day`

and `trt`

(not shown here). This interaction is included
because the TWI is supposed to show a stronger time-dependence in the
two wound areas than in the healthy skin area. Additionally, the TWI
difference (in means) between the standard and the CAP treatment might
change over time.

Now the likelihood is set up, so we can proceed with the prior. The
default priors (Figure 4) are reasonable, but suppose
we wanted a weakly informative Student-\(t\) prior with \(3\) degrees of
freedom, a location parameter of \(0\), and a scale parameter of \(30\) for
all regression coefficients. To add this custom prior, we choose
parameter class `b`

from the corresponding drop-down list shown in
Figure 5, enter `student_t(3, 0, 30)`

into the input
field entitled “Prior distribution”, and click the “Add prior” button.
After doing so, our Student-\(t\) prior is added to the preview table
(Figure 5, right-hand side). For all remaining
parameters for which we do not specify a custom prior, the corresponding
default prior will be used.