R package GofCens: Goodness-of-Fit Methods for Right-Censored Data

This paper presents the R package GofCens, which offers graphical tools and implements goodness-of-fit tests for right-censored data. The first part provides a thorough review of current methodologies for assessing goodness of fit in the presence of right-censored data. The subsequent sections present the main functions of GofCens and are illustrated by means of a right-censored sample from a log-normal distribution and a data set on NBA players’ mortality, which is included in the package.

Mireia Besalú (Department of Statistics and Operations Research, Biostatistics and Bioinformatics Group and Institute for Research and Innovation in Health (IRIS)), Klaus Langohr (Department of Statistics and Operations Research, Biostatistics and Bioinformatics Group and Institute for Research and Innovation in Health (IRIS)), Matilde Francisco (Department of Statistics and Operations Research and Biostatistics and Bioinformatics Group), Arnau Garcia-Fernández (Department of Statistics and Operations Research and Biostatistics and Bioinformatics Group), Guadalupe Gómez Melis (Department of Statistics and Operations Research, Biostatistics and Bioinformatics Group and Institute for Research and Innovation in Health (IRIS))
2025-10-21

1 Introduction

Goodness-of-fit techniques are important to test the validity of parametric models and to ensure that the modeling assumptions hold true. Historically, goodness-of-fit tests have been developed for complete data, that is, when all the individual sample measurements have been observed. The Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling statistics are among the most commonly used goodness-of-fit tests. They are based on a measure of the discrepancy between the empirical and theoretical distribution functions.

In order to account for censored data, these statistics can be modified by replacing the empirical distribution function with the product-limit estimator of the distribution function. Modifications of Kolmogorov-Smirnov statistics for censored or truncated data go back to Fleming et al. (1980) and Koziol and Byar (1975). Pioneering extensions of the Cramér-von Mises and Anderson-Darling goodness-of-fit statistics to account for random right-censored data are found in Pettitt and Stephens (1976) and Koziol and Green (1976). Concerning chi-squared tests, Mihalko and Moore (1980) propose an extension to censored data, specifically for type II right-censored data.

When the theoretical distribution function is completely specified and the data are uncensored, the above tests are all distribution-free, with known distributions. However, this property no longer holds when data are censored or when the theoretical distribution function involves unknown parameters. Moreover, there is no general theory of asymptotic optimality in such cases (Lehmann and Romano 2005). In fact, no test can achieve high asymptotic power or perform uniformly well against local or contiguous alternatives, especially when the family of possible alternatives is extensive (Janssen 2000). Given these limitations, graphical techniques have become a standard and straightforward way to assess distributional assumptions. They allow for a more intuitive examination of model fit. In addition, graphical methods complement formal goodness-of-fit tests and often provide insights that the tests alone may not reveal.

The most well-known plots for assessing goodness of fit are probability plots. These include the Probability–Probability (P–P) plot, which compares the theoretical and estimated cumulative distribution functions, and the Quantile–Quantile (Q–Q) plot, which compares theoretical and estimated quantiles. There are also two alternatives to these plots: the Stabilised Probability plot (Michael 1983), which transforms the axes of the P-P plot so that each plotted point has approximately the same variance, and the Empirically Rescaled plot (Waller and Turnbull 1992), which is very useful when there is a high percentage of random right-censored data.

Goodness-of-fit methods for complete data and right-censored observations are widely available. However, many of these methods are not implemented in R, and those that are tend to be scattered across different packages. Among the publicly available computational tools in R, the fitdistrplus package (Delignette-Muller and Dutang 2015; Delignette-Muller et al. 2025) provides the function fitdistcens(), which returns the result of the fit of any parametric distribution to a possibly right-censored data set. The function cdfcomp() of the same package can be used to graphically compare multiple fitted distributions with uncensored data. The probplot() function from the e1071 package (Meyer et al. 2024) generates probability plots for a specified theoretical distribution. Additionally, the distChooseCensored() function from the EnvStats package (Millard 2013; Millard and Kowarik 2025) performs Shapiro-Wilk and Shapiro-Francia goodness-of-fit tests for normal, log-normal, and Gamma distributions, handling both complete and singly censored data (Royston 1993).
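As a brief illustration (our own sketch, not an example taken from these packages' documentation), fitdistcens() expects a data frame with columns left and right, where right = NA flags a right-censored observation; the vectors times and delta below are assumed to hold observed times and event indicators:

R> library("fitdistrplus")
R> cens.df <- data.frame(left = times, right = ifelse(delta == 1, times, NA))
R> fitdistcens(cens.df, "lnorm")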

In this work, we present the R package GofCens (Langohr et al. 2025), which provides various analytical methods and graphical tools to assess the goodness of fit of parametric models for non-negative, right-censored lifetime data. These tools can also be applied to complete lifetime data. Right censoring is restricted to type I or random censoring, and non-informative censoring is assumed for the failure time. In what follows, we review the analytical and graphical goodness-of-fit techniques for complete and right-censored data implemented in GofCens. Thereafter, the main functions of the GofCens package are presented and subsequently applied to a real data set. We conclude the paper with some final remarks on the current state of the package and possible enhancements.

2 Methods

Let \(T\) denote the time to an event of interest, with distribution function \(F\). The GofCens package provides goodness-of-fit methods to assess whether a univariate sample from \(T\), either right-censored or complete, comes from a specified distribution family \(F_0(t; \theta)\), such as the Weibull, where \(\theta\) is a vector of unknown parameters. Formally, the null hypothesis in a goodness-of-fit test is given by \(H_0\!: F(t)=F_0(t;\theta)\). Specifically, the GofCens package provides implementations of well-known tests such as the Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling tests based on the empirical distribution function for complete data and their extensions for right-censored data. Additionally, GofCens includes a chi-squared-type test based on the squared differences between observed and expected counts using random cells, with an extension tailored for right-censored data. Recognizing that these tests may not always yield high power, GofCens complements them with a series of graphical tools to aid in selecting the most suitable parametric model.

Goodness-of-fit tests based on the empirical distribution function are typically developed under a fully specified null hypothesis \(H_0\!: F(t) = F_0(t; \theta^*)\), such as a Weibull distribution with known parameters (e.g., \(\alpha = 2\) and \(\beta = 1\)). They are less often formulated for the more general case \(H_0\!: F(t)=F_0(t;\theta)\) , where \(\theta\) is unknown. This distinction has important implications for how such tests are applied in practice. This situation is common when fitting empirical data to parametric distributions with unknown parameters. To address it, we replace the unknown parameter \(\theta\) with its maximum likelihood estimator \(\hat\theta\). Following the recommendations in Capasso et al. (2009), the implementation of the four tests —Kolmogorov-Smirnov, Cramér-von Mises, Anderson-Darling, and chi-squared tests— uses bootstrap techniques that involve the maximum-likelihood re-estimation of the unknown parameters \(\theta\) in each simulated sample as we describe in Subsection Bootstrap methods.

In what follows, we briefly outline the theory underlying these methods for testing \(H_0\!: F(t) = F_0(t; \theta^*)\) assuming that the data consist of a sample of \(n\) individuals, each subject to random right censoring. We denote by \(T_1, \dotsc, T_n\) independent random variables with a common distribution \(F(\cdot)\) and by \(C_1, \dotsc, C_n\) independent censoring random variables with a common distribution \(H(\cdot)\), independent of \(T_1, \dotsc, T_n\). The observed variables are defined as \(Y_i = \min(T_i, C_i)\) and \(\delta_i = \mathbf{1}\{T_i\leq C_i\},\, i = 1, \dotsc, n\).

Tests based on empirical distribution functions

In order to test \(H_0\!: F(t) = F_0(t; \theta^*)\) for all \(t\geq 0\), we can use statistics based on a distance between \(F_0(t; \theta^*)\) and \(\widehat{F}_n(t)\), the empirical distribution function for complete data, or either the Kaplan-Meier or the Nelson-Aalen estimator, if right-censored data are present. In both cases, large values of the distance indicate evidence against the hypothesized model.

Kolmogorov-Smirnov test

The Kolmogorov-Smirnov goodness-of-fit test is the most widely used analytical method for assessing goodness of fit with uncensored data. Fleming et al. (1980) published a modification of the Kolmogorov-Smirnov test aimed at increased power for uncensored data and generalized this test to arbitrarily right-censored data.

For complete data, the Kolmogorov-Smirnov statistic is defined as \(D_n = \sup_t |\widehat{F}_n(t)- F_0(t; \theta^*)|\), where \(\widehat{F}_n(t)\) is the empirical distribution function. To account for right-censored data, the Kolmogorov-Smirnov statistic is modified as \[\label{eq:KSstat} \widehat{D}_n = \sup_{0\leq t\leq t_m} |\widehat{F}_n(t)-F_0(t; \theta^*)| = \sup_{0\leq t\leq t_m} \left|\int_0^t \frac{\widehat{S}_n(t)S_0(s; \theta^*)}{\widehat{S}_n(s)}d\big[\widehat{\Lambda}_n(s)-\Lambda_0(s; \theta^*)\big]\right|. \tag{1}\] Here, the empirical distribution function \(\widehat{F}_n\) is replaced by \(\widehat{F}_n = 1 - \widehat{S}_n = 1 - e^{-\widehat{\Lambda}_n}\), where \(\widehat{\Lambda}_n\) is the Nelson-Aalen estimator of the cumulative hazard function. In this case, \(S_0\) and \(\Lambda_0\) denote the survival and the cumulative hazard function of the hypothesized distribution \(F_0\), respectively, and \(t_m\) is the largest observed time in the sample.

The standardized modified Kolmogorov-Smirnov statistic for censored data converges in distribution to \(\sup_{0\leq t\leq F_0(t_m; \theta^*)} |B(F_0(t; \theta^*))|\), where \(B(t)\) is a Brownian bridge. According to the results in Koziol and Byar (1975), the asymptotic distribution of \(\widehat{D}_n\) is given by \[\label{eq:aprxKS} \lim_{n\rightarrow\infty} P(\sqrt{n}\widehat{D}_n\leq k) = \sum_{j = -\infty}^\infty (-1)^{j}e^{-2j^2k^2} P\Big(\Big|Z-2jk \sqrt{\frac{1-T}{T}} \Big| < k \sqrt{\frac{1}{T - T^2}}\Big), \tag{2}\] where \(Z\) is a standard normal random variable and \(T = F_0(t_m; \theta^*)\). Fleming et al. (1980) propose an approximation for the corresponding \(p\) value, which is considered acceptable when the \(p\) value is less than 0.8 and excellent when it is less than 0.2.
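As an illustration, the series in (2) can be evaluated numerically by truncating the infinite sum. The following R sketch is our own illustration (not the implementation used by KScens()); k stands for the standardized statistic \(\sqrt{n}\,\widehat{D}_n\) and Tm for \(T = F_0(t_m; \theta^*)\):

## Rough numerical sketch of the asymptotic approximation in (2),
## truncating the sum at |j| <= jmax.
pKScens.asymp <- function(k, Tm, jmax = 50) {
  j  <- -jmax:jmax
  cj <- 2 * j * k * sqrt((1 - Tm) / Tm)   # centering term of the normal probability
  h  <- k * sqrt(1 / (Tm - Tm^2))         # half-width of the interval
  sum((-1)^j * exp(-2 * j^2 * k^2) * (pnorm(cj + h) - pnorm(cj - h)))
}
## Approximate p value for an observed statistic Dn with sample size n:
## 1 - pKScens.asymp(sqrt(n) * Dn, Tm)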

Cramér-von Mises-type test

For uncensored data, the Cramér-von Mises statistic is given by \[M_n = n\int_{-\infty}^{+\infty}\big(\widehat{F}_n(t)-F_0(t; \theta^*)\big)^2dF_0(t).\]

For right-censored data, let \(Y_1, \dotsc, Y_{n_r}\) represent the \(n_r\) observed failure times, that is, \(\sum_{i = 1}^n \delta_i = n_r\). If we transform the ordered failure times \(Y_{(1)}, \dotsc, Y_{(n_r)}\) into Uniform\((0,1)\) random variables \(u_{(1)}, \dotsc, u_{(n_r)}\), defined as \(u_{(i)} = F_0(Y_{(i)};\theta^*)\), and adopt the convention that \(u_{(0)} = 0\) and \(u_{(n_r + 1)} = 1\), we can express the modified Cramér-von Mises statistic as \[\label{eq:CvMstat} \widehat{M}_n = n_r\sum_{j=1}^{n_r+1} \widehat{F}_n(u_{(j-1)})(u_{(j)}-u_{(j-1)}) \left(\widehat{F}_n(u_{(j-1)})-(u_{(j)}+u_{(j-1)})\right) + \frac{n_r}{3}, \tag{3}\] which can be easily computed. However, although the asymptotic distribution of \(\widehat{M}_n\) has been studied by Pettitt and Stephens (1976) and Koziol and Green (1976), its practical implementation is not straightforward.

Anderson-Darling test

The Anderson-Darling test statistic is a modification of \(M_n\) where the integrand is weighted by the inverse of \(F_0(t)(1 - F_0(t))\): \[A_n = n\int_{-\infty}^{+\infty}(\widehat{F}_n(t)-F_0(t; \theta^*))^2\frac{dF_0(t; \theta^*)}{F_0(t; \theta^*)(1-F_0(t; \theta^*))}.\]

For right-censored data, using the same notation as in the Cramér-von Mises statistic, the Anderson-Darling statistic is computed as \[\begin{aligned} \label{eq:ADstat} \begin{split} \widehat{A}_n = -n_r &+ n_r\sum_{j = 1}^{n_r}(\widehat{F}_n(u_{(j - 1)}) - 1)^2 \left[\log|1 - u_{(j-1)}| - \log|1 - u_{(j)}|\right]\\ &+ n_r\sum_{j=1}^{n_r-1}\widehat{F}_n^2(u_{(j)})\left[ \log |u_{(j+1)}| - \log |u_{(j)}| \right]-n_r\log|u_{(n)}|. \end{split} \end{aligned} \tag{4}\] As with the Cramér-von Mises statistic, the asymptotic distribution of \(\widehat{A}_n\) has been studied by Pettitt and Stephens (1976) but its practical implementation is not straightforward.

Chi-squared type tests

Kim (1993) introduced a general class of \(\chi^2\) goodness-of-fit statistics for randomly right-censored data, extending Pearson's statistic by replacing the fixed cells with random cells.

Random cells are built as intervals in \(\mathbb{R}\) whose boundaries depend on some sample summaries. These boundaries are defined by a vector \(\varphi\) of \(r\) statistics, yielding intervals \(I_j(\varphi)=(a_{j-1}(\varphi),\,a_{j}(\varphi)]\), where \(j = 1, \dotsc, M\), and \(-\infty=a_0(\varphi) < \, a_1(\varphi) \, < \,\dotsb < \, a_M(\varphi) = \infty\). For instance, we might take the quantiles \(1/M,\, 2/M, \dotsc, 1\) as the random cell boundaries \(a_1(\varphi), \dotsc, a_M(\varphi)\).

The observed frequency of cell \(j\), \(N_{nj}(\varphi) = n\int_{I_j(\varphi)} d\widehat{F}_n\), is calculated using either the empirical distribution function or 1 minus the Kaplan-Meier estimator of the survival function if there are right-censored data. The expected probability of cell \(j\) under the null hypothesis of a fully specified distribution, \(F_0(t; \theta^*)\), is denoted by \(p_j(\theta^*, \varphi_n)\).
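As a rough sketch of these building blocks (an illustration of ours, not the chisqcens() internals), the observed and expected cell counts can be obtained as follows for a Weibull null hypothesis with known parameters shape0 and scale0, using empirical quantiles of the Kaplan-Meier estimate as random cell boundaries; the vectors times and delta are assumed to hold observed times and event indicators:

## Observed and expected counts for M random cells (sketch).
library(survival)

cell.counts <- function(times, delta, M, shape0, scale0) {
  n  <- length(times)
  km <- survfit(Surv(times, delta) ~ 1)
  ## Cell boundaries: a_0 = 0 for nonnegative lifetimes, then empirical quantiles.
  ## With heavy censoring some quantiles may be NA and cells would have to be
  ## merged (cf. the 'Original' and 'Final' cell numbers reported by chisqcens()).
  a  <- c(0, quantile(km, probs = (1:(M - 1)) / M)$quantile, Inf)
  Fn <- stepfun(km$time, c(0, 1 - km$surv))        # Kaplan-Meier estimate of F
  Fa <- c(Fn(a[1:M]), 1 - min(km$surv))            # F_n at the cell boundaries
  list(observed = n * diff(Fa),                    # N_nj
       expected = n * diff(pweibull(a, shape0, scale0)))  # n * p_j(theta*, phi_n)
}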

The class of \(\chi^2\) statistics is defined as \[\label{eq:Chistat} T_n = V_n^\top(\theta^*, \varphi_n) \cdot K_n(\theta^*, \varphi_n)\cdot V_n(\theta^*,\varphi_n), \tag{5}\] where \(K_n(\theta^*,\varphi)\) is a symmetric, non-negative definite matrix. \(K_n(\theta^*,\varphi)\) is, in most cases, the inverse of a consistent estimator of the asymptotic variance-covariance matrix of the random vector \(V_n(\theta^*,\varphi_n)\). The matrix can be computed following formulas (3.1) in Kim (1993). The vector \(V_n(\theta^*,\varphi) = (\nu_{nj})_{j = 1, \dotsc, M}\) contains the differences between observed and expected frequencies. Each component is defined as \[\nu_{nj}(\theta^*, \varphi_n) = n^{-\frac12}N_{nj}(\varphi_n) - n^{\frac12} \cdot p_j(\theta^*, \varphi_n).\] The limiting distribution of \(T_n\) is a linear combination of independent noncentral \(\chi^2\)-distributed variables, which is complex to implement.

Computation of the test statistics' distributions

The limiting distributions of the statistics in (3), (4), and (5) under the null hypothesis \(H_0\!: F(t) = F_0(t; \theta),\) when the parameter \(\theta\) is unknown and replaced by its maximum likelihood estimate \(\hat\theta\), are too complex to implement directly. Therefore, the corresponding \(p\) values are computed using bootstrap methods (see Subsubsection Bootstrap methods for right-censored data). For consistency, the \(p\) values of the Kolmogorov-Smirnov test are also obtained using bootstrap.

The function KScens() of the GofCens package calculates the Kolmogorov-Smirnov test statistic and, by default, approximates the \(p\) value using bootstrap methods; alternatively, the \(p\) value can be computed based on the asymptotic distribution given in (2). The functions CvMcens() and ADcens() calculate the observed Cramér-von Mises and Anderson-Darling statistics adapted for right-censored data, respectively, and approximate the corresponding \(p\) values using bootstrap methods. Likewise, the chisqcens() function uses bootstrap techniques to compute the \(p\) value of the chi-squared type test.

Bootstrap methods for right-censored data

Since the asymptotic distributions of the described goodness-of-fit statistics are unknown when the null hypothesis is not fully specified, bootstrap techniques offer a practical alternative for computing \(p\) values. The bootstrap method is also a practical alternative when \(H_0\) is fully specified but the asymptotic distribution depends on a complex expression. This section describes the application of bootstrapping to derive \(p\) values for the tests presented in the previous subsection. We use the notation \(G_n\) to refer to any of the following test statistics: the Kolmogorov–Smirnov statistic defined in (1), the Cramér–von Mises statistic in (3), the Anderson–Darling statistic in (4), and the chi-squared statistic in (5).

To test the null hypothesis \(H_0\!:F(t) = F_0(t; \theta)\) with complete or right-censored data, we follow the guidelines of Capasso et al. (2009). Accordingly, the observed data are used to obtain the maximum likelihood estimate of the parameter \(\theta\), denoted by \(\hat\theta_n\). Subsequently, we generate \(B\) independent bootstrap samples of the same size (\(n\)) as the original data set. Specifically, for each iteration \(b\), \(b = 1, \dotsc, B\), we compute the bootstrap statistic \((\widehat{G}_n^{\hat{\theta}_n})_b\) by carrying out the following steps:

  1. Generation of survival times \(T_1^b, \dotsc, T_n^b\) from the fitted distribution \(F_0(t;\hat{\theta}_n)\).

  2. Generation of censoring times \(C_1^b, \dotsc, C_n^b\) from the non-parametric estimation of \(H\) obtained with the Kaplan-Meier estimator.

  3. Generation of observed survival times \(Y_i^b = \min(T_i^b, C_i^b)\), and event indicators \(\delta_i^b = \mathbf{1}\{T_i^b \le C_i^b\}, i = 1, \dotsc, n\).

  4. Maximum likelihood estimation of the parameter, \(\hat{\theta}_n^b\), given \((Y_i^b, \delta_i^b), i = 1, \dotsc, n\).

  5. Computation of the bootstrap statistic, \((\widehat{G}_n^{\hat{\theta}_n})_b\).

By repeating this process for many bootstrap samples —the default value of the functions KScens(), CvMcens(), ADcens(), and chisqcens() is \(B = 999\)—, the sequence of bootstrap statistics, \((\widehat{G}_n^{\hat{\theta}_n})_b\), \(b = 1, \cdots, B\), represents the empirical distribution of the statistic \(G_n\) under the null hypothesis.

The \(p\) value is calculated by comparing the observed test statistic \(\widehat{G}_n\) to the empirical distribution of the bootstrap statistic. Specifically, the \(p\) value is the proportion of bootstrap statistic values \((\widehat{G}_n^{\hat{\theta}_n})_b\) greater than or equal to the observed statistic \(\widehat{G}_n\): \[p = \frac{1}{B + 1} \big(\sum_{b=1}^B \mathbf{1}\{\widehat{G}_n \leq (\widehat{G}_n^{\hat{\theta}_n})_b\} + 1\big),\] where adding 1 in both the numerator and the denominator avoids a zero \(p\) value.

Note that steps 2 and 3, which involve generating the right-censored sample, are crucial. This is because the distribution of \(G_n\) depends not only on \(F_0\) but also on \(H\) and the proportion of censored data. The functions KScens(), CvMcens(), ADcens(), and chisqcens() are designed so that, if the original data contains no right-censored observations, the censoring times in step 2 are automatically set to \(\infty\).
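The following R code is a minimal sketch of this parametric bootstrap for a Weibull null hypothesis. It is our own illustration based only on the survival package, not the implementation used by the GofCens test functions, and ks.stat() is a simplified stand-in for the statistic \(G_n\):

library(survival)

## Simplified Kolmogorov-Smirnov-type distance between the Kaplan-Meier
## estimate of F and the hypothesized Weibull distribution function.
ks.stat <- function(times, delta, shape, scale) {
  km <- survfit(Surv(times, delta) ~ 1)
  max(abs((1 - km$surv) - pweibull(km$time, shape, scale)))
}

boot.gof <- function(times, delta, B = 999) {
  n   <- length(times)
  fit <- survreg(Surv(times, delta) ~ 1, dist = "weibull")   # MLE of theta
  shape <- 1 / fit$scale; scale <- exp(coef(fit))
  obs <- ks.stat(times, delta, shape, scale)                 # observed statistic
  kmC <- survfit(Surv(times, 1 - delta) ~ 1)                 # KM estimate of H
  stats <- numeric(B)
  for (b in seq_len(B)) {
    Tb <- rweibull(n, shape, scale)                          # Step 1
    Cb <- quantile(kmC, probs = runif(n))$quantile           # Step 2 (inverse transform)
    Cb[is.na(Cb)] <- Inf                                     # beyond the largest censoring time
    Yb <- pmin(Tb, Cb); db <- as.numeric(Tb <= Cb)           # Step 3
    fitb <- survreg(Surv(Yb, db) ~ 1, dist = "weibull")      # Step 4: re-estimation
    stats[b] <- ks.stat(Yb, db, 1 / fitb$scale, exp(coef(fitb)))  # Step 5
  }
  (sum(stats >= obs) + 1) / (B + 1)                          # bootstrap p value
}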

Moreover, as explained and illustrated in Section ‘Goodness-of-fit tests’ below, these functions allow testing the null hypothesis of a fully specified choice of the parameters, \(H_0\!:F(t) = F_0(t;\theta^*)\). In this case, the survival times in step 1 are generated from \(F_0(t;\theta^*)\).

Goodness-of-fit plots

The GofCens package provides six different plots for evaluating goodness of fit, applicable to both right-censored and complete data. In most cases, a roughly straight line formed by the points suggests good agreement with the hypothesized theoretical distribution.

Probability-probability and Stabilized Probability plots

The probability-probability plot (P-P plot) maps \(\widehat{F}_0(t)\) against \(\widehat{F}_n(t)\), where \(\widehat{F}_0(t)\) corresponds to either \(F_0(t;\theta)\) with the unknown parameters replaced by their maximum likelihood estimates or to \(F_0(t;\theta^*)\). \(\widehat{F}_n\) represents either the empirical distribution function or 1 minus the Kaplan-Meier estimator of the survival function, if right-censored data are present.

In order to enhance the interpretability of the P-P plot and because some of the plotted points have a larger variability than others, Michael (1983) proposed the Stabilized Probability plot (SP plot). In the same way as the \(\arcsin\) transformation can be used to stabilize the variance of a uniform order statistic, this function can be applied to stabilize the variance of \(\widehat{F}_0(t)\). The variances of the resulting SP plotted points are all approximately equal.

Quantile-quantile plot

The quantile-quantile plot (Q-Q plot) maps the theoretical quantiles against the estimated quantiles, that is, \(\widehat{F}_0^{-1}(\widehat{F}_n(t))\) against \(t\). An empirical rescaling of the axes can help to overcome the problem that the plotted points might not be evenly spread, in particular in the presence of right-censored data. The Empirically Rescaled plot (ER plot) plots \(\widehat{F}_u(\widehat{F}_0^{-1}(\widehat{F}_n(t)))\) against \(\widehat{F}_u(t)\), where \(\widehat{F}_u(t)\) is the empirical cumulative distribution function of the points corresponding to the uncensored observations (Waller and Turnbull 1992).
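To make these constructions concrete, the following R sketch (our own illustration, not the probPlot() code) computes the P-P and SP plot coordinates for right-censored data, assuming a log-normal model fitted with survreg() and vectors times and delta of observed times and event indicators:

library(survival)

pp.coords <- function(times, delta) {
  km  <- survfit(Surv(times, delta) ~ 1)
  Fn  <- 1 - km$surv                                   # Kaplan-Meier estimate of F
  fit <- survreg(Surv(times, delta) ~ 1, dist = "lognormal")
  F0  <- plnorm(km$time, meanlog = coef(fit), sdlog = fit$scale)
  list(pp = data.frame(x = Fn, y = F0),                # P-P plot coordinates
       sp = data.frame(x = 2 / pi * asin(sqrt(Fn)),    # SP plot: arcsine square-root
                       y = 2 / pi * asin(sqrt(F0))))   # transformation of both axes
}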

Table 1 provides an overview of the axes of the probability and quantile-quantile plots.

Table 1: Probability and quantile-quantile plots to assess goodness of fit.
Plot Abscissa Ordinate
P-P plot \(\widehat{F}_n(t)\) \(\widehat{F}_0(t)\)
Q-Q plot \(t\) \(\widehat{F}_0^{-1}(\widehat{F}_n(t))\)
SP plot \(\frac{2}{\pi} \arcsin\big(\widehat{F}_n(t)^{\frac12}\big)\) \(\frac{2}{\pi} \arcsin\big(\widehat{F}_0(t)^{\frac12}\big)\)
ER plot \(\widehat{F}_u(t)\) \(\widehat{F}_u(\widehat{F}_0^{-1}(\widehat{F}_n(t)))\)

Cumulative hazard and Kaplan-Meier plots

Another set of plots consists of cumulative hazard plots, derived from transforming the cumulative hazard function \(\Lambda\) in such a way that it becomes linear in \(t\) or in \(\log(t)\). The Nelson-Aalen estimator \(\widehat{\Lambda}\) of \(\Lambda\) is computed from the data, and the distribution-specific transformation \(A(\widehat{\Lambda}(t))\) is then plotted against either \(t\) or its natural logarithm. The transformations for all the distributions considered in the GofCens package are detailed in Table 2.
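For instance (a sketch of ours, not the cumhazPlot() code), the Weibull transformation of Table 2 amounts to plotting the logarithm of the Nelson-Aalen estimate against \(\log t\); under a Weibull model, \(\log \Lambda(t) = \alpha \log t - \alpha \log \beta\) is linear in \(\log t\). Vectors times and delta are assumed as before:

library(survival)
km   <- survfit(Surv(times, delta) ~ 1)
Lam  <- cumsum(km$n.event / km$n.risk)   # Nelson-Aalen estimator of the cumulative hazard
keep <- Lam > 0                          # avoid log(0) before the first event
plot(log(km$time[keep]), log(Lam[keep]), xlab = "log t",
     ylab = "log of the Nelson-Aalen estimate")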

In addition, the GofCens package includes a function to graphically compare the Kaplan-Meier estimate of the survival function (\(1 - \widehat{F}_n(t)\)) with the parametric estimations of the survival function from the parametric models under study (\(1 - \widehat{F}_0(t)\)).
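A rough version of this comparison (again a sketch, not kmPlot() itself) overlays a fitted log-normal survival curve on the Kaplan-Meier estimate:

library(survival)
km  <- survfit(Surv(times, delta) ~ 1)
fit <- survreg(Surv(times, delta) ~ 1, dist = "lognormal")
plot(km, conf.int = TRUE, xlab = "Time", ylab = "Estimated survival probability")
curve(1 - plnorm(x, meanlog = coef(fit), sdlog = fit$scale), add = TRUE, lwd = 2)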

Table 2: List of the distributions that are implemented in the GofCens package. The last column shows the specific transformations of time and the estimated cumulative hazard function of the cumulative hazard plots, which are implemented by the function cumhazPlot(). The shape parameters are denoted by \(\alpha > 0\) and \(\gamma > 0\), the location parameter by \(\mu \in \mathbb{R}\), and the scale parameter by \(\beta > 0\). The time \(t\) is always positive. \(B(\alpha, \gamma)\) denotes the Beta function and \(B_{(\alpha,\gamma, a, b)}(t) = \int_0^{\frac{t - a}{b - a}} u^{\alpha - 1} (1-u)^{\gamma - 1}du\).
Distribution Survival function Cum. hazard function Plot
Exponential [Exp\((\beta)\)] \(\displaystyle e^{-\frac{t}{\beta}}\) \(\frac{t}{\beta}\) \(\widehat{\Lambda}(t)\) vs \(t\)
Weibull [Wei(\(\alpha,\,\beta\))] \(\displaystyle e^{-(\frac{t}{\beta})^\alpha}\) \((\frac{t}{\beta})^\alpha\) \(\log \widehat{\Lambda}(t)\) vs \(\log t\)
Gumbel [Gum(\(\mu,\,\beta\))] \(\displaystyle 1 - e^{-e^{-\frac{t-\mu}{\beta}}}\) \(-\log\big(1 - e^{-e^{-\frac{t-\mu}{\beta}}}\big)\) \(-\log\big(\!-\log(1 - e^{-\widehat{\Lambda}(t)})\big)\) vs \(t\)
Log-Logistic [LLogis(\(\alpha, \beta\))] \(\displaystyle \frac{1}{1 + \left(\frac{t}{\beta}\right)^\alpha}\) \(\log\big(1 + \big(\frac{t}{\beta}\big)^\alpha\big)\) \(\log\big(e^{\widehat{\Lambda}(t)} - 1\big)\) vs \(\log t\)
Logistic [Logis(\(\mu,\beta\))] \(\displaystyle \frac{e^{-\frac{t -\mu}{\beta}}}{1 + e^{-\frac{t - \mu}{\beta}}}\) \(\log\big(1 + e^{\frac{t - \mu}{\beta}}\big)\) \(\log\big(e^{\widehat{\Lambda}(t)} - 1\big)\) vs \(t\)
Log-Normal [LN(\(\mu,\beta\))] \(\displaystyle\int\limits_{\frac{\log t - \mu}{\beta}}^\infty \!\frac{1}{\sqrt{2 \pi}} e^{-\frac{x^2}{2}}dx\) \(-\log\big(1-\Phi\big(\frac{\log t-\mu}{\beta}\big)\big)\) \(\Phi^{-1}\big(1-e^{-\widehat{\Lambda}(t)}\big)\) vs \(\log t\)
Normal [N(\(\mu,\beta\))] \(\displaystyle \int\limits_t^\infty \! \frac{1}{\beta\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2 \beta^2}} dx\) \(-\log\big(1-\Phi\big(\frac{t-\mu}{\beta}\big)\big)\) \(\Phi^{-1}\big(1 - e^{-\widehat{\Lambda}(t)}\big)\) vs \(t\)
4-Parameter Beta [Beta(\(\alpha, \gamma, a, b\))] \(\displaystyle 1 - \frac{B_{(\alpha, \gamma, a, b)}(t)}{B(\alpha, \gamma)}\) \(-\log\big(1 - \frac{B_{(\alpha, \gamma, a, b)}(t)}{B(\alpha,\gamma)}\big)\) \(B^{-1}_{(\alpha, \gamma, a, b)} \big(B(\alpha, \gamma) \big(1 - e^{-\widehat{\Lambda}(t)}\big)\big)\) vs \(t\)

3 The GofCens package

Installation and dependencies

The GofCens package can easily be installed from any CRAN repository by running the R command

R> install.packages("GofCens")

As shown on the corresponding web page of the CRAN (https://cran.r-project.org/web/packages/GofCens/index.html), GofCens depends on the packages survival (Therneau and Grambsch 2000) and actuar (Dutang et al. 2008; Goulet et al. 2025). The former is needed to estimate the survival and cumulative hazard functions, whereas the latter implements several distributions that are not available in standard R packages. Additionally, GofCens imports functions from the following packages: the fitdist() and fitdistcens() functions of the fitdistrplus package (Delignette-Muller and Dutang 2015; Delignette-Muller et al. 2025) provide the estimation of the distributions’ parameters; the packages ggplot2 (Wickham 2016; Wickham et al. 2025), gridExtra (Auguie 2017), and the base package grid are needed for several features of the graphical tools; and the bootstrap methods to compute the \(p\) values of the Kolmogorov-Smirnov, Cramér-von Mises, Anderson-Darling, and chi-squared type tests are implemented with the function boot() of the boot package (Canty and Ripley 2024). Moreover, the is() function from the base package methods is used internally to verify object classes during function execution.

Package overview

The GofCens package provides both graphical tools and statistical test functions to assess goodness of fit for the distributions shown in Table 2.

The graphical functions, which facilitate visual evaluation of the model adequacy, include:

  - probPlot(): probability plots (P-P, SP, Q-Q, and ER plots) to compare the data with a hypothesized distribution.

  - cumhazPlot(): cumulative hazard plots based on the transformations shown in Table 2.

  - kmPlot(): plots that compare the Kaplan-Meier estimate of the survival function with parametric estimates.

The statistical test functions, which implement goodness-of-fit tests, include:

  - KScens(): Kolmogorov-Smirnov test adapted to right-censored data.

  - CvMcens(): Cramér-von Mises test for right-censored data.

  - ADcens(): Anderson-Darling test for right-censored data.

  - chisqcens(): chi-squared type test based on random cells.

  - gofcens(): a wrapper that runs the Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling tests simultaneously.

These functions share several features: although they are primarily designed for right-censored data, they can also be applied to complete datasets. They provide parameter estimates for the distributions under study and support two input formats: users can either provide vectors with the observed survival times and the corresponding event indicator, or use the common formula interface with a Surv object. Earlier versions of the functions were implemented in R by Febrer Galvany (2015), Besalú Mayol (2016), and García Carrasco (2017).
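For instance, both input formats can be used interchangeably and should yield the same result (a sketch, assuming the vectors times and delta created in the simulated example below):

R> KScens(times, delta, distr = "lognormal")
R> KScens(Surv(times, delta) ~ 1, distr = "lognormal")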

Specific features of the statistical test functions include the following: On the one hand, these functions can be used to assess the goodness of fit for any distribution, provided that the corresponding pname, dname, and rname functions are available. On the other hand, they return test-specific objects for which print, summary, and print.summary methods are implemented.

In addition, the GofCens package comes with a data frame called nba, which contains the survival times of 3962 former players of the National Basketball Association (NBA) until 2019. These data were analyzed previously by Martínez et al. (2022).

A simulated example

In the following, we present the main features of the eight functions and illustrate their use with simulated survival times. For this purpose, we generate 300 survival times from a log-normal distribution with location parameter \(\mu = 2\) and scale parameter \(\beta = 1\), i.e., \(T \sim LN(2,\,1)\), and 300 censoring times from an exponential distribution with scale parameter \(\beta = 20\), i.e., \(C \sim Exp(20)\):

R> set.seed(123)
R> survt <- round(rlnorm(300, 2, 1), 2)
R> censt <- round(rexp(300, 1 / 20), 2)

The observed right-censored survival times, \(Y = \min(T,\,C)\), and the corresponding event indicators, \(\delta = \mathbf{1}\{T \le C\}\), are computed as follows:

R> times <- pmin(survt, censt)
R> delta <- as.numeric(survt <= censt)

In total, 106 (35.3%) survival times of the generated sample are right-censored:

R> table(delta)
  0   1
106 194

Graphical tools: Functions probPlot(), cumhazPlot(), and kmPlot()

Function probPlot()

The probPlot() function allows users to assess graphically how well any of the distributions in Table 2 fits a sample of right-censored or complete survival times. For this purpose, the function implements the four probability and quantile-quantile plots in Table 1. By default, the function draws all four plots, but provides users with the option to choose only a subset of these. If the user does not provide specific distribution parameters, the maximum likelihood estimates are used to draw the plots. One graphical feature of the function is the option for users to choose between the base package graphics, which is the default, and the ggplot2 package for drawing the plots. In the former case, probPlot() internally calls the par() function, enabling users to set graphical parameters within the function call.

For the illustration of the probPlot() function, we assess how well the log-normal distribution (left panel of Figure 1) and the Weibull distribution (right panel of Figure 1) fit to the sample data. The four plots in the left panel show fairly straight lines, thus confirming that the underlying distribution is the log-normal distribution, whereas the plots in the right panel clearly show that the Weibull distribution would not be an adequate distribution to model the data.

The R code used to draw the plots in the left and right panels of Figure 1 is the following:

R> library("GofCens")
R> probPlot(times, delta, distr = "lognormal", cex.lab = 1.3)
R> probPlot(times, delta, distr = "weibull", ggp = TRUE)


Figure 1: Probability plots applied to a right-censored sample from a log-normal distribution assuming log-normal (left panel) and Weibull distributions (right panel). Both figures are drawn with the probPlot() function.

To illustrate additional functionalities of this function, we apply it again to the data set to assess the suitability of fitting a log-normal distribution with parameters \(\mu = 2\) and \(\beta = 1.5\). For this purpose, we employ the params0 argument, whose default value is NULL, to specify the parameter values. Additionally, the survival times and event indicators are now passed via the formula interface as a Surv object, and we utilize the m argument to display the four probability plots in a single row, as shown in Figure 2. From these plots, it becomes evident that this log-normal distribution does not fit well to the data. Furthermore, we set prnt = TRUE so that the function provides both the parameter values used in the probability plots and the maximum likelihood estimates of the parameters, together with the values of the Akaike (AIC) and the Bayesian (BIC) information criteria.

R> probPlot(Surv(times, delta) ~ 1, distr = "lognormal", m = matrix(1:4, 
+    nrow = 1), params0 = list(location = 2, scale = 1.5), ggp = TRUE, prnt = TRUE)

Distribution: log-normal 

Parameters used in probability plots:
Location: 2 
   Scale: 1.5 

Parameter estimates:
Location (se): 1.988 (0.058)
   Scale (se): 0.883 (0.045)

   AIC: 1266.899 
   BIC: 1274.307    
Figure 2: Probability plots applied to a right-censored sample from a log-normal distribution assuming a log-normal distribution with specific parameters (\(\mu = 2, \beta = 1.5\)). The plots are drawn with the probPlot() function.

Function cumhazPlot()

The cumhazPlot() function can be used to generate cumulative hazard plots for any of the distributions listed in Table 2. As previously noted, the cumulative hazard plot is based on a transformation of the cumulative hazard function to achieve linearity in either \(t\) or \(\log(t)\). Consequently, given a set of survival times, this function serves as a useful tool to assess which parametric model best fits the data.

By default, the function generates cumulative hazard plots for the Weibull, Gumbel, log-logistic, logistic, log-normal, and normal distributions. Additionally, users can include the exponential and beta distributions or select any subset of the available distributions. Similar to the probPlot() function, users can choose between the base graphics package (default) or ggplot2 to draw the plots.

Setting the argument prnt to TRUE (default is FALSE), the function also returns the maximum likelihood estimates for the specified distribution, along with the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values.

An illustration with the previously generated right-censored sample from the log-normal distribution is shown in Figure 3. As expected, the points of the cumulative hazard plot for the log-normal distribution show a fairly straight line, whereas most of the other distributions under study can clearly be discarded. This conclusion is further supported by comparing the AIC and BIC values across all distributions, with the log-normal distribution showing the lowest values.

The corresponding code is the following, where argument degs = 2 is used to return the parameter estimates, which are shown below, with two decimal digits (default is degs = 3).

R> cumhazPlot(times, delta, font.lab = 4, cex.lab = 1.3, degs = 2, prnt = TRUE)

Parameter estimates

weibull
   Shape (se): 1.27 (0.07)
   Scale (se): 10.98 (0.62)
   AIC: 1304.05 
   BIC: 1311.45 

loglogistic
   Shape (se): 1.94 (0.11)
   Scale (se): 7.18 (0.42)
   AIC: 1272.48 
   BIC: 1279.89 

lognormal
Location (se): 1.99 (0.06)
   Scale (se): 0.88 (0.05)
   AIC: 1266.9 
   BIC: 1274.31 

gumbel
Location (se): 6.38 (0.33)
   Scale (se): 4.95 (0.3)
   AIC: 1342.49 
   BIC: 1349.9 

logistic
Location (se): 8.51 (0.43)
   Scale (se): 3.84 (0.23)
   AIC: 1430.25 
   BIC: 1437.66 

normal
Location (se): 9.89 (0.49)
   Scale (se): 7.54 (0.38)
   AIC: 1456.22 
   BIC: 1463.63 
Figure 3: Cumulative hazard plots for six different distributions applied to a right-censored sample from a log-normal distribution. The plots are drawn with the cumhazPlot() function.

If users want to assess the goodness of fit of the exponential and the beta distribution and compare these with the fit of the log-normal distribution, they must specifically select these distributions, as shown in the following example. Notice that users need to provide the limits for the beta distribution, as the default domain, \((0,\,1)\), does not cover the range of the sample’s survival times. In this example, plots are drawn using the ggplot2 package (ggp = TRUE), and parameter estimates are not displayed on the screen. The resulting plots are shown in Figure 4, where we observe that neither the exponential nor the beta distribution fits the data well. The corresponding R code is the following.

R> cumhazPlot(times, delta, distr = c("exponential",  "beta", "lognormal"),
+    betaLimits = c(0, 100), ggp = TRUE)
Figure 4: Cumulative hazard plots for the exponential, beta, and log-normal distributions applied to a right-censored sample from a log-normal distribution. The domain of the beta distribution is \((0,\,100)\).

Function kmPlot()

The kmPlot() function allows users to graphically compare the nonparametric Kaplan-Meier estimate of the survival function for a right-censored or complete sample of survival times with parametric estimates based on different models. For this purpose, each parametric estimate is individually overlaid on the Kaplan-Meier estimator of \(S(t)\). As with the cumhazPlot() function, this is done by default for the Weibull, Gumbel, log-logistic, logistic, log-normal, and normal distributions, with the option to also include the exponential and beta distributions. The plots are generated using the graphics package by default, but setting ggp = TRUE enables the use of the ggplot2 package. With both options, the pointwise 95% confidence intervals are plotted.

In the example shown in Figure 5, the estimated survival functions for the six default distributions are overlaid on the nonparametric estimate of \(S(t)\). As expected, the log-normal distribution exhibits the best fit to the data. However, based on these plots, the log-logistic distribution might also be considered as a parametric model that fits the sample data well.

The R code used to draw the Kaplan-Meier plots in Figure 5 is the following:

R> kmPlot(times, delta, ggp = TRUE)
Figure 5: Kaplan-Meier plots for six different distributions applied to a right-censored sample from a log-normal distribution. The shaded areas represent the pointwise 95% confidence intervals of the survival function. The plots are drawn with the kmPlot() function.

Goodness-of-fit test functions

In this section, we briefly describe and illustrate the use of the five functions in the GofCens package that implement the tests presented in the ‘Methods’ section.

The basic print methods of all functions return the test statistic and the \(p\) value, whereas the summary methods also provide the maximum likelihood estimates of the parameters of the hypothesized distribution \(F_0(t; \theta)\) as well as the AIC and BIC values. The functions also have in common that they offer the option to test the null hypothesis for specific parameter values of \(F_0\), i.e., \(F_0(t; \theta^*)\). Each function returns an object of its own class, enabling the use of object-specific print and summary methods.

Function KScens()

The KScens() function enables users to assess the goodness of fit using the Kolmogorov-Smirnov test adapted to right-censored data. It supports the eight distributions in Table 2 as well as any other distribution for which the corresponding pname, dname, and rname functions are implemented. The function computes the test statistic in (1) and, by default, estimates the \(p\) value using bootstrap methods.

As illustrated below with the data from the generated right-censored sample, the function returns, by default, a list with two elements:

  - the value of the Kolmogorov-Smirnov test statistic, and

  - the corresponding (bootstrap) \(p\) value.

Additionally, the summary method provides two more elements:

  - the maximum likelihood estimates of the distribution parameters together with their standard errors, and

  - the AIC and BIC values.

In the illustration, we run the KScens() function to assess the goodness of fit of the log-normal and the Weibull distributions, first using bootstrapping to approximate the \(p\) value, and second calculating the \(p\) value following the results in Fleming et al. (1980). Additionally, in the second example, we apply the summary method of the function, which returns the parameter estimates, their standard errors, and both the AIC and BIC. As shown, the \(p\) value obtained for the log-normal distribution is, as expected, larger (0.115) than the \(p\) value (0.042) for the Weibull distribution.

R> set.seed(123)
R> KScens(times, delta, distr = "lognormal")

Null hypothesis: the data follows a lognormal distribution 

KS Test results:
      A p-value 
  0.781   0.115 
R> set.seed(123)
R> summary(KScens(times, delta, distr = "weibull", boot = FALSE))

Distribution: weibull 

KS Test results:
      A p-value 
  1.391   0.042 

Parameter estimates (se):
shape             scale             
1.273 (0.067)     10.98 (0.621)     

AIC: 1304.046 
BIC: 1311.453

Functions CvMcens() and ADcens()

The CvMcens() function calculates the test statistic of the Cramér-von Mises test in (3), while the ADcens() function calculates the test statistic of the Anderson-Darling test in (4), and both employ bootstrap techniques to estimate the corresponding \(p\) values. These functions share the same arguments, allowing users to specify the number of bootstrap samples, among other options. The default number of bootstrap samples for both functions is 999, which may result in lengthy computation times for large sample sizes. In such cases, users might consider reducing the number of bootstrap samples. For example, in the illustrations of the CvMcens() function below, with \(n = 300\) and 999 bootstrap samples, the computation times are approximately 15 seconds using R version 4.5.0 on Windows 10 x64 (build 19045) with an Intel(R) Core(TM) i5-8500 processor (3.00 GHz).

The null hypotheses in the illustrations remain the same as before, as do the conclusions: we would conclude that the data may come from a log-normal distribution, but not from a Weibull distribution.

R> set.seed(123)
R> summary(CvMcens(times, delta, distr = "lognormal"))

Distribution: lognormal 

CvM Test results:
    CvM p-value 
  0.054   0.450 

Parameter estimates (se):
location          scale             
1.988 (0.058)     0.883 (0.045)     

AIC: 1266.899 
BIC: 1274.307   
R> set.seed(123)
R> summary(CvMcens(times, delta, distr = "weibull"))

Distribution: weibull 

CvM Test results:
    CvM p-value 
  0.376   0.001 

Parameter estimates (se):
shape             scale             
1.273 (0.067)     10.98 (0.621)     

AIC: 1304.046 
BIC: 1311.453  

Applying the function ADcens() in the same way as CvMcens(), the \(p\) values are similar: \(p = 0.52\) in the case of the log-normal distribution and \(p = 0.001\) with the Weibull distribution.

Note that, due to the use of resampling methods, it is advisable to set the seed of R's random number generator beforehand with both functions in order to guarantee the reproducibility of the test results.

As mentioned before, users can also test the null hypothesis that the data come from a specific choice of the distribution under study, i.e., \(H_0\!: F(t) = F_0(t; \theta^*)\). While this may be less relevant in practical applications, it can be useful for examining the properties of these tests. To perform such a test, users need to specify particular values for the distribution’s parameters. For example, below we test the null hypothesis that the data follow a log-normal distribution with location and scale parameters \(\mu = 2\) and \(\beta = 1.5\). In this case, as shown, the Anderson-Darling test clearly rejects the null hypothesis (p = 0.001). Note that the output now includes both the specified parameter values for the null hypothesis and the maximum likelihood estimates. In addition, we use the outp argument of the summary.ADcens() function to change the format of the output.

R> set.seed(123)
R> summary(ADcens(times, delta, distr = "lognormal",
+    params0 = list(location = 2, scale = 1.5)), outp = "table")

Distribution: lognormal 

Null hypothesis:
--------- | ---------
Parameter | Value    
--------- | ---------
location  | 2        
scale     | 1.5      
--------- | ---------

AD Test results:
------- | -------
Metric  | Value  
------- | -------
AD      | 9.536  
p-value | 0.001  
------- | -------

Parameter estimates:
--------- | --------- | ---------
Parameter | Value     | s.e.     
--------- | --------- | ---------
location  | 1.988     | 0.058    
scale     | 0.883     | 0.045    
--------- | --------- | ---------

AIC: 1266.899 
BIC: 1274.307 

Function gofcens()

Rather than performing each of the three tests individually, users can run the Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling tests simultaneously with the gofcens() function, which calls each corresponding test function. This approach is especially useful for comparing the tests in a specific context or in a larger study, such as comparing their statistical power. However, a potential drawback is longer computation times, as each test function employs bootstrap methods to estimate the \(p\) values. In the following examples, using the same data and null hypotheses as before, the computation time is approximately 20 seconds on R version 4.5.0, running on Windows 10 x64 (build 19045) with an Intel(R) Core(TM) i5-8500 processor (3.00 GHz). Note that in the second example, we modify options in the summary.gofcens() function: the number of bootstrap samples is displayed, while the AIC and BIC values are omitted.

R> set.seed(123)
R> gofcens(times, delta, distr = "lognormal")

Null hypothesis: the data follows a lognormal distribution 

Test statistics
   KS   CvM    AD 
0.781 0.054 0.500 

p-values
   KS   CvM    AD 
0.115 0.450 0.520 
R> set.seed(123)
R> summary(gofcens(times, delta, distr = "weibull"), print.AIC = FALSE,
+    print.BIC = FALSE, print.infoBoot = TRUE)

Distribution: weibull 

Test statistics
   KS   CvM    AD 
1.391 0.376 2.828 

p-values
   KS   CvM    AD 
0.001 0.001 0.001 

Parameter estimates (se):
shape             scale             
1.273 (0.067)     10.98 (0.621)     

Number of bootstrap samples: 999

Function chisqcens()

The chisqcens() function implements the chi-squared type test of Kim (1993) explained in Section ‘Chi-squared type tests’. As with the KScens(), CvMcens(), and ADcens() functions, the computation of the \(p\) value is achieved with bootstrap techniques using BS = 999 as the default value for the number of bootstrap samples. The function’s default output includes the hypothesized distribution, the value of the test statistic, and the corresponding \(p\) value. The extended output provided by the summary.chisqcens() function also includes the parameter estimates and the AIC and BIC values as well as two values related to the number of random cells: the number initially chosen by the user (Original) and the final number used in the analysis (Final). The latter may be smaller due to the presence of right-censored data. For example, in the illustrations below using right-censored data from a log-normal distribution, we set M = 8 random cells, but this number is reduced to M = 7 due to censoring.

R> set.seed(123)
R> chisqcens(times, delta, M = 8, distr = "lognormal")

Null hypothesis: the data follows a lognormal distribution 

Chi-squared Test results:
Statistic   p-value 
    3.123     0.154 
R> set.seed(123)
R> summary(chisqcens(times, delta, M = 8, distr = "weibull"))

Distribution: weibull 

Chi-squared Test results:
Statistic   p-value 
    8.599     0.002 

Parameter estimates (se):
shape             scale             
1.273 (0.067)     10.98 (0.621)     

Cell numbers:
Original    Final 
       8        7 

AIC: 1304.046 
BIC: 1311.453 

Again, based on the outputs, we would choose the log-normal distribution instead of the Weibull distribution, because its value of the test statistic is clearly smaller (3.123 vs 8.599) and the \(p\) value is far larger (\(p = 0.154\) vs \(p = 0.002\)).

The following illustration highlights two features of the goodness-of-fit test functions in the GofCens package. First, all functions return structured objects that can be stored as standard objects for further use in R. Second, users can test the goodness-of-fit of distributions different from the eight distributions in Table 2. In this example, we apply the chisqcens() function using the gamma distribution. As shown, the parameter estimates are reported with the default names theta1 and theta2, and the result is an object of class chisqcens, further processed with the summary method to display detailed results.

R> set.seed(123)
R> result <- summary(chisqcens(times, delta, M = 8, distr = "gamma")) 
R> result

Distribution: gamma 

Chi-squared Test results:
Statistic   p-value 
    6.176     0.006 

Parameter estimates (se):
theta1            theta2            
1.65 (0.144)     0.165 (0.019)     

Cell numbers:
Original    Final 
       8        7 

AIC: 1293.124 
BIC: 1300.532    
R> class(result)

[1] "summary.chisqcens" "chisqcens"

4 Real data example: Survival times of retired NBA players

In this section, we apply several functions of the GofCens package to determine the parametric model that best fits the survival times of former NBA players.

The data frame nba comes with the GofCens package and contains the survival times (variable survtime) of all 3962 former players of the National Basketball Association (NBA) from its inception until July 2019. These data have been published and analyzed by Martínez et al. (2022), where survival times are measured as the elapsed time (in years) from the end of the NBA career until either death (cens == 1) or July 31, 2019 (cens == 0). By this date, 864 (21.8%) of the former players had died, with uncensored post-NBA survival times ranging from a few days to nearly 70 years. The estimated median survival time is 54.1 years, as shown below in the output of the function npsurv of the rms package (Harrell Jr 2025) and in Figure 6, which shows the Kaplan-Meier estimate of the survival function.

R> data("nba")
R> library("rms")
R> npsurv(Surv(survtime, cens) ~ 1, nba)

Call: npsurv(formula = Surv(survtime, cens) ~ 1, data = nba)
        n events median 0.95LCL 0.95UCL
[1,] 3962    864   54.1    53.1    55.1    

The R code to draw the plot in Figure 6 is the following:

R> par(las = 1, cex.lab = 1.3, cex.axis = 1.2, font.lab = 4, font.axis = 2,
+    mar = c(5, 5, 2, 2), yaxs = "i", xaxs = "i")
R> survplot(npsurv(Surv(survtime, cens) ~ 1, nba), lwd = 3,
+    xlab = "Years after NBA career", time.inc = 5, col.fill = grey(0.6),
+    ylab = "Estimated survival probability", xlim = c(0, 75))
R> abline(h = 0.5, lwd = 2, lty = 2)
Figure 6: Survival function of times from the end of the NBA career until death among retired players. The shaded area represents the confidence bands of the survival function.

In order to model the survival times parametrically and estimate the median, we need to know which distribution is the most appropriate. For this purpose, we take advantage of the cumhazPlot() function, which provides the six cumulative hazard plots shown in Figure 7:

R> cumhazPlot(Surv(survtime, cens) ~ 1, nba, font.lab = 4, cex.lab = 1.3, 
+    lwd = 3, colour = "blue")
Figure 7: Cumulative hazard plots for six different distributions applied to the survival times of former NBA players. The plots are generated with the cumhazPlot() function.

According to the plots in Figure 7, the logistic distribution fits reasonably well to the data, even though the corresponding plot does not show a completely straight line of points. In addition, we could also consider the normal distribution for parametric analyses of the survival times. To choose between these two distributions, we run the following code to draw the probability and quantile-quantile plots that are shown in the left and right panels of Figure 8 with the probPlot() function.

R> probPlot(Surv(survtime, cens) ~ 1, nba, distr = "logistic", ggp = TRUE)
R> probPlot(Surv(survtime, cens) ~ 1, nba, distr = "normal", ggp = TRUE)

According to the plots in Figure 8, the logistic distribution appears to be a slightly better choice than the normal distribution, as the points in all four plots on the right panel show some curvature.

Note that in the previous function calls we used the formula versions of both functions, which allow for the inclusion of the data frame nba in the argument list and thereby simplify their use.


Figure 8: Logistic and normal probability plots for the survival times of former NBA players. The plots are drawn with the probPlot() function.

Finally, we apply the gofcens() function to both distributions in order to complement the plots with the results of the Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling tests.

R> set.seed(123)
R> summary(gofcens(Surv(survtime, cens) ~ 1, nba, distr = "logistic"))

Distribution: logistic 

Test statistics
    KS    CvM     AD 
 2.099  1.277 13.849 

p-values
   KS   CvM    AD 
0.001 0.009 0.002 

Parameter estimates (se):
location          scale             
53.077 (0.457)     9.013 (0.218)     

AIC: 8460.297 
BIC: 8472.866  
R> set.seed(123)
R> summary(gofcens(Surv(survtime, cens) ~ 1, nba, distr = "normal"))

Distribution: normal 

Test statistics
    KS    CvM     AD 
 2.736  2.666 22.957 

p-values
   KS   CvM    AD 
0.001 0.002 0.001 

Parameter estimates (se):
location          scale             
53.004 (0.518)     16.977 (0.368)     

AIC: 8512.579 
BIC: 8525.148

Following the test results, we would reject the null hypothesis in both cases: the Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling tests all yield small \(p\) values, indicating that neither the logistic nor the normal distribution is an adequate model for the survival times of former NBA players. This result is not unexpected given the large sample size, which increases the power of the tests and makes even minor deviations from the model detectable. Despite the rejection, the diagnostic plots in Figures 7 and 8 suggest that the logistic distribution provides a somewhat better fit to the data than the normal distribution. Therefore, it seems reasonable to use the logistic model as a practical approximation. Based on the maximum likelihood estimates, the fitted parameters are \(\hat\mu = 53.08\) and \(\hat\beta = 9.01\), yielding an estimated median survival time of approximately 53.1 years, which is close to the non-parametric estimate of 54.1 years.
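For reference, an equivalent logistic fit can be obtained directly with survreg() (a sketch under the assumption that both functions rely on maximum likelihood and therefore yield approximately the same estimates); under the logistic model, the median equals the location parameter:

R> fit <- survreg(Surv(survtime, cens) ~ 1, data = nba, dist = "logistic")
R> c(location = unname(coef(fit)), scale = fit$scale)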

5 Conclusion

In this paper we have presented the R package GofCens, whose functions offer both analytical methods and graphical tools to assess the goodness of fit for right-censored lifetime data. While the graphical functions can be used to assess the goodness of fit for the eight distributions listed in Table 2, which represent the most common ones in survival analysis, the goodness-of-fit test functions are more flexible and can be applied to any distribution available in R. GofCens incorporates new functions for goodness-of-fit analyses while unifying functions that are otherwise scattered across various R packages. As shown with several examples, this package is a user-friendly tool that enables users to determine the most suitable parametric distribution for analyzing their data.

For the analysis of a given data set in practice, it is important to highlight the usefulness of including probability or cumulative hazards plots alongside the goodness-of-fit tests provided in GofCens. While statistical tests offer summary statistics, such plots offer a comprehensive view of the data, thus revealing peculiarities that might escape detection through summary statistics alone. Moreover, the importance of selecting a specific distribution depends on the context of the analysis. For example, precise distributional assumptions may be less critical when evaluating test statistics, but become essential when estimating quantiles or making extrapolations. The selection of any of the four tests presented herein is, to some extent, subjective, influenced by personal preference and convenience, as none of them emerges as optimal for all censoring situations, parametric families, or sample sizes.

In recent years, additional goodness-of-fit techniques for right-censored data have been developed, including transformation-based procedures (Goldmann et al. 2015) and modifications of classical tests using pseudo-complete samples or randomization strategies (Balakrishnan et al. 2015). While these methods can offer improved performance in specific settings, they often involve specialized implementation or focus on narrower applications. In contrast, GofCens implements established, flexible approaches with broad applicability and interpretability, supporting classical tests enhanced by resampling and graphical diagnostics.

Feasible extensions of this package include expanding the range of distributions supported by the graphical functions; this is not straightforward, however, since functions such as cumhazPlot() require distribution-specific transformations of the cumulative hazard function. Additional developments could involve adapting the package's functions to handle left-truncated and interval-censored data. Furthermore, exploring residual-based methods to assess the parametric assumptions of an accelerated failure time model may be a promising direction for further development.

Acknowledgements

M. Besalú’s, G. Gómez Melis’ and K. Langohr’s research was funded by MICIU/AEI/10.13039/501100011033 and FEDER, UE [PID2023-148033OB-C21] and by the Agència de Gestió d’Ajuts Universitaris i de Recerca [2021 SGR 01421] (GRBIO) from the Generalitat de Catalunya (Spain).

We thank the anonymous referees for their valuable and significant feedback.


References

B. Auguie. gridExtra: Miscellaneous functions for “grid” graphics. 2017. URL https://CRAN.R-project.org/package=gridExtra. R package version 2.3.
N. Balakrishnan, E. Chimitova and M. Vedernikova. An empirical analysis of some nonparametric goodness-of-fit tests for censored data. Communications in Statistics - Simulation and Computation, 44(4): 1101–1115, 2015.
M. Besalú Mayol. Testing goodness-of-fit of parametric survival models for right-censored data. 2016. URL https://upcommons.upc.edu/handle/2117/88820.
A. Canty and B. Ripley. boot: Bootstrap () functions. 2024. URL https://CRAN.R-project.org/package=boot. R package version 1.3-31.
M. Capasso, L. Alessi, M. Barigozzi and G. Fagiolo. On approximating the distributions of goodness-of-fit test statistics based on the empirical distribution function: The case of unknown parameters. Advances in Complex Systems, 12(2): 157–167, 2009.
M. L. Delignette-Muller and C. Dutang. fitdistrplus: An R package for fitting distributions. Journal of Statistical Software, 64(4): 1–34, 2015.
M.-L. Delignette-Muller, C. Dutang, R. Pouillot, J.-B. Denis and A. Siberchicot. fitdistrplus: Help to fit of a parametric distribution to non-censored or censored data. 2025. URL https://CRAN.R-project.org/package=fitdistrplus. R package version 1.2-2.
C. Dutang, V. Goulet and M. Pigeon. actuar: An R package for actuarial science. Journal of Statistical Software, 25(7): 38, 2008. URL https://www.jstatsoft.org/v25/i07.
A. Febrer Galvany. Analytical and graphical goodness of fit methods for parametric survival models with right-censored data. 2015. URL https://upcommons.upc.edu/handle/2099.1/26129.
T. R. Fleming, J. R. O’Fallon, P. C. O’Brien and D. P. Harrington. Modified Kolmogorov-Smirnov test procedures with application to arbitrarily right-censored data. Biometrics, 36(4): 607–625, 1980.
D. García Carrasco. Goodness-of-fit package for right-censored data. 2017. URL https://upcommons.upc.edu/handle/2117/106177.
C. Goldmann, B. Klar and S. Meintanis. Data transformations and goodness-of-fit tests for type-II right censored samples. Metrika, 78: 59–83, 2015.
V. Goulet, C. Dutang, M. Pigeon, et al. actuar: Actuarial functions and heavy tailed distributions. 2025. URL https://CRAN.R-project.org/package=actuar. R package version 3.3-5.
F. E. Harrell Jr. rms: Regression modeling strategies. 2025. URL https://CRAN.R-project.org/package=rms. R package version 8.0-0.
A. Janssen. Global power functions of goodness-of-fit tests. Annals of Statistics, 28: 239–253, 2000.
J. H. Kim. Chi-square goodness-of-fit tests for randomly censored data. The Annals of Statistics, 21(3): 1621–1639, 1993.
J. A. Koziol and D. P. Byar. Percentage points of the asymptotic distributions of one and two sample K-S statistics for truncated or censored data. Technometrics, 17(4): 507–510, 1975.
J. A. Koziol and S. B. Green. A Cramér-von Mises statistic for randomly censored data. Biometrika, 63(3): 465–474, 1976.
K. Langohr, M. Besalú, M. Francisco, A. Garcia and G. Gómez. GofCens: Goodness-of-fit methods for right-censored data. 2025. URL https://CRAN.R-project.org/package=GofCens. R package version 1.5.
E. L. Lehmann and J. P. Romano. Testing statistical hypotheses. 3rd ed. New York: Springer-Verlag, 2005.
J. A. Martínez, K. Langohr, J. Felipo, L. Consuegra and M. Casals. Data set on mortality of National Basketball Association (NBA) players. Data in Brief, 45: 108615, 2022.
D. Meyer, E. Dimitriadou, K. Hornik, A. Weingessel and F. Leisch. e1071: Misc functions of the department of statistics, probability theory group (formerly: E1071), TU Wien. 2024. URL https://CRAN.R-project.org/package=e1071. R package version 1.7-16.
J. R. Michael. The stabilized probability plot. Biometrika, 70(1): 11–17, 1983.
D. P. Mihalko and D. S. Moore. Chi-square tests of fit for type II censored data. Annals of Statistics, 8(3): 625–644, 1980.
S. P. Millard. EnvStats: An R package for environmental statistics. New York: Springer-Verlag, 2013.
S. P. Millard and A. Kowarik. EnvStats: Package for environmental statistics, including US EPA guidance. 2025. URL https://CRAN.R-project.org/package=EnvStats. R package version 3.1.0.
A. N. Pettitt and M. A. Stephens. Modified Cramér-von Mises statistics for censored data. Biometrika, 63(2): 291–298, 1976.
P. Royston. A toolkit for testing for non-normality in complete and censored samples. Journal of the Royal Statistical Society. Series D (The Statistician), 42(1): 37–43, 1993.
T. M. Therneau and P. M. Grambsch. Modeling survival data: Extending the Cox model. New York: Springer-Verlag, 2000.
L. A. Waller and B. W. Turnbull. Probability plotting with censored data. American Statistician, 46(1): 5–12, 1992.
H. Wickham. ggplot2: Elegant graphics for data analysis. New York: Springer-Verlag, 2016. URL https://ggplot2.tidyverse.org.
H. Wickham, W. Chang, L. Henry, et al. ggplot2: Create elegant data visualisations using the grammar of graphics. 2025. URL https://CRAN.R-project.org/package=ggplot2. R package version 3.5.2.
