mcmsupply: An R Package for Estimating Contraceptive Method Market Supply Shares

In this paper, we introduce the R package mcmsupply which implements Bayesian hierarchical models for estimating and projecting modern contraceptive market supply shares over time. The package implements four model types. These models vary by the administration level of their outcome estimates (national or subnational estimates) and dataset type utilised in the estimation (multi-country or single-country contraceptive market supply datasets). The mcmsupply package contains a compilation of national and subnational level contraceptive source datasets, generated by Integrated Public Use Microdata Series (IPUMS) and Demographic and Health Survey (DHS) microdata. We describe the functions that implement the models through practical examples. The annual estimates and projections with uncertainty of the contraceptive market supply, produced by mcmsupply at a national and subnational level, are the first of their kind. These estimates and projections have diverse applications, including acting as an indicator of family planning market stability over time and being utilised in the calculation of estimates of modern contraceptive use.

Hannah Comiskey (Maynooth University) , Niamh Cahill (Maynooth University)
2025-02-01

1 Introduction

Family Planning 2030 (FP2030) is a ‘global movement dedicated to advancing the rights of people everywhere to access reproductive health services safely and on their own terms’ (FP2030 2021). One step towards achieving this goal is to quantify people are accessing their modern contraceptive supplies. To date, obtaining estimates of modern contraceptive supply shares in low- and middle-income countries has relied on large-scale national surveys like the Demographic and Health Surveys (DHS). However, these DHS are not annually available and in practice, most countries carry out DHS every 3 to 5 years approximately, with some countries having fewer surveys than this (The DHS Program - Demographic and Health Survey (DHS) 2023). In previous work, we described a model that provides probabilistic estimates of the contraceptive supply share over time with uncertainty and examined the model performance at the national administration division for countries that are participating in FP2030 and have varying amounts of DHS data available (Comiskey et al. 2024). The original modern contraceptive supply share model (mcmsupply model) relies on splines, informed by cross-method correlations, to capture temporal variation combined with a hierarchical modelling approach to estimate country-level parameters. Using a multi-country dataset, the original mcmsupply model produces estimates of contraceptive supply market shares at the national level for all countries simultaneously. In this paper, we extend the model to estimate supply shares using input data from a single-country and to also include estimation at the subnational administrative division. At the time of writing, this package and models are the first of their kind to estimate and project modern contraceptive method supply shares at the national and subnational administration levels using Demographic and Health survey data.

For the remainder of the paper, we will refer to the original modern contraceptive supply share model as the multi-country national mcmsupply model. The single-country mcmsupply model uses a scaled-down version of the multi-country approach. It borrows strength from the multi-country model using modular model runs with informative priors placed on key parameters to provide precise outcome estimation, even in the absence of data for a particular contraceptive method. Modularization in Bayesian analysis describes the process within a statistical model where information is restricted to flow only from the prior to the likelihood. Thus, preventing the ‘contamination’ of key parameters from suspect data (Lunn et al. (2009) as cited in Plummer (2015)). In the context of our problem, parameters estimated within the multi-country model are used to inform the priors of the single-country (either national or subnational administrative division) models. This approach prevents spurious parameter estimates due to a lack of data for some locations. To summarise how the single-country and multi-country models are connected to each other, Figure 1 depicts this modelling relationship at the national administration level. The main differences between the multi-country and single-country approaches is that for a single-country model, we only have data for one country (at the national or subnational administration division) and the country-level (in the national model) or subnational-level (in the subnational model) population parameters are informed by estimates from the corresponding multi-country model. In contrast to this, the multi-country model uses national or subnational level data (depending on your administrative level of interest) from many countries simultaneously to estimate model parameters and the country-level (in the national model) or subnational-level (in the subnational model) population parameters are estimated hierarchically. The mcmsupply package contains vignettes that consider case studies of both single-country and multi-country model estimation approaches at the national and subnational administration levels.

Figure 1: A flow chart to illustrate the relationship between the multi-country and single-country national estimation approaches. The median estimates of the multi-country national model parameters are used as informative priors in the single-country national model. A legend describing each element of the schematic is located in the blue box.

The need for a single-country version of the mcmsupply model arose for two primary reasons. Firstly, there was a demand for improved computational efficiency while ensuring that model accuracy remains uncompromised when generating projections of modern contraceptive method supply shares for a single-country. Secondly, the option to accommodate custom data, like incorporating a new survey dataset, was sought to be included in the estimation process for users who require a more tailored analysis.

The inclusion of the functionality to estimate supply shares at the subnational level in the mcmsupply package was spurred by the growing interest in subnational estimation among the family planning community (New et al. 2017; Li et al. 2019; Mercer et al. 2019). The decentralization of family planning services produces a more equitable and efficient service; however, it also shifts the responsibility of service delivery to lower-level organisations that may not have the capacity to carry-out the role (Williamson et al. 2014). Providing subnational-level estimates can lead to a clearer understanding of localised user preferences, localised user access to family planning commodities and a measure of the true stability contraceptive supply market at a smaller geographic scale (Bossert and Beauvais 2002). These subnational estimates may also be used as part of a localised temperature check for progress towards the FP2030 goals, which include increasing access to contraceptive methods (Munoz and Ali 2022).

This paper introduces the R package mcmsupply for estimating and projecting the contraceptive supply share at the national and subnational administration divisions using multi-country or single-country datasets. The package uses tidyverse throughout to carry out data manipulation and visualisation tasks. The graphics are produced using ggplot2 and dplyr is used for any data manipulation procedures. The workflow of the package aims to follow a Bayesian workflow (Gelman et al. 2020). The users are encouraged to explore the data, the model inputs, and the model outputs using diagnostics and visualisations. The individual elements of the workflow are made as accessible as possible through the built-in functions, such as the ‘get_data’ and ‘plot_estimates’ functions, and the code provided in the vignettes to test the convergence of the model parameters. To assess convergence, we consider the R-hat values of the model parameters using the plot function of rjags, as well as the individual parameter trace plots (Vehtari et al. 2021a). Figure 2 shows a summary of the different ways users can use the R functions and input data in mcmsupply to carry out model fitting and estimation. The national data contained in the package is derived from the DHS microdata (ICF 2004) while the subnational data is derived from the IPUMS DHS datasets (Herger Boyle et al. 2022). In the ‘Implementation and operation’ section, we review the operation and implementation requirements of the mcmsupply R package. In the ‘Data’ section, a detailed description of the data used within the mcmsupply R package is provided, as well as an explanation of the data pre-processing functions get_data and get_modelinputs. ‘The estimation process’ section describes the model fitting and visualisation functions within the mcmsupply R package. A basic overview of the process model is provided in ‘Overview of the mcmsupply process models’ with an explanation of some key modelling parameters. The ‘Model fitting’ section explains the run_jagsmodel function. The run_jagsmodel function fits models in a Bayesian framework using the JAGS (Just Another Gibbs Sampler) software and produces estimates with uncertainty for the administration level and dataset type of choice (Hornik et al. 2003). The ‘Model output’ section describes the plot_estimates function that takes the model estimates and visualises them using the R package ggplot2 (Wickham 2016). The ‘Use cases’ section discusses five use cases for modelling modern contraceptive supply shares using the mcmsupply R package. Finally, the conclusions are presented.

2 Implementation and operation

mcmsupply contains pre-processing functions to clean and prepare raw input data for model fitting at the administrative level of choice (national or subnational) using the dataset type of choice (multi-country or single-country). This R package includes functions to fit Bayesian hierarchical models using the model inputs. Functionality for post-processing and visualisation of the model estimates are also included. The model fitting process described uses Just Another Gibbs Sampler (JAGS). JAGS uses Markov Chain Monte Carlo (MCMC) sampling to produce model estimates for Bayesian hierarchical models (Hornik et al. 2003). For installation, both R (>=3.5.0) and JAGS (>=4.0.0) are required. JAGS can be downloaded at https://sourceforge.net/projects/mcmc-JAGS/files/JAGS/. mcmsupply interacts with JAGS using the wrapper functions supplied by the R package R2jags (Su and Masanao Yajima 2021). The mcmsupply package dependencies are listed in the package DESCRIPTION file and will be automatically installed upon installing the main package. There are no minimum RAM, CPU, or HARDDRIVE requirements apart from what is necessary to store model runs, which varies case-by-case. This software was run on a MacBook Air using macOS 13.1 with a 1.6 GHz Dual-Core Intel Core i5 processor and 8GB of memory.

3 Data

3.1 Input data

The mcmsupply package contains the following input data sources:

The inst/data-raw folder of the mcmsupply package contains sample R code that was used to create the model inputs in the case of the covariance arrays, correlations and parameters. This enables users to recalculate their own single-country model parameters and correlations should they wish to do so. The code for the creation of the main input contraceptive supply source data is provided on the mcmsupply github page but the raw data for these datasets cannot be provided. The raw data may be accessed by users through an application to the DHS program for national level data, or the IPUMS program, for subnational level data. Help files for the contraceptive supply source datasets can be accessed by using the command ?mcmsupply::national_FPsource_data and ?mcmsupply::subnat_FPsource_data. The national contraceptive supply source data has data for 33 countries (including participants and non-participants of FP2030) between 1990 and 2022. The subnational contraceptive supply source data has data for 256 provinces across 24 countries (all participating in FP2030) between 1990 and 2020. Table ?? is a sample of 6 rows of the subnational contraceptive supply source data.

Lastly, data on country classification, ISO codes and area groupings is provided in the dataset Country_and_area_classification (Table ??). The help file for this dataset can be accessed via the R command ?mcmsupply::Country_and_area_classification.

3.2 Data pre-processing

In mcmsupply, the data processing occurs in two steps: First, the raw input data is retrieved and preliminary cleaning to the dataset is completed. Secondly, the cleaned data is processed to provide the model inputs for the Bayesian hierarchical model. This two-step process removes any black-box element to the model fitting process and allows the user to review the data at both stages and refer to it later when considering model outputs. The first step involves the get_data function. This function retrieves the raw data from the stored contraceptive supply source dataset, does data cleaning and processing to address any issues with missing data and regional naming inconsistencies. Its arguments are summarized in Table ??. The get_data function carries out several data pre-processing tasks. For example, it will remove observations where two or more sectors are missing. It also checks to see if observations across all three sectors sum to one and will replace missing observations with 0 when the total does sum to one. The function transforms the raw data away from the (0,1) boundary as parameter estimates that lie on the distribution limits cause issues with convergence during the model estimation process. Finally, the get_data function imputes any missing standard errors using a binomial approximation. To use the the get_data function, the user first defines whether they wish to use national or subnational level administrative data via the national argument. National level data is accessed when the national argument is set to TRUE. When subnational administrative data is required, the user sets national to FALSE. Similarly, the user defines whether they want to use multi-country estimation with data from multiple countries or single-country estimation with data from a single-country via the local argument. The default setting for the local argument is FALSE. This induces a multi-country estimation, where the outcomes for all the countries in the contraceptive supply source dataset will be estimated simultaneously. In the event of single-country estimation, the user sets local to TRUE and indicates their country of interest via the mycountry argument. The names of the countries listed in the package data can be found in the country_names dataset. The help file for this dataset can be accessed via the R command ?mcmsupply::country_names. The fp2030 argument controls whether to include countries that are participating in the FP2030 initiative or not. The default includes only the named FP2030 countries (see country_names) in the dataset. There is the optional functionality to include a custom dataset. This allows the user to run the model on data outside of that stored within the package. The surveydata_filepath is a character string that denotes the location of the custom dataset. The file must meet a series of internal checks on file type, column names, suitable data ranges and missing data. When a custom dataset is supplied to get_data, the function carries out the checks and alerts the user to any differences between what is expected and what has been supplied. If surveydata_filepath is left as NULL, by default the function uses the stored national_FPsource_data or subnat_FPsource_data, depending on what administrative level the user has specified via the national argument (Table ??). The get_data function returns a list containing the cleaned data and a list of arguments supplied to the function. Storing the arguments of the get_data function allows the set-up information to flow without requiring the user to repeatedly supply the same arguments for each step of the modelling process.

Step two of the data pre-processing is the get_modelinputs function. This function takes the cleaned data from the previous step and repackages it into suitable inputs for the model implementation. The arguments of the function are summarised in Table ??. This function uses the arguments set in the get_data function as well as additional parameters for the model. These parameters include the year the user wishes to begin their estimation at and the year they finish on. In the mcmsupply package, the models use basis splines (B-splines), to capture the complexities in variation of the contraceptive supply source data over time. The use of splines for the estimation of demographic indicators is growing in popularity. Previous studies used small-area estimation models with splines to capture complex shapes of demographic spatio-temporal data (Ugarte et al. 2009; Ugarte et al. 2010). In recent years, many international health organisations have also used splines for the estimation of key demographic indicators. These include the estimation and projection of under-5 mortality for United Nations Children’s Fund (UNICEF) (Sharrow et al. 2019) and the estimation of excess morality due to Covid-19 for the World Health Organisation (WHO) (Knutson et al. 2023). We build on these previous studies to use penalised regression splines in the mcmsupply R package. B-splines use basis-functions to create piecewise cubic polynomials. The number of basis functions that are fit to the data is determined by the number of knots. Knots are the locations along the x-axis where the piecewise polynomials of the B-splines join. As you increase the number of knots in the basis functions, the B-splines give a tighter fit to the data. Similarly, if you decrease the number of knots in the basis, you will get a smoother fit to your data. In the mcmsupply package, the user may alter the number of knots (nsegments) used in the basis functions. The default number of knots is 12, as was used in Comiskey et al. (2023). This equates to a knot approximately every 3.5 years. Through validation, it was found that having fewer knots dis-improved model fit, while more knots did not significantly improve model fit. Like the get_data function, this function returns a list containing the model inputs and the function arguments.

4 The estimation process

The mcmsupply R package contains four Bayesian models, each of which aims to estimate and project contraceptive method supply shares over time with uncertainty. These models vary by the administration level of their outcome estimates (national or subnational estimates) and dataset type utilised in the estimation (multi-country or single-country contraceptive market supply datasets). A full mathematical description of all the models contained within mcmsupply is described in the supplementary material. 1. A summary of each of the parameters and their role within each model can be found in Table ?? while a visual summary of the national model, using both multi-country and single-country inputs, can be found in Figure 1.

4.1 Brief model overview

The outcome of interest is the components of a compositional vector \(\boldsymbol{{\phi_{q,t,m}}}\), which captures the proportion of contraceptive method m, at time t, in population q supplied across the public and private sectors.

Let, \[\begin{equation} \boldsymbol{{\phi_{q,t,m}}} =(\phi_{q,t,m,1},\phi_{q,t,m,2},\phi_{q,t,m,3}), \tag{1} \end{equation}\]

where, \(\phi_{q,t,m,s}\) is the proportion supplied by the public sector (s =1), the private commercial medical sector (s=2) and the other private sector (s=3) of modern contraceptive method m, at time t, in population q (national or subnational).

Figure ?? shows the model set up for the national level models. A similar approach is taken when estimating modern contraceptive method supply at the subnational administration level. For each model within the mcmsupply package, the latent variable \(\psi_{q,m,t,s}\) relies on a spline to capture the underlying process that generates the data, on the logit scale, for sector s, in year t, for method m and population q (depending on the administration level of interest) .

\[\begin{equation} \psi_{q,t,m,1}=\sum_{k=1}^K \beta_{q,m,1,k} B_{q,k}(t), \tag{2} \end{equation}\]

where, \(\beta_{q,m,1,k}\) is the \(k^{th}\) spline coefficient for sector s, method m in population q. \(B_{q,k}(t)\) is the \(k^{th}\) basis function fit to the data for population q.

We assume that in population q, for method m and sector s, the value of spline coefficient at knot index \(k^*\), aligning with the year \(t^*\), the most recent survey available, is \(\alpha_{q, m, s}\). By doing this, we are assuming that the \(\alpha_{q,m,s}\) parameter will act as the spline coefficient for the reference spline at \(k^*\). We are then able to calculate the remaining spline coefficients using a random walk model of order 1 on spline coefficients from the reference index (\(k^*\)) using the penalised \(\boldsymbol{\delta_{q,m,s}}\).

Let, \[\begin{equation} \beta_{q,m, s, k}=\left\{\begin{array}{c} \alpha_{q,m,s} \quad k=k^*, \\ \beta_{q,m,s, k+1}-\delta_{q, m, s, k} \quad k<k^*, \\ \beta_{q,m,s, k-1}+\delta_{q, m, s, k-1} \quad k>k^*, \end{array}\right. \tag{3} \end{equation}\]

where, \(\alpha_{q,m,s}\) is the most recently observed supply share, on the logit scale, for sector s , method m, in population q. This parameter is estimated hierarchically. The geographical set-up of this estimation process adapts to match the administrative level of interest. For example, the subnational multi-country models contain an additional layer of geography (world > subcontinent > country > province) in the hierarchical set-up that the national models don’t have, which accounts for the subnational administration levels. k is the knot index along the set of basis splines \(B_{q,k}(t)\) \(k^*\) is the index of the knot that corresponds with \(t^*\), the year index where the most recent survey occurred in population q. \(\delta_{q,m,s,k-1}\) is the first order difference between spline coefficients \(\beta_{q,m, s, K}\) and \(\beta_{q,m, s,K-1}\). These reflect the changes in method supply shares over time. Within each sector, the first-order differences are assumed to be correlated between methods. These correlations were estimated using a maximum estimator for the correlation matrix first described in Azose and Raftery, (2018) and adapted for method supply shares in Comiskey et al. (2023). These estimated correlations are available as data for both the national and subnational models. Please see the Data section of this paper for more details.

4.2 Model fitting

The run_jags_model function fits the selected JAGS model to the supplied data and returns a list of MCMC samples and point summaries for the time-period and locations of interest. Initial set-up arguments (national, local, mycountry) are inherited from the specification of the previous get_modelinputs function. The additional inputs of this function are summarised in Table ??. The jagsdata argument of this function is a list of initial set-up arguments and JAGS model inputs gathered from the get_modelinputs function. jagsparams is a vector of strings that name the model parameters the user wishes to monitor within the JAGS model. The default is NULL. When jagsparams = NULL, the function will refer to a stored vector of parameters to monitor (Table ??). The JAGS parameters of n_iter = 80000, n_burnin = 10000 and n_thin = 35 ensure that the model has converged and that the final posterior sample size is 2000 samples. The function get_point_estimates takes the chains produced by the JAGS model and estimates the median, 95% credible intervals and 80% credible intervals. The get_point_estimates function runs automatically inside the run_jags_model function and returns the point summaries as part of the run_jags_model function output. The run_jags_model function returns a list containing the JAGS output of the model and the point summaries for the estimates.

4.3 Model output

The user then runs the function plot_estimates. This function visualises the point estimates with uncertainty alongside the data using the initial set up inputs of the get_modelinputs function and the output of the run_jags_model function (Table ??). The plot_estimates function returns a list of ggplot2 objects, one for each country (when using the national model) or subnational region (when using the subnational model).

Figure 2: A flow chart to illustrate the decision processes that lead to the different estimation types within the mcmsupply package. The first decision is with respect to the administrative level of the estimates you wish to create – they may be either national level or subnational level. An explanation of each division is found on the first row of the figure. The second decision is with respect to the number of countries you wish to estimate – the user may estimate the proportions for all the countries at once or only one country. An explanation of each in the context of the specific administrative division is found on the second row of the figure. A set of sample functions used to estimate and plot the estimates for each modelling option are located on the third row of the figure

5 Use cases

5.1 Case 1: Estimating contraceptive method supply shares at the national administration level for multiple countries simultaneously

The first use case describes how the user can estimate modern contraceptive method supply shares at the national administrative level over time for multiple countries at once (i.e., using the multi-country national model). This use case is described in national_multicountry_mod found in the vignettes folder. This vignette takes approximately 12 hours to run when using the recommended JAGS arguments, on a machine with 1.6 GHz Dual-Core Intel Core i5 processor and 8GB of RAM. The national_FPsource_data dataset contains observations for 30 countries. The user begins by accessing the national_FPsource_data dataset through the get_data function with the argument national=TRUE, and the remaining arguments sets to their default values, to indicate that they are interested in national-level data for the FP2030 countries present in the data.

cleaned_natdata <- get_data(national = TRUE)

dplyr::glimpse(cleaned_natdata$mydata)

Next, this data is supplied to the get_modelinputs function. This function reshapes the data into a list of inputs for the JAGS model. At this point, the user must indicate the start and end years they wish to estimate between. The n_segments argument controls how many knots will be used in the basis functions. The default number of segments is 12. Lastly, the cleaned national data from the get_data function is provided to the get_modelinputs function via the raw_data argument.

pkg_data <- get_modelinputs(startyear = 1990, 
                            endyear = 2025.5, 
                            nsegments = 12, 
                            raw_data = cleaned_natdata)

This list of data and model inputs is then fed into the JAGS model via the run_jags_model function. In this instance, the user wishes to monitor the default set of parameters within the JAGS model. Therefore, they set the jagsparams argument to NULL, which invokes the function to use the default list. The parameters for running the JAGS model are set via the n_iter, n_burnin and n_thin arguments. For demonstration purposes in the code below, these arguments are set to low values. However, for optimal model convergence we recommend increasing the iterations to 80,0000, the burn-in period to 10,000, and the thinning to 35. Please note that for these increased values of arguments, the model run time is approximately 12 hours. As part of the run_jags_model function, the median and 80% and 95% credible intervals for the estimates are calculated. To assess convergence, we considered the convergence diagnostic \(\hat{R}\), as well as the individual parameter trace plots (Vehtari et al. 2021b).

mod <- run_jags_model(jagsdata = pkg_data, 
                      jagsparams = NULL,
                      n_iter = 80, 
                      n_burnin = 10, 
                      n_thin = 2)
plot(mod$JAGS)

sample_draws <- tidybayes::tidy_draws(mod$JAGS$BUGSoutput$sims.matrix)

var <- sample_draws %>% dplyr::select(.chain, .iteration, .draw,`P[1,2,1,1]`) %>%
  dplyr::mutate(chain = rep(1:2, each=mod$JAGS$BUGSoutput$n.keep)) %>%
  dplyr::mutate(iteration = rep(1:mod$JAGS$BUGSoutput$n.keep, 2))

ggplot2::ggplot(data=var) +
  ggplot2::geom_line(ggplot2::aes(x=iteration, y=`P[1,2,1,1]`, color=as.factor(chain)))

The final JAGS model output and the summary estimates are returned as a list. Finally, these summary estimates are visualised via the plot_estimates function using the R package ggplot2 (Figure 3).

plots <- plot_estimates(jagsdata = pkg_data, 
                        model_output = mod) 
mcmsupply functions." width="100%" height="40%" type="application/pdf">

Figure 3: The plotted posterior point estimates for each of the three sectors (public in blue, private commercial medical in grey, and private other in gold) for Nepal at the national administrative level over time with the 80% and 95% uncertainty interval denoted as shaded regions. Time in years is on the x-axis and the proportion of each contraceptive method supply share is on the y-axis. The survey observations are plotted as points with their associated standard error, plotted as vertical lines. This plot was produced using a multi-country set up of the mcmsupply functions.

If the user wishes to pull out specific estimates for a given country and year, they may do so using the pull_estimates function. The model object, country name and the year of interest are supplied, the function then returns a tibble of the corresponding estimates.

estimates_2018 <- pull_estimates(model_output = mod, country = 'Nepal', year=2018)

5.2 Case 2: Estimating contraceptive method supply shares at the national administration level for a single-country

This case considers when the estimates at the national administration level are required for only one country. Rather than running a multi-country model, which takes several hours, a quicker alternative is the single-country approach, which takes only a few minutes. The main difference between the multi-country and single-country model outputs is that the model estimates of the single-country models have slightly larger uncertainty. This is especially evident where data is absent for a particular method. For example in Figure 3, the width of the 95% credible intervals over time for implants estimated by the multi-country national model are smaller than those estimated in the single-country model (Figure 4). The arguments local and mycountry control the single-country estimation models in mcmsupply. The user begins as by retrieving the data for Nepal only using the get_data function. They set local=TRUE and specify which country they are interested in by setting mycountry=Nepal.

cleaned_natdata <- get_data(national = TRUE, 
                            local = TRUE,
                            mycountry = "Nepal")

These arguments are the only discernible differences in the commands for users, the rest of the workflow is as described above in Case 1. A complete workflow for this use case can be found in vignettes/national_singlecountry_mod. The single-country model produces model estimates that align with those estimated by the multi-country estimation model.

mcmsupply functions." width="100%" height="40%" type="application/pdf">

Figure 4: The plotted posterior point estimates for each of the three sectors (public in blue, private commercial medical in grey, and private other in gold) for Nepal at the national administrative level over time with the 80% and 95% uncertainty interval denoted as shaded regions. Time in years is on the x-axis and the proportion of each contraceptive method supply share is on the y-axis. The survey observations are plotted as points with their associated standard error, plotted as vertical lines. This plot was produced using a single-country set up of the mcmsupply functions.

5.3 Case 3: Estimating contraceptive method supply shares at the subnational administration level for multiple countries simultaneously

The use case for estimating the contraceptive supply shares via a multi-country model for the subnational administration division is given by the vignette subnational_multicountry_models. This vignette takes approximately 24 hours to run on a machine with 1.6 GHz Dual-Core Intel Core i5 processor and 8GB of RAM. The dataset contains observations for 225 subnational divisions, across 23 countries. The user begins by calling the multi-country dataset at the subnational administration level via the national argument.

cleaned_subnatdata <- get_data(national = FALSE)

The remaining workflow is the same as described above in Case 1 and is not shown here. A complete workflow for this use case can be found in vignettes/subnational_multicountry_models.

5.4 Case 4: Estimating contraceptive method supply shares at the subnational administration level for a single-country

This use case is for considering use of the single-country model at the subnational administrative division. The user begins by retrieving the data for Nepal using the get_data function by setting the arguments national=FALSE, local=TRUE and mycountry=‘Nepal’.

cleaned_data <- get_data(national = FALSE, 
                         local = TRUE,
                         mycountry = "Nepal")

As in the previous use cases, the JAGS model is run and point summaries are calculated via the run_jags_model function. The visualisations for the subnational regions of Nepal are returns as a list via the plot_estimates function. An example of these visualisations is given in Figure 5, where the estimated median method supply shares for Central region of Nepal are plotted with 80% and 95% credible intervals over time in each of the methods.

Figure 5: The plotted posterior point estimates for each of the three sectors (public in blue, private commercial medical in grey, and private other in gold) taken from the subnational single-country population for the Central subnational region of Nepal over time with both 80% and 95% uncertainty denoted as corresponding shaded regions. Time in years is on the x-axis and the proportion of each contraceptive method supply share is on the y-axis. The survey observations are plotted as points with their associated standard error, plotted as vertical lines.

5.5 Case 5: Estimating contraceptive method supply shares at the national/subnational administration level for a single-country using custom data

It is possible to include custom datasets when estimating contraceptive method supply shares at either the national or subnational administration level. The set-up for using custom data is very similar to the above processes, with a small difference in the data retrieval step using the get_data function. When using a custom dataset, the user defines the location of the .xlsx file containing the custom data.

cleaned_data <- get_data(national = FALSE, 
                         local = TRUE, 
                         surveydata_filepath = 
                           "inst/data-raw/my_custom_data_good.xlsx",
                         mycountry = "Ethiopia")

The file must be in .xlsx format and match the layout of either the national_FPsource_data or subnat_FPsource_data datasets, depending on your desired administration level. The get_data function will carry out a series of internal checks to ensure the custom data matches the stored data layout. If the custom data is not suitable, the get_data function will return an error and a message to the user describing the issue with the custom data. A description of each column is given in Table ??. Once the custom data checks are complete and passed, the regular workflow for fitting an mcmsupply model continues. The JAGS model inputs are retrieved using the get_modelinputs and the JAGS model is fit using run_jags_model. The summary estimates are plotted using the plot_estimates function without any further changes to the workflow. A vignette of how to run the subnational model using a custom dataset can be found vignettes/subnational_singlecountry_customdata_models.

6 Discussion

In this paper we have introduced the package mcmsupply. The primary purpose of this package is to estimate and project modern contraceptive method supply shares from the public, private commercial medical and private other sectors with uncertainty, for a selection of countries participating in the FP2030 initiative. The mcmsupply package produces estimates within the period of available Demographic and Health Survey data for a given country, as well as projections beyond the most recent data point. The package uses either stored DHS survey data or custom user-supplied survey data as inputs to the modelling process. The package implements four model types. These models vary by the administration level of their outcome estimates (national or subnational administration level) and dataset type utilised in the estimation (multi-country or single-country contraceptive market supply datasets). The modelling framework uses penalised splines to capture temporal nature of the data. These splines utilise correlations between changes in method supply shares that exist within the data. Using splines informed by cross-method correlations allows us to capture the complex shape of the data without over-fitting. Bayesian hierarchical estimation is another key element of this model estimation process. We take advantage of the geographical nature of the data, such that the expected sector, province (subnational level) or country (national level), method-specific supply shares are informed based on the existing geographical structures in the data. This promotes information sharing across locations, and better inform estimates in areas where smaller amounts of data are present. Case studies illustrate how to use the mcmsupply for each of the four potential modelling routes, as well as how to estimate contraceptive method supply shares using custom user-supplied data.

The mcmsupply package has many benefits to users. Firstly, it is the first of its kind to produce annual estimates with uncertainty for monitoring contraceptive method supply shares over time at the national and subnational levels. On average, most countries carry out DHS surveys every 5-6 years, but in some instances the wait time between surveys may be even longer (The DHS Program - Demographic and Health Survey (DHS) 2023). The family planning community use contraceptive method supply shares to evaluate the stability and sustainability of a given country’s contraceptive market (Bradley and Shiras 2022). Contraceptive method supply share estimates are also pivotal in effectively managing contraceptive commodity supply-chains through a ‘total market approach’ (TMA). A TMA approach to the family planning supply market seeks to engage the public, private commercial, and other sectors in a country to increase family planning users access to vital information, products, and services (Moazzam 2015; Private Sector (SHOPS) Plus 2016). Prior to this package, individuals who required contraceptive method supply shares for a given country relied on estimates from the most recent DHS survey in the country, regardless of its age. The mcmsupply package alleviates this issue and provides the family planning community annual estimates with uncertainty for these contraceptive method supply shares. Secondly, contraceptive method supply shares estimates are not only a stand -alone family planning indicator but are also used in the calculation of an another indicator, estimated modern use (EMU) (Track20 2020b). EMUs aim to measure the proportion of women, aged 15 to 49 years old , who are currently using any modern method of contraception, and are derived from routinely collected family planning service statistics (Track20 2020a). Currently, EMU calculations depend on an adjustment that relies on the most recent DHS survey to provide estimates of the contraceptive supply share market in a given country. Now this adjustment can instead rely on the annual estimates and projections with uncertainty produced by the mcmsupply package. This can serve to improve the overall accuracy of EMUs with respect to their ability to accurately measure modern contraceptive use. In addition, given the probabilistic nature of the mcmsupply estimates the associated uncertainty can be propagated into the EMU calculations.

The last key benefit of the mcmsupply package is the speed at which the user is able to access individual countries supply share estimates. Using the single-country models, at either the national or subnational level, provides users with annual estimates of the method supply shares with uncertainty within minutes. This fast estimation approach is computationally efficient while still producing reliable estimates. Using priors informed by multi-country model subcontinental and country-level median parameter estimates, this modelling approach is robust to spurious parameter estimates even when estimating supply shares for countries with fewer surveys available. This computational efficiency without a loss of model accuracy makes the mcmsupply package user-friendly and efficient for regular data analysis.

The mcmsupply is not without its limitations. We remark that the implementation of these models in JAGS have not been optimised. The multi-country models at the national and subnational levels take hours to run and are intensive on computer memory and CPU. Hence, improvements such as matrix operations rather than for-loops would greatly improve the computation efficiency of this package. Another limitation of this package is that in methods without any survey information, the uncertainty intervals tend to be large. This is especially evident in the single-country models where in the absence of data for a given method, the uncertainty of the associated estimates is even larger than that of the corresponding multi-country model estimates. These limitations inspires our future work, where we seek to improve the computational efficiency of these models. We would also like to investigate the potential for incorporating additional covariates into the models, such as average method pricing for each sector, to improve the uncertainty of model estimates and projections where no DHS survey data is available. Lastly, we wish to design an R-shiny app that promotes the use of these model estimates among statistical non-experts within the family planning community.

7 Conclusions

mcmsupply is an R package that estimates the modern contraceptive method supply shares at the national and subnational administrative divisions over time for countries participating in the Family Planning 2030 initiative. The package provides the user an easy and accessible way to produce annual estimates with uncertainty using Bayesian hierarchical penalised spline models with cross-method correlations at the national and subnational administration levels. These annual estimates with uncertainty may act as a stand-alone family planning indicator of the stability of the modern contraceptive supply market or be used to produce alternative family planning indicators, such as estimated modern use (EMUs) using service statistics (Track20 2020b). To the best of our knowledge, the package is the first of its kind to estimate these supply shares at both administrative levels. Using an R package to disseminate this work aligns with the findability, accessibility, interoperability, and reusability (FAIR) principles of scientific data (Wilkinson et al. 2016). The data used in the package is cited and explained thoroughly, the code is commented and easy to understand should a user wish to tweak or review any functionalities, and finally it is reusable by the very nature of the R package.

7.1 Supplementary materials

Supplementary materials are available in addition to this article. It can be downloaded at RJ-2024-020.zip

7.2 CRAN packages used

mcmsupply, tidyverse, ggplot2, dplyr, rjags, R2jags

7.3 CRAN Task Views implied by cited packages

Bayesian, ChemPhys, Cluster, Databases, Environmetrics, GraphicalModels, MixedModels, ModelDeployment, Phylogenetics, Spatial, TeachingStatistics

T. J. Bossert and J. C. Beauvais. Decentralization of health systems in Ghana, Zambia, Uganda and the Philippines: a comparative analysis of decision space. Health Policy and Planning, 17(1): 14–31, 2002. URL https://doi.org/10.1093/heapol/17.1.14.
S. E. K. Bradley and T. Shiras. Where Women Access Contraception in 36 Low- and Middle-Income Countries and Why It Matters. Global Health: Science and Practice, 10(3): 2022. URL https://www.ghspjournal.org/content/10/3/e2100525.
H. Comiskey, L. Alkema and N. Cahill. Estimating the proportion of modern contraceptives supplied by the public and private sectors using a Bayesian hierarchical penalized spline model. Journal of the Royal Statistical Society Series A: Statistics in Society, 2024. URL https://doi.org/10.1093/jrsssa/qnae051.
FP2030. Rights and Empowerment Principles for Family Planning. FP2030 | United Nations Foundation. 2021. URL https://commitments.fp2030.org/principles.
A. Gelman, A. Vehtari, D. Simpson, C. C. Margossian, B. Carpenter, Y. Yao, L. Kennedy, J. Gabry, P.-C. Bürkner and M. Modrák. Bayesian workflow. 2020. URL http://arxiv.org/abs/2011.01808.
E. Herger Boyle, M. King and M. Sobek. IPUMS-Demographic and Health Surveys: Version 9 [dataset]. 2022. URL https://doi.org/10.18128/D080.V9.
K. Hornik, F. Leisch, A. Zeileis and M. Plummer. JAGS: A Program for Analysis of Bayesian Graphical Models Using Gibbs Sampling. 2003. URL http://www.ci.tuwien.ac.at/Conferences/DSC-2003/.
ICF. Demographic and Health Surveys (various) [Datasets]. 2004.
V. Knutson, S. Aleshin-Guendel, A. Karlinsky, W. Msemburi and J. Wakefield. Estimating global and country-specific excess mortality during the Covid-19 pandemic. The Annals of Applied Statistics, 17(2): 1353–1374, 2023. URL https://doi.org/10.1214/22-AOAS1673.
Q. Li, T. A. Louis, L. Liu, C. Wang and A. O. Tsui. Subnational estimation of modern contraceptive prevalence in five sub-Saharan African countries: A Bayesian hierarchical approach. BMC Public Health, 19(1): 2019. DOI 10.1186/s12889-019-6545-3.
D. Lunn, N. Best, D. Spiegelhalter, G. Graham and B. Neuenschwander. Combining MCMC with ’sequential’ PKPD modelling. Journal of pharmacokinetics and pharmacodynamics, 36(1): 19–38, 2009. URL https://pubmed.ncbi.nlm.nih.gov/19132515/.
L. D. Mercer, F. Lu and J. L. Proctor. Sub-national levels and trends in contraceptive prevalence, unmet need, and demand for family planning in Nigeria with survey uncertainty. BMC Public Health, 19(1): 2019. DOI 10.1186/s12889-019-8043-z.
A. Moazzam. Ensuring contraceptive security through effective supply chains: WHO/RHR/17.09. 11–13 pages. World Health Organization, United Nations Population Fund; http://www.popcouncil.org/uploads/pdfs/FP{_}Evidence{_}supply{_}chains{_}FINAL{_}07.10.17.pdf. 2015. Accessed: 2023-03-30.
M.-F. Munoz and M. Ali. Commitment to FP2030. 2022. URL https://cdn.who.int/media/docs/default-source/reproductive-health/who-srh-fp2030.pdf?sfvrsn=780d62cb_3.
J. R. New, N. Cahill, J. Stover, Y. P. Gupta and L. Alkema. Levels and trends in contraceptive prevalence, unmet need, and demand for family planning for 29 states and union territories in India: a modelling study using the Family Planning Estimation Tool. The Lancet Global Health, 5(3): e350–e358, 2017. URL http://dx.doi.org/10.1016/S2214-109X(17)30033-5.
M. Plummer. Cuts in Bayesian graphical models. Statistics and Computing, 25(1): 37–43, 2015. URL https://dl.acm.org/doi/10.1007/s11222-014-9503-z.
S. H. O. through the Private Sector (SHOPS) Plus. Total market approach compendium. 2016. URL https://shopsplusproject.org/tmacompendium.
D. Sharrow, L. Hug, P. Danzhen You, P. Leontine Alkema, R. Black, S. Cousens, T. Croft, V. Gaigbe-Togbe, P. Gerland, M. Guillot, et al. National, regional, and global levels and trends in neonatal mortality between 1990 and 2017, with scenario-based projections to 2030: a systematic analysis. The Lancet Global Health, 7(6): e710–e720, 2019. URL http://dx.doi.org/10.1016/S2214-109X(19)30163-9.
Y.-S. Su and Masanao Yajima. R2jags: Using r to run JAGS.” 2021. URL https://CRAN.R-project.org/package=R2jags. R package version 0.7-1.
The DHS Program - Demographic and Health Survey (DHS). 2023. URL https://dhsprogram.com/methodology/survey-Types/dHs.cfm [online; last accessed August 31, 2022].
Track20. Monitoring Progress in Family Planning Estimated Modern Use (EMU): A New Service Statistics-Based Family Planning Indicator. 2020a. URL http://fpet.track20.org/fpet/.
Track20. SS to EMU Tool: Process Guide. Avenir Health. 2020b. URL http://www.track20.org/pages/track20_tools/SS_to_EMU_tool.php.
M. D. Ugarte, T. Goicoa and A. F. Militino. Spatio-temporal modeling of mortality risks using penalized splines. Environmetrics, 21(3-4): 270–289, 2010. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/env.1011.
M. D. Ugarte, T. Goicoa, A. F. Militino and M. Durbán. Spline smoothing in small area trend estimation and forecasting. Computational Statistics & Data Analysis, 53(10): 3616–3629, 2009. URL https://EconPapers.repec.org/RePEc:eee:csdana:v:53:y:2009:i:10:p:3616-3629.
A. Vehtari, A. Gelman, D. Simpson, B. Carpenter and P. C. Burkner. Rank-Normalization, Folding, and Localization: An Improved (Formula presented) for Assessing Convergence of MCMC (with Discussion)*†. Bayesian Analysis, 16(2): 667–718, 2021a. DOI 10.1214/20-BA1221.
A. Vehtari, A. Gelman, D. Simpson, B. Carpenter and P.-C. Bürkner. Rank-normalization, folding, and localization: An improved Rˆ for assessing convergence of MCMC (with discussion). Bayesian Analysis, 16(2): 2021b. URL http://dx.doi.org/10.1214/20-BA1221.
H. Wickham. ggplot2: Elegant graphics for data analysis. Springer-Verlag New York, 2016. URL https://ggplot2.tidyverse.org.
M. D. Wilkinson, M. Dumontier, Ij. J. Aalbersberg, G. Appleton, M. Axton, A. Baak, N. Blomberg, J. W. Boiten, L. B. da Silva Santos, P. E. Bourne, et al. Comment: The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3: 2016. DOI 10.1038/sdata.2016.18.
T. Williamson, S. Duvall, A. A. Goldsmith, K. Hardee and R. Mbuya-Brown. The Effects of Dencentralization on Family Planning: A Framework for Analysis. Futures Group, Health Policy Project. 2014. URL https://www.healthpolicyproject.com/pubs/445_FPDecentralizationFINAL.pdf.

  1. At the time of publication it is expected that this material will be made available via another open access venue.↩︎

References

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Comiskey & Cahill, "mcmsupply: An R Package for Estimating Contraceptive Method Market Supply Shares", The R Journal, 2025

BibTeX citation

@article{RJ-2024-020,
  author = {Comiskey, Hannah and Cahill, Niamh},
  title = {mcmsupply: An R Package for Estimating Contraceptive Method Market Supply Shares},
  journal = {The R Journal},
  year = {2025},
  note = {https://doi.org/10.32614/RJ-2024-020},
  doi = {10.32614/RJ-2024-020},
  volume = {16},
  issue = {2},
  issn = {2073-4859},
  pages = {1}
}