HostSwitch: An R Package to Simulate the Extent of Host-Switching by a Consumer

In biology, a host switch is generally defined as an organism (the consumer) using a new host (which represents a resource). A consumer may switch hosts through its pre-existing capability to use a sub-optimal resource. The HostSwitch R package provides functions to simulate the dynamics of host switching (extent and frequency) in the population of a consumer that interacts with current and potential hosts over generations. The HostSwitch package is based on an Individual-Based mock-up model published in FORTRAN by Araujo et al. (2015). The package substantially improves on the previous mock-up model by implementing new functionality, such as comparison and evaluation of simulations with several customizable parameters that accommodate several types of biological consumer-host associations, an interactive visualization of the model, and an in-depth description of the parameters in a biological context. Finally, we provide three real-world scenarios taken from the literature in ecology, agriculture and parasitology. This package is intended for researchers in the broad field of biology who are interested in simulating host switches in different types of symbiotic biological associations.

Valeria Trivellone (University of Illinois at Urbana-Champaign), Sabrina B. L. Araujo (Universidade Federal do Paraná), Bernd Panassiti (Independent researcher)
2023-02-10

1 Introduction

In several branches of biology (for example ecology, evolution, and parasitology), host-switching (or host shift) is generally defined as a consumer using a newly colonized host, which represents its resource. Different spatial and temporal outcomes may result from the new host-consumer association, depending on whether or not the colonization is successful. Studies on the evolution of biotic associations rely on an increasing body of literature covering all prototypical examples of symbiotic relationship categories (Thompson 2010). Symbiosis sensu lato is defined here as any interaction between two organisms of different species. The possible influence between interacting organisms has been placed in a continuum of association types defined by their role, direction, and extension, and it varies from mutual consumption to unidirectional exploitation (Dimijian 2000a; Dimijian 2000b). In interspecific associations, the interacting organisms may play the role of strict consumer (e.g., predator) or resource (e.g., prey). The evolution of these associations may include several concatenated speciation events affecting one or both species, and it is driven by four main processes: cospeciation, host switch, failure to speciate and "missing the boat" (Page 2002). While the prevalent paradigm considers cospeciation to be the main process driving the evolution of most biological associations, recent evidence has shown that, given the opportunity, a consumer may use a sub-optimal resource (or host) by host-switching without the need for any genetic innovation. This may explain the rapid origin of novel associations (i.e., colonization of novel hosts at the ecological time scale), eventually followed by speciation (at the evolutionary time scale), as well as the observed incongruences between paired phylogenies (see Brooks et al. (2019) for a review). Computer simulation modeling provides a valid tool to understand the ecological and evolutionary dynamics of interacting species.
To theoretically support the importance of host switch events, an Individual-Based Model (IBM) was proposed by Araujo et al. (2015) (mock-up model hereafter). The model simulates the extent of host-switching in a host-parasite association, formalized as the probability that an individual disperses and successfully colonizes a novel host. More recently, Feronato et al. (2021) extended the model by exploring the significance and the interaction of three parameters thought to be of paramount importance for the acquisition of a new host by a parasite. Using simulated data, these two initial papers provided important insights into the dynamics of host switches for a parasite species. The main results were that switching to a new host does not require prior evolutionary novelty, that pathogens may survive on sub-optimal hosts, which increases the chance of switching to more distant hosts, and that some parameters facilitate host switching (e.g., mutation and reproductive rate).

Both models were coded and run in FORTRAN. Although it is available as an executable (Windows, Linux, and MacOS), the previous mock-up model lacks a user-friendly interface, and the manipulation of parameters is limited. Access to the model is currently restricted to users who know how to write and compile FORTRAN code.

We present here a user-friendly R package, called HostSwitch, which improves the earlier version published in FORTRAN (Araujo et al. 2015; Feronato et al. 2021) in three main ways:

(1) by increasing the accessibility of the model to researchers who are not familiar with FORTRAN;

(2) by including customizable parameters, some previously unreleased (e.g., \(jump\_back\)), that allow us to extend the mock-up model from strict pathology (i.e., simulating host switches by pathogens) to other branches of biology such as ecology, microbiology, and agriculture, covering a much broader spectrum of symbiotic sensu lato associations (see the Implementation and Usage scenarios sections for further details) and reaching a broader audience;

(3) by providing in-depth descriptions of the parameters in a biological context, and examples of possible uses with real world data (see Usage scenarios section).

To our knowledge, there are no R packages that simulate host switch events using the theoretical approach briefly presented above and widely discussed in previous papers (Agosta et al. 2010; Araujo et al. 2015; Brooks et al. 2019). However, it is worth mentioning that there is one putatively similar package that may be used to simulate host switches. The package EpiILM (Vineetha Warriyar et al. 2020) uses discrete-time individual-level models to simulate the dynamics of infectious disease transmission. The EpiILM package differs from the one presented here in a fundamental way: HostSwitch simulates the host switch from the point of view of the consumer, rather than the host, and formalizes the probability as random encounters with new hosts that differ from one another. Further details of the model formalized in EpiILM are presented in Deardon et al. (2010).

In Section "The model", we describe the mathematical framework for simulating the dispersion and survival of individuals of a population on a novel resource, which together describe a host-switching event. In Section "Implementation", we describe the implementation of the mock-up model in the HostSwitch package, including a description of the arguments of the main functions. In Section "Usage scenarios", we use empirical data gathered from the literature to simulate and compare different scenarios of host switch. We also present possible hypotheses and research questions that can be explored using the simulation approach.

2 The model

The simulation model of HostSwitch aims to measure the dynamics of host switching (extent and frequency) in the population of an organism (hereafter Consumer) that interacts with current and potential hosts (hereafter Resource) over generations. A successful host switch implies that a Consumer may colonize a new Resource, which in turn imposes selection pressure that affects the Consumers’ survival. Host-switching relies on a mechanism of ecological readjustment or ecological fitting, i.e., the capability of the Consumer to use a similar Resource even if it is sub-optimal (Janzen 1985; Agosta and Klemens 2008). The fundamental aspect of the HostSwitch simulation model is to track, summarize and compare the dispersion and successful host switch events on a new Resource by the populations of the Consumer. Although the model and the basic parameters have been previously described in Araujo et al. (2015) and Feronato et al. (2021), we provide here a revised description of the model dynamics that accommodates all symbiotic (sensu lato) associations.

The Resource is characterized by a real number, \(p_{Opt}\), randomly selected from a uniform distribution ranging from \(p_{Res\_min}\) to \(p_{Res\_max}\). This number represents the optimum phenotype imposed on Consumers by the Resource. In addition, the Resource imposes a carrying capacity K on the Consumer population. Individuals of the Consumer have a phenotype which can evolve over generations due to the emergence of novelties (hereafter mutations). Each individual consumer \(i\) is characterized by one phenotype \(p_{i}\). The simulation starts (generation n = 0) with all M consumer individuals having the same phenotype value (\(p_{i}\) = \(p_{Ind}\)), which is equal to the midpoint of the resource range ((\(p_{Res\_min}\) + \(p_{Res\_max}\))/2). At each generation, a novel Resource is offered and each Consumer has a probability \(mig\) of migrating from the current to the novel Resource. The number of dispersing individuals is calculated by assigning to each individual a value drawn from a uniform distribution. All the individuals with a value lower than \(mig\) disperse to the new Resource. The assigned value may be interpreted ecologically as the intrinsic or extrinsic characteristics involved in the dispersal event (e.g., morphological features, environmental constraints). The parameter \(mig\) defines a criterion for inclusion, with higher values allowing more individuals to disperse at each generation. The dispersion event has two possible outcomes: no migration (m1) and migration (m2).
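The set-up and dispersal step described in this paragraph can be illustrated with a few lines of base R. This is only a minimal sketch, not the package's implementation: the variable names and the numeric values (range 1 to 10, 20 individuals, \(mig\) = 0.3) are assumptions chosen for illustration, and the uniform draw is assumed to lie on [0, 1].

```r
# Minimal sketch of generation 0 and one dispersal event (base R only).
# All values below are illustrative assumptions, not package defaults.
set.seed(1)

p_res_min <- 1     # lower bound of the Resource phenotype range
p_res_max <- 10    # upper bound of the Resource phenotype range
M         <- 20    # number of Consumer individuals at generation 0
mig       <- 0.3   # per-individual probability of dispersing

# Optimum phenotype imposed by the novel Resource, drawn uniformly
p_opt <- runif(1, min = p_res_min, max = p_res_max)

# Generation 0: every Consumer starts at the midpoint of the Resource range
p_ind      <- (p_res_min + p_res_max) / 2
phenotypes <- rep(p_ind, M)

# Each individual receives a uniform value; those below `mig` disperse
u         <- runif(M)
disperses <- u < mig

n_dispersing <- sum(disperses)
outcome <- if (n_dispersing > 0) "m2 (migration)" else "m1 (no migration)"
```

Raising `mig` increases the expected number of dispersing individuals (on average `M * mig` per generation), which is the sense in which the parameter acts as a criterion for inclusion.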

All other individuals are ignored in subsequent steps. The novel Resource then becomes the current Resource and the individuals reproduce with a net reproduction rate \(b\), limited by the carrying capacity K. The offspring’s phenotype (inherited from the parents, plus variation \(\delta\)) is assigned to each individual and calculated using the normal probability function: \[\begin{gathered} P (\delta) = \exp{\left[\frac{-\delta^2}{2\sigma_{mut}^2}\right]} , \end{gathered}\] where \(\delta\) is the random phenotypic variation assigned to each individual and \(\sigma_{mut}\) is the standard deviation for mutation. The value of \(\delta\) is drawn at random from a Gaussian distribution centered on zero with standard deviation \(\sigma_{mut}\). Offspring phenotypes are equal to the arithmetic average of their parents’ phenotypes plus \(\delta\) (i.e., the extent of genetic novelty introduced with reproduction). The descendants replace their parents and populate generation n+1, which starts over with another dispersion event to a novel Resource.
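The reproduction step can be sketched in base R as follows. The function name `reproduce` is hypothetical, and two points are assumptions that the text does not specify: that a rate \(b\) means \(b\) offspring per individual before the cap at K, and that the two parents of each offspring are drawn at random with replacement from the individuals on the Resource. Sampling \(\delta\) with `rnorm` corresponds to the Gaussian form of \(P(\delta)\) given above, up to its normalizing constant.

```r
# Minimal sketch of reproduction with mutation (base R only).
# `reproduce` is a hypothetical helper, not a function of the package.
reproduce <- function(parents, b, K, sigma_mut) {
  if (length(parents) == 0) return(numeric(0))

  # Assumption: b offspring per individual, capped at carrying capacity K
  n_offspring <- min(b * length(parents), K)

  # Assumption: both parents are drawn at random, with replacement
  i <- sample.int(length(parents), n_offspring, replace = TRUE)
  j <- sample.int(length(parents), n_offspring, replace = TRUE)

  # Phenotypic variation: Gaussian, mean zero, standard deviation sigma_mut
  delta <- rnorm(n_offspring, mean = 0, sd = sigma_mut)

  # Offspring phenotype: average of the parents' phenotypes plus delta
  (parents[i] + parents[j]) / 2 + delta
}

set.seed(2)
colonizers <- c(5.1, 5.4, 5.6, 5.9)   # hypothetical phenotypes
next_gen   <- reproduce(colonizers, b = 10, K = 100, sigma_mut = 0.2)
```

With four individuals and `b = 10` the call returns 40 offspring; with 20 individuals the same rate would be capped at `K = 100`.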

The main steps of the model are summarized in Figure 1.