TreeSearch: Morphological Phylogenetic Analysis in R

Martin R. Smith

doi:10.32614/RJ-2023-019

1 Introduction

Even in the phylogenomic era, morphological data make an important contribution to phylogenetic questions. Discrete phenotypic data improve the accuracy and resolution of phylogenetic reconstruction even when outnumbered by molecular characters, and are the only way to incorporate the unique perspective on historical events that fossil taxa provide (Wiens 2004; Wortley and Scotland 2006; Koch and Parry 2020; Asher and Smith 2022).

One challenge with morphological analysis is the treatment of inapplicable character states: for example, ‘tail colour’ cannot logically be ascribed either of the states ‘red’ or ‘blue’ in a taxon that lacks a tail (Maddison 1993). This situation can profoundly mislead phylogenetic analysis, and is not handled appropriately by any standard Markov model or parsimony method.

Solutions to this issue have recently been proposed (De Laet 2005; Brazeau et al. 2019; Tarasov 2019, 2022; Goloboff et al. 2021; Hopkins and St. John 2021). Where a single ‘principal’ character (e.g. ‘tail’) exhibits \(n\) ‘contingent’ characters (e.g. ‘tail colour’, ‘tail covering’), ‘exact’ solutions (Tarasov 2019, 2022; Goloboff et al. 2021) require the construction of multi-state hierarchies containing \(O(2^n)\) entries, meaning that analysis is only computationally tractable for simple hierarchies with few contingent characters. Moreover, these approaches cannot accommodate characters that are contingent on more than one principal character: for example, characters describing appendages on a differentiated head may be contingent on the presence of the two characters ‘appendages’ and ‘differentiated head’.

Such situations can be approached using the flexible parsimony approximation proposed by Brazeau et al. (2019). TreeSearch scores trees using the “Morphy” C implementation of this algorithm (Brazeau et al. 2017). Morphy implements tree search under equal step weights. TreeSearch additionally implements implied step weighting (Goloboff 1993), a method which consistently finds more accurate and precise trees than equal weights parsimony (Goloboff et al. 2008, 2018a; Smith 2019a).

There has been lively discussion as to whether, with the rise of probabilistic approaches, parsimony remains a useful tool for morphological phylogenetics (e.g. O’Reilly et al. 2016; Brown et al. 2017; Puttick et al. 2017; Goloboff et al. 2018b; Sansom et al. 2018). Notwithstanding scenarios that go beyond the limits of parsimony, such as the simultaneous incorporation of stratigraphic data and other prior knowledge (e.g. Guenser et al. 2021), neither parsimony nor probabilistic methods consistently recover ‘better’ trees when gains in accuracy are balanced against losses in precision (Smith 2019a). Even if probabilistic methods may eventually be improved through the creation of more sophisticated models that better reflect the nature of morphological data (Goloboff et al. 2018a; Tarasov 2019, 2022), parsimony analysis remains a useful tool – not only because treatments of inapplicable character states are presently available, but also because it facilitates a deeper understanding of the underpinning data by emphasizing the reciprocal relationship between a tree and the synapomorphies that it implies.

Whatever method is used to find phylogenetic trees, a single consensus tree may fail to convey all the signal in a set of phylogenetic results (Wilkinson 1994, 1996, 2003). A set of optimal trees can be better interpreted by examining consensus trees generated from clusters of similar trees (Stockham et al. 2002); by exploring tree space (Wright and Lloyd 2020; Smith 2022a); and by automatically identifying, annotating and removing ‘wildcard’ taxa (Smith 2022b) whose ‘rogue’ behaviour may reflect underlying character conflict or ambiguity (Kearney 2002). These methods are not always easy to integrate into phylogenetic workflows, and so are not routinely included in empirical studies.

Figure 1: Flow charts summarizing key functions available in TreeSearch.

TreeSearch provides functions that allow researchers to engage with the three main aspects of morphological phylogenetic analysis: dataset construction and validation; phylogenetic search (including with inapplicable data); and the interrogation of optimal tree sets (Fig. 1). These functions can be accessed via the R command-line, as documented within the package and at ms609.github.io/TreeSearch, or through a graphical user interface (GUI). The GUI includes options to export a log of executed commands as a fully reproducible R script, and to save outputs in graphical, Nexus or Newick formats.

2 Implementation

Tree scoring

TreeSearch can score trees using equal weights, implied weighting (Goloboff 1993), or profile parsimony (Faith and Trueman 2001). The function TreeLength() calculates tree score using the “Morphy” phylogenetic library (Brazeau et al. 2017), which implements the Fitch (1971) and Brazeau et al. (2019) algorithms. Morphy returns the equal weights parsimony score of a tree against a given dataset. Implied weights and profile parsimony scores are computed by first making a separate call to Morphy for each character in turn, passed as a single-character dataset; then passing this value to the appropriate weighting formula and summing the total score over all characters.

Implied weighting (Goloboff 1993) is an approximate method that treats each additional step (i.e. transition between tokens) in a character as less surprising – and thus requiring less penalty – than the previous step. Each additional step demonstrates that a character is less reliable for phylogenetic inference, and thus more likely to contain additional homoplasy. The score of a tree under implied weighting is \(\sum_i{\frac{e_i}{e_i+k}}\), where \(k\) is the concavity constant and \(e_i\) denotes the number of extra steps observed in character \(i\), derived by subtracting the minimum score that the character can obtain on any tree from the score observed on the tree in question (Goloboff 1993). The minimum score of a character is one less than the number of unique tokens (excluding the inapplicable token ‘-’) that must be present.
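As a worked illustration of the formula above, the following base R sketch computes an implied weights score from per-character step counts. The helper name `ImpliedScore()` and the step counts are hypothetical, chosen purely for illustration; this is not the code used by TreeSearch, which obtains per-character scores from Morphy.

```r
# Hypothetical helper: implied weights score from per-character step counts.
# steps:    steps observed for each character on the tree in question
# minSteps: minimum steps that each character can require on any tree
# k:        concavity constant
ImpliedScore <- function(steps, minSteps, k = 10) {
  extra <- steps - minSteps          # e_i, the extra (homoplastic) steps
  stopifnot(all(extra >= 0), k > 0)
  sum(extra / (extra + k))           # sum over characters of e_i / (e_i + k)
}

# Three illustrative characters with two, two and three tokens respectively
steps    <- c(1, 2, 5)
minSteps <- c(1, 1, 2)               # one less than the number of tokens
score <- ImpliedScore(steps, minSteps, k = 10)
# Extra steps are 0, 1 and 3, so the score is 0 + 1/11 + 3/13

# The penalty added by each successive extra step shrinks
penalty <- (0:4) / ((0:4) + 10)
increments <- diff(penalty)
```

Because each entry of `increments` is smaller than the last, a fourth extra step in an already homoplastic character costs less than the first extra step in a character that otherwise fits the tree perfectly.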

Profile parsimony (Faith and Trueman 2001) represents an alternative formulation of how surprising each additional step in a character is (Arias and Miranda-Esquivel 2004): the penalty associated with each additional step in a character is a function of the probability that a character will fit at least as well as is observed on a uniformly selected tree. On this view, an additional step is less surprising if observed in a character where there are more opportunities to observe homoplasy, whether because a character contains fewer ambiguous codings (a motivation for the ‘extended’ implied weighting of Goloboff (2014)) or because states are distributed more evenly in a character, whose higher phylogenetic information content (Thorley et al. 1998) corresponds to a lower proportion of trees in which no additional steps are observed.

TreeSearch calculates the profile parsimony score by computing the logarithm of the number of trees onto which a character can be mapped using \(m\) steps, using theorem 1 of Carter et al. (1990). As the calculation is more complex for characters with more states (Maddison and Slatkin 1991), the present implementation is restricted to characters that contain two informative applicable states, and uses the Fitch (1971) algorithm.
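The quantity that underlies profile parsimony can be illustrated by brute force in the smallest non-trivial case: four taxa, for which only three unrooted binary trees exist. The base R sketch below counts Fitch steps for a binary character on each tree, then reports the proportion of trees on which the character fits in at most \(m\) steps. It does not use the closed-form result of Carter et al. (1990); the helper `QuartetSteps()` is hypothetical, and expressing the result as a negative base-2 logarithm is an assumption made for illustration.

```r
# Fitch step count for a character on the four-taxon tree ((a, b), (c, d)).
QuartetSteps <- function(states, a, b, c, d) {
  # Fitch rule: take the intersection if non-empty (no step);
  # otherwise take the union and count one step
  FitchJoin <- function(x, y) {
    common <- intersect(x, y)
    if (length(common)) {
      list(set = common, steps = 0)
    } else {
      list(set = union(x, y), steps = 1)
    }
  }
  left  <- FitchJoin(states[a], states[b])
  right <- FitchJoin(states[c], states[d])
  root  <- FitchJoin(left$set, right$set)
  left$steps + right$steps + root$steps
}

states <- c(0, 0, 1, 1)  # taxa 1 and 2 share state 0; taxa 3 and 4 share state 1

# The three possible unrooted binary trees for four taxa
steps <- c(
  "12|34" = QuartetSteps(states, 1, 2, 3, 4),
  "13|24" = QuartetSteps(states, 1, 3, 2, 4),
  "14|23" = QuartetSteps(states, 1, 4, 2, 3)
)

# Proportion of trees on which the character fits in at most m steps
pAtMost <- vapply(1:2, function(m) mean(steps <= m), numeric(1))
names(pAtMost) <- c("m=1", "m=2")

# Surprisal, in bits, of fitting at least this well on a random tree
surprisal <- -log2(pAtMost)
```

The character requires one step on a single tree and two steps on the other two, so a one-step fit is observed on one tree in three, whereas a two-step fit is guaranteed and hence carries no information.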

Tree search

The TreeSearch GUI uses the routine MaximizeParsimony() to search for optimal trees using tree bisection and reconnection (TBR) searches and the parsimony ratchet (Nixon 1999). This goes beyond the heuristic tree search implementation in the R package phangorn (Schliep 2011) in two respects. First, it uses compiled C++ code to rearrange trees, which dramatically accelerates computation and thus increases the scale of dataset that can be analysed in reasonable time. Second, it supports TBR rearrangements, which explore larger neighbourhoods of tree space: TBR evaluates more trees than nearest-neighbour interchanges or subtree pruning and regrafting, incurring additional computational expense that is offset by a decreased likelihood that search will become trapped in a local optimum (Goeffon et al. 2008; Whelan and Money 2010).

By default, search begins from a greedy addition tree generated by function AdditionTree(), which queues taxa in a random order, then attaches each taxon in turn to the growing tree at the most parsimonious location. Search may also be started from neighbour-joining trees, or the results of a previous search.

Search commences by conducting TBR rearrangements – a hill-climbing approach that locates a locally optimal tree from which no tree accessible by a single TBR rearrangement has a better score. A TBR iteration breaks a randomly selected edge in the focal tree, and reconnects each possible pair of edges in the resultant sub-trees to produce a list of candidate trees. Entries that are inconsistent with user-specified topological constraints are removed; remaining trees are inserted into a queue and scored in a random sequence. If the score of a candidate tree is at least as good as the best yet encountered (within the bounds of an optional tolerance parameter \(\epsilon\), which allows the retention of almost-optimal trees in order to improve accuracy – see e.g. Smith (2019a)), this tree is used as the starting point for a new TBR iteration. Otherwise, the next tree in the list is considered. TBR search continues until the best score is found a specified number of times; a specified number of TBR break points have been evaluated without any improvement to tree score; or a set amount of time has passed.
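The role of the tolerance parameter \(\epsilon\) in the acceptance step can be sketched in a few lines of base R. The function `AcceptCandidate()` and the scores below are hypothetical; the sketch assumes only that lower scores are better and that a candidate is accepted when its score is no more than \(\epsilon\) worse than the best score yet encountered. It does not reproduce the compiled search routine.

```r
# Hypothetical acceptance rule; lower scores denote more parsimonious trees.
AcceptCandidate <- function(candidateScore, bestScore, epsilon = 0) {
  candidateScore <= bestScore + epsilon
}

best <- 1.52814                                  # best score yet encountered
candidates <- c(1.5641, 1.54329, 1.60, 1.52814)  # illustrative candidate scores

# With no tolerance, only a candidate that matches the best score is accepted
strict <- AcceptCandidate(candidates, best)

# With a tolerance of 0.05, almost-optimal candidates are accepted too
tolerant <- AcceptCandidate(candidates, best, epsilon = 0.05)
```

Here `strict` accepts only the fourth candidate, whereas `tolerant` also accepts the first two, whose scores lie within 0.05 of the best; the third candidate is rejected under both settings.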

When TBR search is complete, iterations of the parsimony ratchet (Nixon 1999) are conducted in order to search areas of tree space that are separated from the best tree yet found by ‘valleys’ that cannot be traversed by TBR rearrangements without passing through trees whose optimality score is below the threshold for acceptance. Each ratchet iteration begins by resampling the original matrix. A round of TBR search is conducted using this resampled matrix, and the tree thus produced is used as a starting point for a new round of TBR search using the original data. After a specified number of ratchet iterations, an optional final round of TBR search allows a denser sampling of optimal trees from the final region of tree space.
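The resampling step that begins each ratchet iteration can be pictured with a toy matrix. The base R sketch below assumes that resampling means drawing characters with replacement, which the description above does not specify; Nixon (1999) instead describes reweighting a randomly chosen subset of characters. The matrix values are hypothetical.

```r
set.seed(1)  # for a reproducible illustration

# Toy matrix: 4 taxa (rows) by 6 characters (columns); values are hypothetical
original <- matrix(
  c(0, 0, 1, 1, 0, 1,
    0, 1, 1, 0, 0, 1,
    1, 1, 0, 0, 1, 0,
    1, 0, 0, 1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:4), paste0("char", 1:6))
)

# Draw characters with replacement: some are duplicated and others omitted,
# which perturbs the optimality landscape without altering any coding
sampled <- sample(ncol(original), replace = TRUE)
resampled <- original[, sampled, drop = FALSE]

# Equivalent view: an integer weight for each original character
weights <- tabulate(sampled, nbins = ncol(original))
```

A tree that is optimal for `resampled` need not be optimal for `original`, which is what allows the subsequent round of TBR search on the original data to begin from a different region of tree space.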

A simple example search can be conducted using a morphological dataset included in the package, taken from Vinther et al. (2008):

library("TreeSearch")
vinther <- inapplicable.phyData[["Vinther2008"]]
trees <- MaximizeParsimony(vinther, concavity = 10, tolerance = 0.05)

The MaximizeParsimony() command performs tree search under implied weights with a concavity value of 10 (concavity = Inf would select equal weights), retaining any tree whose score is within 0.05 of the best score.

The resulting trees can be summarised according to their scores (optionally, against a different dataset or under a different weighting strategy, as specified by concavity) and the iteration in which they were first hit:

TreeLength(trees, dataset = vinther, concavity = 10) |> 
  signif() |>         # round to six significant digits (the default)
  table()             # tabulate by score


1.52814 1.54329  1.5641 
      3      45       4

attr(trees, "firstHit")

  seed  start ratch1 ratch2 ratch3 ratch4 ratch5 ratch6 ratch7  final 
     0     29      4      0     10      7      2      0      0      0

More flexible, if less computationally efficient, tree searches can be conducted at the command line using the TreeSearch(), Ratchet() and Bootstrap() commands, which support custom tree optimality criteria (e.g. Hopkins and St. John 2021).

Visualization

The distribution of optimal trees, however obtained, can be visualized through interactive mappings of tree space (Hillis et al. 2005; Smith 2022a). The TreeSearch GUI supports the use of information theoretic distances (Smith 2020a); the quartet distance (Estabrook et al. 1985); or the Robinson–Foulds distance (Robinson and Foulds 1981) to construct tree spaces, which are mapped into 2–12 dimensions using principal coordinates analysis (Gower 1966). The degree to which a mapping faithfully depicts original tree-to-tree distances is measured using the product of the trustworthiness and continuity metrics (Venna and Kaski 2001; Kaski et al. 2003; Smith 2022a), a composite score denoting the degree to which points that are nearby when mapped are truly close neighbours (trustworthiness), and the degree to which nearby points remain nearby when mapped (continuity). Plotting the minimum spanning tree – the shortest path that connects all trees (Gower and Ross 1969) – can highlight stress in a mapping (grey lines in Fig. 2): the spatial relationships of trees are distorted in regions where the minimum spanning tree takes a circuitous route to connect trees that are mapped close to one another (see fig. 1a–b in Smith 2022a).
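The mapping step can be illustrated with base R, whose `cmdscale()` function performs principal coordinates analysis. In the sketch below, the distances between six hypothetical trees are generated from made-up coordinates rather than computed from real trees, and the final comparison of mapped and original distances is a crude check of distortion, not the trustworthiness and continuity metrics described above.

```r
# Hypothetical stand-in for tree-to-tree distances: Euclidean distances
# between six made-up points that form two groups of three
points <- rbind(
  c(0, 0, 0), c(1, 0, 0.5), c(0, 1, 0.5),
  c(6, 5, 0), c(6, 6, 1),   c(7, 5, 1)
)
rownames(points) <- paste0("tree", 1:6)
distances <- dist(points)

# Principal coordinates analysis, mapping the distances into two dimensions
mapping <- cmdscale(distances, k = 2)

# Distances between mapped points, for comparison with the originals
mappedDistances <- dist(mapping)
distortion <- as.vector(distances) - as.vector(mappedDistances)
```

Each row of `mapping` gives the coordinates of one tree, ready for plotting with `plot(mapping)`. Values in `distortion` that are far from zero mark pairs of trees whose separation is under-represented in the two-dimensional map.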

Figure 2: Three-dimensional map visualizing progress in a tree search in the TreeSearch GUI. Optimal trees belong to three statistically distinct clusters with good support (silhouette coefficient \(>\) 0.5), characterized by different relationships between certain taxa (plotting symbols). Although multiple ratchet iterations have visited each cluster, limited overlap between ratchet iterations suggests that a continuation of tree search may sample novel optimal trees. High trustworthiness and continuity values and a simple minimum spanning tree (grey) indicate that the mapping does not exhibit severe distortion. This figure depicts the tree space GUI display after loading the Wills et al. (2012) dataset; clearing previous trees from memory (sample n trees = 0); and starting a new search (Search→Configure) with equal step weighting and 10^1.5 max hits. 93 trees were sampled, coloured by “When first found”, with plotting symbols depicting “Relationships” between the specified taxa.

To relate the geometry of tree space to the underlying trees, each point in tree space may be annotated according to the optimality score of its corresponding tree under a selected step weighting scheme; by the relationships between chosen taxa that are inferred by that tree; and by the search iteration in which the tree was first found by tree search (Fig. 2).

Annotating trees by the iteration in which they were first found allows a user to evaluate whether a continuation of tree search is likely to yield more optimal trees. For example, if the retained trees were only recently found, the search may not yet have located a global optimum. Alternatively, if certain regions of tree space are visited only by a single ratchet iteration, it is possible that further isolated ‘islands’ (Bastert et al. 2002) remain to be found; continuing tree search until subsequent ratchet iterations no longer locate new clusters of trees will reduce the chance that optimal regions of tree space remain unvisited.
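The same reasoning can be applied at the command line to the `firstHit` attribute reported for the example search above. The base R sketch below re-enters those counts by hand; the two summaries are illustrative heuristics rather than package functionality.

```r
# Number of trees first found at each stage, copied from the output of
# attr(trees, "firstHit") in the example search
firstHit <- c(seed = 0, start = 29, ratch1 = 4, ratch2 = 0, ratch3 = 10,
              ratch4 = 7, ratch5 = 2, ratch6 = 0, ratch7 = 0, final = 0)

# Cumulative proportion of retained trees discovered by the end of each stage
cumulative <- cumsum(firstHit) / sum(firstHit)

# Number of stages completed since a new tree was last found
lastNovel <- max(which(firstHit > 0))
stagesWithoutNovelty <- length(firstHit) - lastNovel
```

In this search, all 52 retained trees had been found by the fifth ratchet iteration, and the three subsequent stages located no new trees, which is consistent with, but does not prove, adequate sampling of optimal trees.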

As the identification of clusters from mappings of tree space can be misleading (Smith 2022a), TreeSearch identifies clusters of trees from tree-to-tree distances using K-means++ clustering, partitioning around medoids and hierarchical clustering with minimax linkage (Hartigan and Wong 1979; Murtagh 1983; Arthur and Vassilvitskii 2007; Bien and Tibshirani 2011; Maechler et al. 2019). Clusterings are evaluated using the silhouette coefficient, a measure of the extent of overlap between clusters (Kaufman and Rousseeuw 1990). The clustering with the highest silhouette coefficient is depicted if the silhouette coefficient exceeds a user-specified threshold; the interpretation of the chosen threshold according to Kaufman and Rousseeuw (1990) is displayed to the user. Plotting a separate consensus tree for each cluster often reveals phylogenetic information that is concealed by polytomies in the single ‘plenary’ consensus of all optimal trees (Stockham et al. 2002).
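The selection of a clustering by silhouette coefficient can be sketched in base R. For brevity, this sketch substitutes complete-linkage hierarchical clustering (`hclust()`) for the K-means++, partitioning around medoids and minimax linkage methods named above, and computes the mean silhouette width with a hypothetical helper, `MeanSilhouette()`. The distances are made up, and the threshold of 0.5 is an illustrative choice.

```r
# Hypothetical helper: mean silhouette width of a clustering, from distances
MeanSilhouette <- function(d, cluster) {
  d <- as.matrix(d)
  widths <- vapply(seq_len(nrow(d)), function(i) {
    own <- cluster == cluster[i]
    if (sum(own) == 1) return(0)           # convention for singleton clusters
    a <- sum(d[i, own]) / (sum(own) - 1)   # mean distance within own cluster
    others <- setdiff(unique(cluster), cluster[i])
    b <- min(vapply(others, function(cl) mean(d[i, cluster == cl]),
                    numeric(1)))           # mean distance to nearest cluster
    (b - a) / max(a, b)
  }, numeric(1))
  mean(widths)
}

# Made-up tree-to-tree distances: two tight groups of three trees
positions <- c(0, 0.1, 0.2, 5, 5.1, 5.2)
d <- dist(positions)

# Cluster hierarchically, then evaluate partitions into 2, 3 and 4 clusters
hc <- hclust(d, method = "complete")
silhouettes <- vapply(2:4, function(k) MeanSilhouette(d, cutree(hc, k)),
                      numeric(1))
names(silhouettes) <- 2:4

# Retain the best clustering only if it exceeds the chosen threshold
bestK <- as.integer(names(which.max(silhouettes)))
wellSupported <- max(silhouettes) > 0.5
```

With these distances, the two-cluster partition has by far the highest silhouette coefficient, so a separate consensus tree would be plotted for each of the two groups.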

Plenary consensus trees can also lack resolution because of wildcard or ‘rogue’ taxa, whose unsettled phylogenetic position results from conflict or ambiguity in their character codings (Wilkinson 1994, 2003; Kearney 2002). TreeSearch detects rogue taxa using a heuristic approach (Smith 2022b) that seeks to maximize the phylogenetic information content (sensu Thorley et al. 1998) of a consensus tree created after removing rogue taxa from input trees. The position of an excluded taxon is portrayed by shading each edge or node of the consensus according to the number of times the specified taxon occurs at that position on an underlying tree [Fig. 3; after Klopfstein and Spasojevic (2019)], equivalent to the ‘branch attachment frequency’ of Phyutility (Smith and Dunn 2008).

Identifying taxa with an unstable position, and splits with low support, can help an investigator to critically re-examine character codings; to this end, each edge of the resulting consensus can be annotated with the frequency of the split amongst the tree set, or with a concordance factor (Minh et al. 2020) denoting the strength of support from the underlying dataset.

Figure 3: Reduced consensus of 48 cladograms generated by analysis of data from Wills et al. (2012) under different parsimony methods by Brazeau et al. (2019), as displayed in the TreeSearch graphical user interface. Removal of taxa reveals strong support for relationships that would otherwise be masked by rogues such as Palaeoscolex, whose position in optimal trees is marked by the highlighted edges. The GUI state can be reproduced by selecting the options displayed in the figure.

Dataset review

Ultimately, the quality of a dataset plays a central role in determining the reliability of phylogenetic results, with changes to a relatively small number of character codings potentially exerting an outsized impact on reconstructed topologies (Goloboff and Sereno 2021). Nevertheless, dataset quality does not always receive commensurate attention (Simões et al. 2017). One step towards improving the rigour of morphological datasets is to annotate each cell in a dataset with an explicit justification for each taxon’s coding (Sereno 2009), which can be accomplished in Nexus-formatted data files (Maddison et al. 1997) using software such as MorphoBank (O’Leary and Kaufman 2011).

TreeSearch presents such annotations alongside a reconstruction of each character’s states on a specified tree, with inapplicable states mapped according to the algorithm of Brazeau et al. (2019). Neomorphic (presence/absence) and transformational characters (Sereno 2007) are distinguished by reserving the token 0 to solely denote the absence of a neomorphic character, with tokens 1 … n used to denote the \(n\) states of a transformational character (Brazeau et al. 2019). In order to identify character codings that contribute to taxon instability, each leaf is coloured according to its mean contribution to tree length for the visualized character (Pol and Escapa 2009).

This visualization of reconstructed character transitions can help to identify cases where the formulation of characters has unintended consequences (Wilkinson 1995; Brazeau 2011); where inapplicable states have been inconsistently applied (Brazeau et al. 2019); where taphonomic absence is wrongly coded as biological absence (Donoghue and Purnell 2009); where previous datasets are uncritically recycled (Jenner 2001); or where taxa are coded with more confidence than a critical evaluation of available evidence can truly support. Insofar as the optimal tree and the underlying characters are reciprocally illuminating (Mooi and Gill 2016), successive cycles of phylogenetic analysis and character re-formulation can improve the integrity of morphological datasets, and thus increase their capacity to yield meaningful phylogenetic results (Hennig 1966).

3 Availability

TreeSearch can be installed through the Comprehensive R Archive Network (CRAN) using install.packages("TreeSearch"); the graphical user interface is launched with the command TreeSearch::EasyTrees(). The package has been tested on Windows 10, Mac OS X 10 and Ubuntu 20, and requires only packages available from the CRAN repository. Source code is available at https://github.com/ms609/TreeSearch/, and is permanently archived at Zenodo (https://dx.doi.org/10.5281/zenodo.1042590). Online documentation is available at https://ms609.github.io/TreeSearch/.

4 Acknowledgements

I thank Alavya Dhungana and Joe Moysiuk for feedback on preliminary versions of the software, and Martin Brazeau and anonymous referees for comments on the manuscript. Functionality in TreeSearch employs the underlying R packages ape (Paradis and Schliep 2019), phangorn (Schliep 2011), Quartet (Sand et al. 2014; Smith 2019b), Rogue (Smith 2022b), shiny (Chang et al. 2021), shinyjs (Attali 2020), TreeDist (Smith 2020b), and TreeTools (Smith 2019c). Icons from R used under GPL-3; Font Awesome, CC-BY-4.0.

Supplementary materials

Supplementary materials are available in addition to this article. They can be downloaded at RJ-2023-019.zip

CRAN packages used

TreeSearch, phangorn, ape, Quartet, Rogue, shiny, shinyjs, TreeDist, TreeTools

CRAN Task Views implied by cited packages

Environmetrics, Phylogenetics, WebTechnologies

J. S. Arias and D. R. Miranda-Esquivel. Profile parsimony (PP): An analysis under implied weights (IW). Cladistics, 20(1): 56–63, 2004. DOI 10.1111/j.1096-0031.2003.00001.x.

D. Arthur and S. Vassilvitskii. K-means++: The advantages of careful seeding. In Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, pages 1027–1035, 2007. USA: Society for Industrial and Applied Mathematics. URL https://dl.acm.org/doi/10.5555/1283383.1283494.

R. J. Asher and M. R. Smith. Phylogenetic signal and bias in paleontology. Systematic Biology, 71(4): 986–1008, 2022. DOI 10.1093/sysbio/syab072.

D. Attali. Shinyjs: Easily improve the user experience of your shiny apps in seconds. 2020. URL https://CRAN.R-project.org/package=shinyjs. R package version 2.0.0.

O. Bastert, D. Rockmore, P. F. Stadler and G. Tinhofer. Landscapes on spaces of trees. Applied Mathematics and Computation, 131(2-3): 439–459, 2002. DOI 10.1016/S0096-3003(01)00164-3.

J. Bien and R. Tibshirani. Hierarchical clustering with prototypes via minimax linkage. Journal of the American Statistical Association, 106(495): 1075–1084, 2011. DOI 10.1198/jasa.2011.tm10183.

M. D. Brazeau. Problematic character coding methods in morphology and their effects. Biological Journal of the Linnean Society, 104(3): 489–498, 2011. DOI 10.1111/j.1095-8312.2011.01755.x.

M. D. Brazeau, T. Guillerme and M. R. Smith. An algorithm for morphological phylogenetic analysis with inapplicable data. Systematic Biology, 68(4): 619–631, 2019. DOI 10.1093/sysbio/syy083.

M. D. Brazeau, M. R. Smith and T. Guillerme. MorphyLib: A library for phylogenetic analysis of categorical trait data with inapplicability. 2017. DOI 10.5281/zenodo.815372.

J. W. Brown, C. Parins-Fukuchi, G. W. Stull, O. M. Vargas and S. A. Smith. Bayesian and likelihood phylogenetic reconstructions of morphological traits are not discordant when taking uncertainty into consideration: A comment on Puttick et al. Proceedings of the Royal Society B: Biological Sciences, 284(1864): 20170986, 2017. DOI 10.1098/rspb.2017.0986.

M. Carter, M. Hendy, D. Penny, L. A. Székely and N. C. Wormald. On the distribution of lengths of evolutionary trees. SIAM Journal on Discrete Mathematics, 3(1): 38–47, 1990. DOI 10.1137/0403005.

W. Chang, J. Cheng, J. J. Allaire, C. Sievert, B. Schloerke, Y.-H. Xie, J. Allen, J. McPherson, A. Dipert and B. Borges. Shiny: Web application framework for R. 2021. URL https://CRAN.R-project.org/package=shiny. R package version 1.7.1.

J. E. De Laet. Parsimony and the problem of inapplicables in sequence data. Parsimony, Phylogeny, and Genomics, 81–116, 2005. DOI 10.1093/acprof:oso/9780199297306.003.0006.

P. C. J. Donoghue and M. A. Purnell. Distinguishing heat from light in debate over controversial fossils. BioEssays, 31(2): 178–189, 2009. DOI 10.1002/bies.200800128.

G. F. Estabrook, F. R. McMorris and C. A. Meacham. Comparison of undirected phylogenetic trees based on subtrees of four evolutionary units. Systematic Zoology, 34(2): 193–200, 1985. DOI 10.2307/sysbio/34.2.193.

D. P. Faith and J. W. H. Trueman. Towards an inclusive philosophy for phylogenetic inference. Systematic Biology, 50(3): 331–350, 2001. DOI 10.1080/10635150118627.

W. M. Fitch. Toward defining the course of evolution: Minimum change for a specific tree topology. Systematic Biology, 20(4): 406–416, 1971. DOI 10.1093/sysbio/20.4.406.

A. Goeffon, J.-M. Richer and J.-K. Hao. Progressive Tree Neighborhood applied to the Maximum Parsimony problem. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 5(1): 136–145, 2008. DOI 10.1109/TCBB.2007.1065.

P. A. Goloboff. Estimating character weights during tree search. Cladistics, 9(1): 83–91, 1993. DOI 10.1111/j.1096-0031.1993.tb00209.x.

P. A. Goloboff. Extended implied weighting. Cladistics, 30(3): 260–272, 2014. DOI 10.1111/cla.12047.

P. A. Goloboff, J. M. Carpenter, J. S. Arias and D. R. Miranda-Esquivel. Weighting against homoplasy improves phylogenetic analysis of morphological data sets. Cladistics, 24(5): 758–773, 2008. DOI 10.1111/j.1096-0031.2008.00209.x.

P. A. Goloboff, J. De Laet, D. Ríos-Tamayo and C. A. Szumik. A reconsideration of inapplicable characters, and an approximation with step-matrix recoding. Cladistics, 37(5): 596–629, 2021. DOI 10.1111/cla.12456.

P. A. Goloboff and P. C. Sereno. Comparative cladistics: Identifying the sources for differing phylogenetic results between competing morphology-based datasets. Journal of Systematic Palaeontology, 1–26, 2021. DOI 10.1080/14772019.2021.1970038.

P. A. Goloboff, A. Torres and J. S. Arias. Weighted parsimony outperforms other methods of phylogenetic inference under models appropriate for morphology. Cladistics, 34(4): 407–437, 2018a. DOI 10.1111/cla.12205.

P. A. Goloboff, A. Torres Galvis and J. S. Arias. Parsimony and model-based phylogenetic methods for morphological data: Comments on O’Reilly et al. Palaeontology, 61(4): 625–630, 2018b. DOI 10.1111/pala.12353.

J. C. Gower. Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika, 53(3/4): 325–338, 1966. DOI 10.2307/2333639.

J. C. Gower and G. J. S. Ross. Minimum spanning trees and single linkage cluster analysis. Journal of the Royal Statistical Society. Series C (Applied Statistics), 18(1): 54–64, 1969. DOI 10.2307/2346439.

P. Guenser, R. C. M. Warnock, W. Pett, P. C. J. Donoghue and E. Jarochowska. Does time matter in phylogeny? A perspective from the fossil record. bioRxiv, 2021. DOI 10.1101/2021.06.11.445746.

J. A. Hartigan and M. A. Wong. Algorithm AS 136: A K-means clustering algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), 28(1): 100–108, 1979. DOI 10.2307/2346830.

W. Hennig. Phylogenetic systematics. Urbana: The University of Illinois Press, 1966.

D. M. Hillis, T. A. Heath and K. St. John. Analysis and visualization of tree space. Systematic Biology, 54(3): 471–482, 2005. DOI 10.1080/10635150590946961.

M. J. Hopkins and K. St. John. Incorporating hierarchical characters into phylogenetic analysis. Systematic Biology, 70(6): 1163–1180, 2021. DOI 10.1093/sysbio/syab005.

R. A. Jenner. Bilaterian phylogeny and uncritical recycling of morphological data sets. Systematic Biology, 50(5): 730–742, 2001. DOI 10.1080/106351501753328857.

S. Kaski, J. Nikkilä, M. Oja, J. Venna, P. Törönen and E. Castrén. Trustworthiness and metrics in visualizing similarity of gene expression. BMC Bioinformatics, 4: 48, 2003. DOI 10.1186/1471-2105-4-48.

L. Kaufman and P. J. Rousseeuw. Partitioning around medoids (Program PAM). In Finding groups in data: An introduction to cluster analysis, pages 68–125, 1990. John Wiley & Sons, Ltd.

M. Kearney. Fragmentary taxa, missing data, and ambiguity: Mistaken assumptions and conclusions. Systematic Biology, 51(2): 369–381, 2002. DOI 10.1080/10635150252899824.

S. Klopfstein and T. Spasojevic. Illustrating phylogenetic placement of fossils using RoguePlots: An example from ichneumonid parasitoid wasps (Hymenoptera, Ichneumonidae) and an extensive morphological matrix. PLoS ONE, 14(4): e0212942, 2019. DOI 10.1371/journal.pone.0212942.

N. M. Koch and L. A. Parry. Death is on our side: Paleontological data drastically modify phylogenetic hypotheses. Systematic Biology, 69(6): 1052–1067, 2020. DOI 10.1093/sysbio/syaa023.

D. R. Maddison, D. L. Swofford and W. P. Maddison. NEXUS: An extensible file format for systematic information. Systematic Biology, 46(4): 590–621, 1997. DOI 10.1093/sysbio/46.4.590.

W. P. Maddison. Missing data versus missing characters in phylogenetic analysis. Systematic Biology, 42(4): 576–581, 1993. DOI 10.1093/sysbio/42.4.576.

W. P. Maddison and M. Slatkin. Null models for the number of evolutionary steps in a character on a phylogenetic tree. Evolution, 45(5): 1184–1197, 1991. DOI 10.1111/j.1558-5646.1991.tb04385.x.

M. Maechler, P. Rousseeuw, A. Struyf, M. Hubert and K. Hornik. Cluster: Cluster analysis basics and extensions. 2019. R package version 2.1.0.

B. Q. Minh, M. W. Hahn and R. Lanfear. New methods to calculate concordance factors for phylogenomic datasets. Molecular Biology and Evolution, 37(9): 2727–2733, 2020. DOI 10.1093/molbev/msaa106.

R. D. Mooi and A. C. Gill. Hennig’s auxiliary principle and reciprocal illumination revisited. In The Future of Phylogenetic Systematics, Eds D. Williams, M. Schmitt and Q. Wheeler, pages 258–285, 2016. Cambridge: Cambridge University Press. DOI 10.1017/CBO9781316338797.013.

F. Murtagh. A survey of recent advances in hierarchical clustering algorithms. The Computer Journal, 26(4): 354–359, 1983. DOI 10.1093/comjnl/26.4.354.

K. C. Nixon. The Parsimony Ratchet, a new method for rapid parsimony analysis. Cladistics, 15(4): 407–414, 1999. DOI 10.1111/j.1096-0031.1999.tb00277.x.

M. A. O’Leary and S. Kaufman. MorphoBank: Phylophenomics in the “cloud.” Cladistics, 27(5): 529–537, 2011. DOI 10.1111/j.1096-0031.2011.00355.x.

J. E. O’Reilly, M. N. Puttick, L. Parry, A. R. Tanner, J. E. Tarver, J. Fleming, D. Pisani and P. C. J. Donoghue. Bayesian methods outperform parsimony but at the expense of precision in the estimation of phylogeny from discrete morphological data. Biology Letters, 12(4): 20160081, 2016. DOI 10.1098/rsbl.2016.0081.

E. Paradis and K. Schliep. Ape 5.0: An environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics, 35(3): 526–528, 2019. DOI 10.1093/bioinformatics/bty633.

D. Pol and I. H. Escapa. Unstable taxa in cladistic analysis: Identification and the assessment of relevant characters. Cladistics, 25(5): 515–527, 2009. DOI 10.1111/j.1096-0031.2009.00258.x.

M. N. Puttick, J. E. O’Reilly, A. R. Tanner, J. F. Fleming, J. Clark, L. Holloway, J. Lozano-Fernandez, L. A. Parry, J. E. Tarver, D. Pisani, et al. Uncertain-tree: Discriminating among competing approaches to the phylogenetic analysis of phenotype data. Proceedings of the Royal Society B: Biological Sciences, 284(1846): 20162290, 2017. DOI 10.1098/rspb.2016.2290.

D. F. Robinson and L. R. Foulds. Comparison of phylogenetic trees. Mathematical Biosciences, 53(1-2): 131–147, 1981. DOI 10.1016/0025-5564(81)90043-2.

A. Sand, M. K. Holt, J. Johansen, G. S. Brodal, T. Mailund and C. N. S. Pedersen. tqDist: A library for computing the quartet and triplet distances between binary or general trees. Bioinformatics, 30(14): 2079–2080, 2014. DOI 10.1093/bioinformatics/btu157.

R. S. Sansom, P. G. Choate, J. N. Keating and E. Randle. Parsimony, not Bayesian analysis, recovers more stratigraphically congruent phylogenetic trees. Biology Letters, 14(6): 20180263, 2018. DOI 10.1098/rsbl.2018.0263.

K. P. Schliep. phangorn: Phylogenetic analysis in R. Bioinformatics, 27(4): 592–593, 2011. DOI 10.1093/bioinformatics/btq706.

P. C. Sereno. Comparative cladistics. Cladistics, 25(6): 624–659, 2009. DOI 10.1111/j.1096-0031.2009.00265.x.

P. C. Sereno. Logical basis for morphological characters in phylogenetics. Cladistics, 23(6): 565–587, 2007. DOI 10.1111/j.1096-0031.2007.00161.x.

T. R. Simões, M. W. Caldwell, A. Palci and R. L. Nydam. Giant taxon-character matrices: Quality of character constructions remains critical regardless of size. Cladistics, 33(2): 198–219, 2017. DOI 10.1111/cla.12163.

M. R. Smith. Bayesian and parsimony approaches reconstruct informative trees from simulated morphological datasets. Biology Letters, 15(2): 20180632, 2019a. DOI 10.1098/rsbl.2018.0632.

M. R. Smith. Information theoretic Generalized Robinson–Foulds metrics for comparing phylogenetic trees. Bioinformatics, 36(20): 5007–5013, 2020a. DOI 10.1093/bioinformatics/btaa614.

M. R. Smith. Quartet: Comparison of phylogenetic trees using quartet and bipartition measures. Comprehensive R Archive Network, 2019b. DOI 10.5281/zenodo.2536318.

M. R. Smith. Robust analysis of phylogenetic tree space. Systematic Biology, 71(5): 1255–1270, 2022a. DOI 10.1093/sysbio/syab100.

M. R. Smith. TreeDist: Calculate and map distances between phylogenetic trees. Comprehensive R Archive Network, 2020b. DOI 10.5281/zenodo.3528123.

M. R. Smith. TreeTools: Create, modify and analyse phylogenetic trees. Comprehensive R Archive Network, 2019c. DOI 10.5281/zenodo.3522725.

M. R. Smith. Using information theory to detect rogue taxa and improve consensus trees. Systematic Biology, 71(5): 1088–1094, 2022b. DOI 10.1093/sysbio/syab099.

S. A. Smith and C. W. Dunn. Phyutility: A phyloinformatics tool for trees, alignments and molecular data. Bioinformatics, 24(5): 715–716, 2008. DOI 10.1093/bioinformatics/btm619.

C. Stockham, L.-S. Wang and T. Warnow. Statistically based postprocessing of phylogenetic analysis by clustering. Bioinformatics, 18(Suppl 1): S285–S293, 2002. DOI 10.1093/bioinformatics/18.suppl_1.S285.

S. Tarasov. Integration of anatomy ontologies and evo-devo using structured Markov models suggests a new framework for modeling discrete phenotypic traits. Systematic Biology, 68(5): 698–716, 2019. DOI 10.1093/sysbio/syz005.

S. Tarasov. New phylogenetic Markov models for inapplicable morphological characters. bioRxiv, 2021.04.26.441495, 2022. DOI 10.1101/2021.04.26.441495.

J. L. Thorley, M. Wilkinson and M. Charleston. The information content of consensus trees. In Advances in Data Science and Classification, Eds A. Rizzi, M. Vichi and H.-H. Bock, pages 91–98, 1998. Berlin: Springer. DOI 10.1007/978-3-642-72253-0_12.

J. Venna and S. Kaski. Neighborhood preservation in nonlinear projection methods: An experimental study. In Artificial Neural Networks ICANN 2001, Eds G. Dorffner, H. Bischof and K. Hornik, pages 485–491, 2001. Berlin, Heidelberg: Springer. DOI 10.1007/3-540-44668-0_68.

J. Vinther, P. Van Roy and D. E. G. Briggs. Machaeridians are Palaeozoic armoured annelids. Nature, 451(7175): 185–188, 2008. DOI 10.1038/nature06474.

S. Whelan and D. Money. The prevalence of multifurcations in tree-space and their implications for tree-search. Molecular Biology and Evolution, 27(12): 2674–2677, 2010. DOI 10.1093/molbev/msq163.

J. J. Wiens. The role of morphological data in phylogeny reconstruction. Systematic Biology, 53(4): 653–661, 2004. DOI 10.1080/10635150490472959.

M. Wilkinson. A comparison of two methods of character construction. Cladistics, 11(3): 297–308, 1995. DOI 10.1111/j.1096-0031.1995.tb00091.x.

M. Wilkinson. Common cladistic information and its consensus representation: Reduced Adams and reduced cladistic consensus trees and profiles. Systematic Biology, 43(3): 343–368, 1994. DOI 10.2307/2413673.

M. Wilkinson. Majority-rule reduced consensus trees and their use in bootstrapping. Molecular Biology and Evolution, 13(3): 437–444, 1996. DOI 10.1093/oxfordjournals.molbev.a025604.

M. Wilkinson. Missing entries and multiple trees: Instability, relationships, and support in parsimony analysis. Journal of Vertebrate Paleontology, 23(2): 311–323, 2003. DOI 10.1671/0272-4634(2003)023[0311:MEAMTI]2.0.CO;2.

M. A. Wills, S. Gerber, M. Ruta and M. Hughes. The disparity of priapulid, archaeopriapulid and palaeoscolecid worms in the light of new data. Journal of Evolutionary Biology, 25(10): 2056–2076, 2012. DOI 10.1111/j.1420-9101.2012.02586.x.

A. H. Wortley and R. W. Scotland. The effect of combining molecular and morphological data in published phylogenetic analyses. Systematic Biology, 55(4): 677–685, 2006. DOI 10.1080/10635150600899798.

A. M. Wright and G. T. Lloyd. Bayesian analyses in phylogenetic palaeontology: Interpreting the posterior sample. Palaeontology, 63(6): 997–1006, 2020. DOI 10.1111/pala.12500.