A simple approach for maximizing the overlap of phylogenetic and comparative data

Matthew W. Pennell*, Richard G. FitzJohn, William K. Cornwell

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    33 Citations (Scopus)

    Abstract

    Biologists are increasingly using curated, public data sets to conduct phylogenetic comparative analyses. Unfortunately, there is often a mismatch between the species for which there are phylogenetic data and those for which other data are available. As a result, researchers are commonly forced either to drop species from analyses entirely or to impute the missing data. A simple strategy to improve the overlap of phylogenetic and comparative data is to swap species in the tree that lack data with ‘phylogenetically equivalent’ species that have data. While this procedure is logically straightforward, it quickly becomes very challenging to do by hand. Here, we present algorithms that use topological and taxonomic information to maximize the number of swaps without altering the structure of the phylogeny. We have implemented our method in a new R package, phyndr, which allows researchers to apply our algorithm to empirical data sets. It is efficient enough that taxon swaps can be computed quickly, even for large trees. To facilitate the use of taxonomic knowledge, we created a separate data package, taxonlookup, which contains a curated, versioned taxonomic lookup for land plants and is interoperable with phyndr. Emerging online databases and statistical advances are making it possible for researchers to investigate evolutionary questions at unprecedented scales. However, in this effort, species mismatches among data sources will increasingly be a problem; evolutionary informatics tools, such as phyndr and taxonlookup, can help alleviate this issue.
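
The swapping idea in the abstract can be illustrated with a minimal Python sketch. This is not the phyndr implementation and does not reproduce its algorithms: it works only on tip labels, treats the first part of a binomial as the genus, and assumes that every genus is monophyletic, so that one unmatched tip per genus can stand in for any congener without changing the shape of the tree. The function names `genus_of` and `propose_swaps` and the example species lists are hypothetical.

```python
from collections import defaultdict


def genus_of(species):
    """Return the genus from a binomial such as 'Quercus_alba'."""
    return species.split("_")[0]


def propose_swaps(tree_tips, data_species):
    """Map tree tips that lack data to congeners that have data.

    Simplifying assumption (ours, not the paper's): every genus is
    monophyletic, so a single unmatched tip per genus can be replaced
    by any congener without altering the topology.
    """
    tips = set(tree_tips)
    data = set(data_species)

    # Species with data that are absent from the tree, grouped by genus.
    candidates = defaultdict(list)
    for sp in sorted(data - tips):
        candidates[genus_of(sp)].append(sp)

    # Genera that already have a direct tip/data match are left alone.
    matched_genera = {genus_of(sp) for sp in tips & data}

    swaps = {}
    used_genera = set()
    for tip in sorted(tips - data):
        genus = genus_of(tip)
        if genus in matched_genera or genus in used_genera:
            continue  # at most one swap per genus in this sketch
        options = candidates.get(genus)
        if options:
            swaps[tip] = options
            used_genera.add(genus)
    return swaps


# Hypothetical example data: tips in the tree and species with trait data.
tree = ["Quercus_alba", "Quercus_rubra", "Acer_rubrum", "Pinus_strobus"]
traits = ["Quercus_robur", "Acer_rubrum", "Acer_saccharum", "Betula_nigra"]
print(propose_swaps(tree, traits))
```

In this example only `Quercus_alba` is paired with a replacement (`Quercus_robur`). `Acer_rubrum` already matches directly, so its genus is left alone, and `Pinus_strobus` has no congener with data. A real analysis would also need the tree topology to check which groups are actually monophyletic, which this sketch takes on faith.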

    Original language: English
    Pages (from-to): 751-758
    Number of pages: 8
    Journal: Methods in Ecology and Evolution
    Volume: 7
    Issue number: 6
    Publication status: Published - 1 Jun 2016

    Keywords

    • data imputation
    • evolutionary informatics
    • missing data
    • phylogenetic comparative methods
    • taxonomy
