Sidestepping genealogical discordance:
allele sharing as a basis for species delimitation
Jean-François Flot
Species delimitation aims to delineate the “elementary particles” of biodiversity, i.e., species, and as such is a prerequisite to studies in domains as diverse as physiology, ecology and population genetics. However, although there is now a general consensus that DNA-based approaches to species delimitation are often more reliable than morphology-based ones (as DNA, unlike morphology, does not display plasticity under the influence of the environment), there remains profound disagreement about how to delineate species, and even about how best to define them.
Three main types of approaches to species delimitation have been proposed. Distance-based approaches, such as DNA barcoding and ABGD (Automatic Barcode Gap Discovery), rely on the postulate that genetic distances within species are smaller than interspecific distances; such approaches are the most frequently used to delineate microbial species, whether from complete genome sequences or from one or several representative genome regions (called “markers”). Tree-based approaches, such as generalized mixed Yule-coalescent (GMYC) models or Poisson tree processes (PTP), attempt to distinguish intraspecific vs. interspecific regions of phylogenetic trees under the assumption that branching rates are markedly higher within species-level clades than between them. Tree-based approaches are presently the most popular among botanists and zoologists because they are best suited to analyzing the large datasets amassed since 2003 for fast-evolving regions of the mitochondrial (for animals) or chloroplast (for plants) genomes, thanks to the availability of near-universal PCR primers targeting these regions. Finally, a third type of approach to species delimitation uses allele sharing as a proxy for gene flow between individuals, and considers groups of individuals that share a common set of alleles (“gene pool”) as conspecific. These allele sharing-based approaches are closely related to the classical biological species criterion of interfecundity and were therefore initially thought to apply only to sexually recombining species; however, there is growing evidence that allele sharing is also informative about species boundaries for organisms that do not perform meiotic sex but exchange genes horizontally, such as bacteria and bdelloid rotifers.
One additional advantage of allele sharing-based approaches to species delimitation is that they do not assume concordant genealogies among conspecific individuals: notably, they are not hindered by the species-level non-monophyly in gene trees that often results from incomplete lineage sorting or introgressive hybridization, nor do they assume that intraspecific distances are smaller than interspecific distances. Last, but not least, allele sharing-based approaches are very fast computationally (as they do not require all pairwise comparisons among sequences in a dataset to be computed and do not involve phylogenetic tree reconstruction). Hence, allele sharing has the potential to yield a one-size-fits-all, universal approach to species delimitation, bringing together the presently very disparate taxonomic practices of microbiologists, botanists and zoologists.
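
As an illustration of the idea of grouping individuals into gene pools by allele sharing, here is a minimal sketch in Python. It is not the author's implementation: the function name `gene_pools`, the single-marker input format and the sample genotypes are all hypothetical, and alleles are assumed to have already been reduced to labels (identical sequences receive the same label). Individuals are merged with the alleles they carry using a union-find structure, so the grouping needs neither pairwise sequence comparisons nor a phylogenetic tree.

```python
from collections import defaultdict


def gene_pools(genotypes):
    """Group individuals that are connected through shared alleles.

    genotypes: dict mapping an individual's name to an iterable of allele
    labels observed at one marker (hypothetical input format).
    Returns a sorted list of sorted lists of individual names.
    """
    parent = {}

    def find(node):
        # Follow parent links to the root, shortening the path on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every individual to each allele it carries; two individuals end up
    # in the same set as soon as a chain of shared alleles connects them.
    for individual, alleles in genotypes.items():
        ind_node = ("individual", individual)
        parent.setdefault(ind_node, ind_node)
        for allele in alleles:
            allele_node = ("allele", allele)
            parent.setdefault(allele_node, allele_node)
            union(ind_node, allele_node)

    # Collect individuals by the root of the set they belong to.
    pools = defaultdict(set)
    for individual in genotypes:
        pools[find(("individual", individual))].add(individual)
    return sorted(sorted(pool) for pool in pools.values())


# Hypothetical diploid genotypes at a single marker (made-up data).
example = {
    "ind1": ("A", "B"),
    "ind2": ("B", "C"),
    "ind3": ("C", "C"),
    "ind4": ("D", "E"),
    "ind5": ("E", "E"),
}
print(gene_pools(example))
```

In this made-up example, ind1 and ind3 share no allele directly but fall into the same putative gene pool because ind2 links them, whereas ind4 and ind5 form a separate pool. The running time grows roughly linearly with the number of individual-allele observations, which is one way to picture why such approaches can be fast.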