Discordance among gene trees can have important consequences for our understanding of the evolution of key traits in organisms of ecological or economic importance. For example, the cultivation of peppers (Capsicum spp., Solanaceae) is a global, multi-billion-dollar industry, yet our understanding of the evolutionary origin, diversification, and significance of their pungency has been confounded by genes with either poor resolution or discordant evolutionary histories. The genus Capsicum includes species with extremely pungent, variable, or non-pungent fruits. The closest relatives of the peppers belong to the genus Lycianthes, which has non-pungent fruits. Our previous work, based on a concatenation of two chloroplast and two nuclear gene regions, suggests that Lycianthes is paraphyletic in relation to Capsicum, with some members of Lycianthes being more closely related to Capsicum than to other Lycianthes. This backbone phylogeny, however, has extremely short branch lengths and is only weakly supported. This weak support is due to insufficient data for resolving a rapid and recent radiation and to discordance among individual genes. Thus, with our current dataset we are limited in our ability to investigate when or where pungency originated, how it has changed through time and space, or its ecological significance.
In this study, we sequenced the nuclear and plastid transcriptomes of four species of Capsicum and Lycianthes that span the major clades from our four-gene phylogeny. Our primary objectives were to identify genome-scale patterns of discordance among gene trees, to determine the underlying causes of this discordance, and to estimate a representative species tree. We developed a bioinformatic pipeline to identify homologous gene regions shared among our new transcriptome assemblies, pull data from genomes of additional species from online repositories, align the genes, filter the alignments based on preset quality-control thresholds, identify the best-fit model of molecular evolution for each alignment, construct phylogenies under Bayesian inference, and compare the resulting topologies. We then mapped the genes onto the 12 chromosomes of a complete Capsicum annuum genome to visualize the genomic structure of gene history and to test whether the physical proximity of two genes on a chromosome predicts their level of discordance. We constructed species trees using both Bayesian concordance and coalescent methods.
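The final pipeline step, comparing the resulting topologies, can be illustrated with a minimal sketch. It is not the pipeline used in this study. It assumes each gene tree is available as a rooted Newick string, and the function names (parse_newick, canonical, tally_topologies) and taxon labels are hypothetical. Each tree is reduced to a canonical string so that trees differing only in branch lengths, support values, or child order count as the same topology.

```python
import re
from collections import Counter


def parse_newick(newick):
    """Parse a rooted Newick string into nested tuples of leaf names (topology only)."""
    # Drop branch lengths such as ":0.12"; support values are skipped during parsing.
    s = re.sub(r":[^,();]+", "", newick.strip().rstrip(";"))
    pos = 0

    def node():
        nonlocal pos
        if s[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while s[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            pos += 1  # consume ")"
            # Skip any internal label or support value that follows ")".
            while pos < len(s) and s[pos] not in ",()":
                pos += 1
            return tuple(children)
        start = pos
        while pos < len(s) and s[pos] not in ",()":
            pos += 1
        return s[start:pos]

    return node()


def canonical(tree):
    """Return a child-order-independent string for a rooted topology."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(sorted(canonical(child) for child in tree)) + ")"


def tally_topologies(newicks):
    """Count how many gene trees share each distinct rooted topology."""
    return Counter(canonical(parse_newick(n)) for n in newicks)


# Hypothetical gene trees for three placeholder taxa A, B, and C.
gene_trees = [
    "((A,B),C);",
    "(C,(B,A));",
    "((A:0.1,C:0.2)0.9:0.3,B:0.1);",
]
for topology, n_genes in tally_topologies(gene_trees).most_common():
    print(topology, n_genes)
```

Because the canonical form sorts children at every node, this sketch treats trees as rooted; unrooted gene trees would first need to be rooted on a shared outgroup.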
We found 24 distinct topologies among the eight species and 890 nuclear genes included in our analysis, with each topology supported by a minimum of six genes. Nearly 300 genes supported a topology that is identical to our previous four-gene tree, which placed Lycianthes biflora sister to a clade of Lycianthes and a monophyletic Capsicum. Over 100 nuclear genes placed Lycianthes biflora as sister to Capsicum but not the remaining Lycianthes, while the plastid genome and 82 nuclear genes supported reciprocally monophyletic genera. All three possible topologies among the three pungent peppers included in this study were supported to varying degrees, the most common of which was recovered by 539 genes. Both Bayesian concordance and coalescent approaches support the same species tree, placing Lycianthes biflora sister to all remaining taxa. We found no apparent pattern between the physical position of genes on the Capsicum annuum genome and their evolutionary history, as the 24 topologies were supported by genes scattered within and among the pepper’s 12 chromosomes. Within each chromosome, we found no significant relationship between the physical distance separating two genes and their level of discordance. To further explore these issues, we have used our newly sequenced transcriptomes to design baits to target over 2401 genes, which we are currently using to expand our dataset to include all members of Capsicum and Lycianthes.
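One way to test whether physical distance predicts discordance within a chromosome is a Mantel-style permutation test, sketched below. The abstract does not state which test was used, so this is only an illustration under stated assumptions: discordance between two genes is scored as 0 if they support the same topology and 1 otherwise, gene positions are in base pairs, and the function names and example data are hypothetical.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def distance_vs_discordance(genes, n_perm=999, seed=0):
    """Permutation test for one chromosome.

    genes: list of (position_bp, topology_label) tuples.
    Returns (observed correlation, two-sided permutation p-value).
    """
    positions = [pos for pos, _ in genes]
    labels = [label for _, label in genes]
    pairs = list(combinations(range(len(genes)), 2))
    distances = [abs(positions[i] - positions[j]) for i, j in pairs]

    def discordance(lbls):
        # Assumed score: 0 if two genes share a topology, 1 otherwise.
        return [0.0 if lbls[i] == lbls[j] else 1.0 for i, j in pairs]

    observed = pearson(distances, discordance(labels))
    rng = random.Random(seed)
    shuffled = labels[:]
    hits = 0
    for _ in range(n_perm):
        # Shuffle topology labels among genes, keeping positions fixed.
        rng.shuffle(shuffled)
        if abs(pearson(distances, discordance(shuffled))) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical chromosome: two physical clusters, each with its own topology.
example = [(100, "T1"), (200, "T1"), (300, "T1"),
           (10000, "T2"), (10100, "T2"), (10200, "T2")]
r, p = distance_vs_discordance(example)
print(r, p)
```

Shuffling labels among genes, rather than shuffling the pairwise values directly, preserves the dependence among pairs that share a gene. A graded discordance score, such as a tree distance, could replace the 0/1 score without changing the test.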