Our data set of 1,698 protein-encoding loci contained putative orthologous sequences from every one of the genomes of the 12 chosen mammal species plus the chicken and frog outgroups. The detection of orthologs among such a wide range of vertebrate taxa suggests that this data set represents relatively slowly evolving DNA capable of revealing the ancient branching order of Placentalia. We found that Afrotheria and Xenarthra form a clade (i.e., Atlantogenata) that is sister to a clade comprising Euarchontoglires and Laurasiatheria (i.e., Boreoeutheria) when MP, ML, or Bayesian approaches are used. The NJ approach also depicts an Atlantogenatan clade whether amino acid or nucleotide sequences are used to infer the tree; however, the NJ tree based on nucleotide sequences is alone in depicting a basal separation of Rodentia (murids) from all other placental mammals. The differences between this nucleotide NJ tree topology and both the MP/ML/Bayesian tree topology and the amino acid NJ tree topology suggest methodological failure of at least one of these methods. Under either the parsimony or the likelihood criterion, the topology tests we conducted strongly reject the nucleotide NJ tree. The high bootstrap support for the MP/ML/Bayesian tree (Fig. 2) and also for the NJ nucleotide tree (Fig. 3) further indicates that one or the other of the tree-reconstruction approaches is inappropriate for the data and produces an incorrect tree because of systematic error (22–26).
The main biases that can cause systematic error in tree reconstruction are nucleotide compositional bias, long-branch attraction, and heterotachy (22, 24). Compositional bias has been shown to affect phylogeny reconstruction such that subsets of unrelated species that have converged on similar nucleotide compositions are grouped together erroneously. In a phylogenomic study of yeast orthologs (27), compositional bias was shown to lead to inconsistency in distance methods but not in ML. In the present data set, compositional bias does not appear to be a problem, because the nucleotide compositions are similar among the taxa sampled. Especially noteworthy is that this compositional similarity is manifested at each of the three codon positions (SI Table 3). Nonetheless, we removed third codon positions from the parsimony analysis, because third positions are more likely to be the source of homoplastic substitutions that can cause long-branch attraction.
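The two operations described above, comparing nucleotide composition at each codon position and discarding third codon positions, can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function names, the toy sequences, and the taxon labels are hypothetical, and the sketch assumes in-frame coding sequences that begin at the first position of a codon.

```python
from collections import Counter

BASES = "ACGT"

def composition_by_codon_position(seq):
    """Return base frequencies at codon positions 1, 2, and 3.

    Assumes `seq` is an in-frame coding sequence. Characters other than
    A, C, G, T (e.g., gaps or ambiguity codes) are ignored.
    """
    seq = seq.upper()
    frequencies = []
    for offset in range(3):
        # Every third character starting at `offset` shares a codon position.
        observed = [b for b in seq[offset::3] if b in BASES]
        counts = Counter(observed)
        total = len(observed)
        frequencies.append(
            {b: (counts[b] / total if total else 0.0) for b in BASES}
        )
    return frequencies

def strip_third_positions(seq):
    """Keep only first and second codon positions of an in-frame sequence."""
    return "".join(seq[i:i + 2] for i in range(0, len(seq), 3))

# Hypothetical toy alignment; real analyses would use full orthologous loci.
toy_alignment = {
    "taxon_A": "ATGGCCTAA",
    "taxon_B": "ATGGCTTAG",
}

for name, sequence in toy_alignment.items():
    per_position = composition_by_codon_position(sequence)
    gc_third = per_position[2]["G"] + per_position[2]["C"]
    print(name, "GC at third positions:", round(gc_third, 3),
          "first+second only:", strip_third_positions(sequence))
```

Large differences among taxa in the per-position frequencies would flag possible compositional bias; similar values, as reported for this data set, would not.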
Long-branch attraction is a classic phylogenetic problem in which long branches are incorrectly united in a clade (28). It is a particular problem in MP, which fails to correct for parallel changes on long branches (25, 26). Long-branch attraction can cause systematic error in all methods used in this study, but there is evidence that adding more taxa can break up long branches, reducing the probability of error (29). Interestingly, an NJ analysis of the Murphy et al. (9) data set composed of 44 mammalian taxa detected, as in the original study, an Afrotherian clade as sister to all other placental mammals (SI Text), rather than the rodent-first separation detected by the nucleotide NJ analysis (all codon positions) performed in this study. The tree topology also shows a monophyletic Glires and a monophyletic Euarchontoglires. This NJ analysis of the Murphy data set demonstrates that the long-branch attraction effect is drastically reduced by denser taxon sampling (which breaks up the rodent long branch). Indeed, if the NJ analysis of the Murphy data set includes only the taxa represented in this study, a "rodent first" topology is once again recovered. The present study sampled more taxa than other recent mammalian phylogenomic studies (4, 6) and is therefore less likely to be affected by the long-branch attraction problem. This point is borne out by the finding that our data set recovered topologies identical to those reported by Huttley et al. (6) and Cannarozzi et al. (4) when we limited our data to include only the taxa in those studies (see SI Text). Furthermore, the parsimony nucleotide results were obtained by using only first and second codon positions, which are less subject to the parallel changes that contribute to the long-branch attraction problem, and the topology obtained was identical to that recovered when MP was applied to the translated amino acid sequences (which also reduces the likelihood of erroneously identifying homoplasies as synapomorphies).
Variation in substitution rate at a single base or amino acid position over evolutionary time is referred to as heterotachy and can result in phylogenetic artifacts (30, 31). Errors resulting from heterotachy are difficult to detect with the methods we used; however, it has been shown that MP methods are sometimes less sensitive to heterotachy than are the probabilistic ML and Bayesian techniques (19), and in this study, results were congruent among these three methods. Nevertheless, further investigation of the cladistic relationships among mammals is now possible, because more mammalian genomes are available (e.g., platypus, cat, horse, bat, galago, treeshrew, and guinea pig). The availability of these genomes will allow further testing of the Atlantogenata/Boreoeutheria split and, within Boreoeutheria, of the Glires hypothesis depicted in the majority of analyses in this study.
In our view, the weight of evidence now points to a sister group relationship between Atlantogenata and Boreoeutheria, and a clear scenario of biogeographic diversification emerges (Fig. 4). In this scenario, the placental mammals would have been subdivided into two lineages when the spreading Tethyan seaway widely separated Gondwana in the south from Laurasia in the north during the Cretaceous (32, 33). This process divided the initial members of the clade Boreoeutheria in the north from their southern atlantogenatan counterparts. Later in the Cretaceous, the disconnection of the African and South American landmasses 100 million years ago would have resulted in vicariance within Atlantogenata. This vicariant separation resulted in the clades Afrotheria in Africa and Xenarthra in South America. The mode of diversification between Laurasiatheria and Euarchontoglires remains murky, and it is unclear whether it resulted primarily from vicariance between North America and Eurasia, some other vicariant event, or dispersal. Some remarkably similar morphological features that emerged among the mammalian clades in the different geographic areas led previous workers to group divergent taxa together in polyphyletic assemblages (e.g., ungulates) based on convergent evolution of hoofs, or to assume that features emerged only once (e.g., the variant types of gross anatomy in the placenta). As mammalian genome sequences continue to accumulate, it is now possible to design and test phylogenetic hypotheses about the genetic underpinnings of these and other important aspects of mammalian phenotypes.