Discussion
- A practical approach to phylogenomics: the phylogeny of ray-finned fish (Actinopterygii) as a case study

The bioinformatic approach implemented in this study resulted in a large set (154 loci for the zebrafish and torafugu comparison) of candidate genes to infer high-level phylogeny of ray-finned fishes. The actual number of candidate loci depended on the genomes being compared and the fixed search parameters. Experimental tests of a smaller subset (15 loci) demonstrate that a large fraction (2/3) of these candidates are easily amplified by PCR from whole genomic DNA extractions in a wide diversity of fish taxa. The assumption that these loci are represented by a single copy in the fish genomes could not be rejected by the PCR assays in the species tested (all amplifications resulted in a single product), increasing the likelihood that the genetic markers are orthologous and suitable to infer organismal phylogeny. Our method is based on searching, under specific criteria, the available complete genomic databases of organisms closely related to the taxa of interest. Therefore, the same approach that is shown to be successful for fishes could be applied to other groups of organisms for which two or more complete genome sequences exist. Parameter values (L, S, and C) used for the search (Figure 2) may be altered to obtain fragments of different size or with different levels of conservation (e.g., less conserved fragments for phylogenies of more closely related organisms).
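The effect of changing the search parameters can be illustrated with a minimal sketch. This section does not define L, S, and C, so the sketch assumes that L is a minimum fragment length, S is an upper bound on similarity to any other region of the same genome, and C is a minimum coverage in the second genome. The `Fragment` class, the `select_candidates` function, the locus names, and all numeric values are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    name: str
    length: int             # fragment length in base pairs
    self_similarity: float  # best similarity to any OTHER region of the same genome (0-1)
    cross_coverage: float   # fraction of the fragment aligned in the second genome (0-1)


def select_candidates(fragments, L, S, C):
    """Keep fragments that are long enough (>= L), look single-copy
    (self-similarity < S), and are well covered in the second genome (>= C).
    The meanings of L, S, and C are assumptions made for this sketch."""
    return [
        f.name
        for f in fragments
        if f.length >= L and f.self_similarity < S and f.cross_coverage >= C
    ]


# Made-up example data, for illustration only.
fragments = [
    Fragment("locus_a", 950, 0.30, 0.80),
    Fragment("locus_b", 1200, 0.75, 0.90),  # resembles another region: possible duplicate
    Fragment("locus_c", 400, 0.20, 0.85),   # shorter than the strict length threshold
    Fragment("locus_d", 1000, 0.45, 0.20),  # poorly covered in the second genome
]

strict = select_candidates(fragments, L=800, S=0.5, C=0.3)
shorter = select_candidates(fragments, L=300, S=0.5, C=0.3)
print(strict)   # only locus_a passes all three filters
print(shorter)  # lowering L also admits locus_c
```

Lowering the length threshold admits more, shorter fragments, while the similarity and coverage thresholds control how strictly duplicated or poorly conserved fragments are excluded.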

An alternative way to develop nuclear gene markers for phylogenetic studies is to construct a cDNA library or sequence several ESTs for a small pilot group of taxa, and then to design specific PCR primers to amplify the orthologous gene copy in all the other taxa of interest [19,46]. The major potential problem with this approach is that it starts from a cDNA library or a set of EST sequences, with no prior knowledge of how many copies a gene has in each genome. As discussed above, this condition may lead to paralogs being mistaken for orthologs. In our approach, we search the genomic database to find single-copy candidates, so that duplicate gene copies, if present, are not missed (see below).

Recent studies have proposed whole genome duplication events during vertebrate evolution and also genome duplications restricted to ray-finned fishes [31,32,47,48]. Our results indicate that many single-copy genes still exist in a wide diversity of fish taxa (representing 28 orders of actinopterygian fishes), in agreement with previous estimates that a vast majority of duplicated genes are secondarily lost [34,35]. All 154 candidates were identified as single-copy genes in D. rerio and T. rubripes, according to our search criteria. Our results also show that the 154 candidate genes are randomly distributed in the fish genome (at least among chromosomes of D. rerio). In the experimental tests, 10 out of 15 markers were found in single-copy condition in all successful amplifications, including those from the tetraploid species O. mykiss. However, when the search criteria were relaxed to consider targets less than 50% similar in a subsequent BLAST search against the zebrafish genome, 7 of the 10 genes were found to have "alignable paralogs" (the 3 exceptions were myh6, tbr1, and Gylt). Genomes of medaka, stickleback, and fugu were also checked for these 3 genes, and no "paralogs" were detected, suggesting that the ray-finned fish sequences collected for these 3 genes are unambiguously orthologous to each other. Phylogenetic analyses for each of the 7 genes that include the putative paralogs found by this procedure produced tree topologies that strongly suggest an ancient duplication event in the vertebrate lineage, before the divergence of tetrapods from ray-finned fishes. Paralogous sequences are placed at the base of the tetrapod-actinopterygian divergence, or as part of a basal polytomy with the other tetrapod and ray-finned fish sequences. In the terminology proposed by Remm et al. [49] these would be considered out-paralogs. In no case are these sequences nested among ingroup actinopterygian sequences (see Additional file 4), as would be expected for in-paralogs [49].
Stringent search criteria implemented in our approach, followed by phylogenetic analysis, can distinguish between orthologs and putative out-paralogs. Although the method cannot guarantee that single-copy genes amplified by PCR in several taxa are orthologs as opposed to in-paralogs, the existence and identification of genome-scale single-copy nuclear markers should facilitate the construction of the tree of life, even if the evolutionary mechanism responsible for maintaining single-copy genes is poorly known [33].
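The topological criterion described above, namely whether a putative paralog falls inside or outside the clade of ingroup sequences, can be sketched as a simple check on a rooted tree. This is only an illustration and not the analysis used in the study: the nested-tuple tree encoding, the function names, and the example topologies are all hypothetical.

```python
def leaves(tree):
    """Return the set of leaf names in a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def smallest_clade(tree, taxa):
    """Return the leaf set of the smallest clade that contains every name in `taxa`."""
    if not isinstance(tree, str):
        for child in tree:
            if taxa <= leaves(child):
                return smallest_clade(child, taxa)
    return leaves(tree)


def is_nested_in_ingroup(tree, ingroup, paralog):
    """True if the paralog falls inside the smallest clade spanning the ingroup
    sequences (the pattern expected for an in-paralog); False if it branches
    outside that clade (the pattern expected for an out-paralog)."""
    return paralog in smallest_clade(tree, set(ingroup))


ingroup = {"zebrafish", "medaka", "fugu"}

# Hypothetical topology: the extra copy branches off before the
# tetrapod / ray-finned fish split.
out_tree = ("zebrafish_copy2", (("human", "chicken"), (("zebrafish", "medaka"), "fugu")))

# Hypothetical topology: the extra copy sits among the fish sequences.
in_tree = (("human", "chicken"), (("zebrafish", "zebrafish_copy2"), ("medaka", "fugu")))

print(is_nested_in_ingroup(out_tree, ingroup, "zebrafish_copy2"))  # False
print(is_nested_in_ingroup(in_tree, ingroup, "zebrafish_copy2"))   # True
```

A check of this kind ignores branch support and polytomies, so it would only be a first screen before inspecting the trees themselves.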

Additional file 4. ML phylogenies based on protein sequences of individual genes and their out-paralogs found by relaxing our search criteria to include fragments with similarity < 50%.

The molecular evolutionary profiles of the 10 newly developed markers are in the same range as RAG-1, a widely used gene marker in vertebrates. The genes with high treeness values have intermediate substitution rates, suggesting that optimal rate and base composition stationarity are important factors that determine the suitability of a phylogenetic marker. The phylogeny based on individual markers revealed incongruent phylogenetic signal among 6 of the 10 individual genes. This incongruence suggests that significant biases in the data might obscure the true phylogenetic signal in some individual genes, but the direction of the bias is rarely shared among genes (Additional file 3), justifying the use of genome-scale gene markers to infer organismal phylogeny.
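For readers unfamiliar with treeness, the following minimal sketch computes it under the common definition, assumed here because this section does not state one: the sum of internal branch lengths divided by the total tree length. The tree encoding, the function names, and the branch lengths are hypothetical.

```python
def branch_sums(node, is_root=True):
    """Return (internal_length, total_length) for the subtree rooted at `node`.

    A node is a pair (branch_length, content), where content is either a leaf
    name (str) or a list of child nodes. The root's own branch length is ignored.
    """
    length, content = node
    if isinstance(content, str):
        # Terminal branch: counts toward the total but not the internal sum.
        return 0.0, length
    internal = 0.0 if is_root else length
    total = 0.0 if is_root else length
    for child in content:
        child_internal, child_total = branch_sums(child, is_root=False)
        internal += child_internal
        total += child_total
    return internal, total


def treeness(tree):
    """Proportion of total tree length found on internal branches."""
    internal, total = branch_sums(tree)
    return internal / total


# Made-up four-taxon tree: internal branches 0.2 and 0.3,
# terminal branches 0.1, 0.1, 0.1, and 0.2.
tree = (0.0, [
    (0.2, [(0.1, "A"), (0.1, "B")]),
    (0.3, [(0.1, "C"), (0.2, "D")]),
])

print(round(treeness(tree), 3))  # 0.5
```

Higher values mean that more of the inferred change lies on the branches that define relationships rather than on branches leading to single taxa.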

Finally, with respect to the phylogenetic results per se, there are two significant areas of discrepancy between the phylogeny obtained in this study (Figure 3a) and a consensus view of fish phylogeny (Figure 3b) [50]. Although these differences could be due to poor taxonomic sampling, we discuss them briefly. First, the traditional tree groups cichlids with other perciforms, whereas our results show that the cichlid O. niloticus is more closely related to atherinomorphs (Cyprinodontiformes + Beloniformes) than to other perciforms. This result was also supported by two recent studies analysing multiple nuclear genes [17,51]. The second difference is that the traditional tree groups Lycodes with other perciforms, whereas Lycodes was found to be closely related to Gasterosteus (Gasterosteiformes) in our results. Interestingly, the sister-taxa relationship between Lycodes and Gasterosteus is also supported by recent studies using mitochondrial genome data [38,52]. The difference between our "total evidence" tree and the classical hypothesis is significant based on the new data, as indicated by a one-tailed Shimodaira-Hasegawa (SH) test (p = 0.000) [53].


Updated on: 2 Jul 2008