Mitogenomic evaluation of the historical biogeography of cichlids toward reliable dating of teleostean divergences

Taxonomic sampling

Cichlid samples were obtained from local animal dealers in Japan. We combined these new mitogenomic data with 48 previously published sequences from the DDBJ/EMBL/GenBank nucleotide sequence database. The 10 cichlid taxa that we analyzed (Table 1) cover species from major Gondwana-origin landmasses. In addition, we chose 31 other teleosts, nine basal actinopterygians, and two sarcopterygians. Two sharks were sampled as an outgroup to root the tree. Additional file 1 contains a complete list of the sampled taxa, along with the database accession numbers of their mitogenomic sequences.

Additional File 1. List of species used, with database accession numbers. Classifications follow Nelson [11].

Table 1. Cichlid taxa analyzed for mtDNAs

DNA extraction, PCR, and sequencing

Fish samples were excised from live or dead specimens of each species and immediately preserved in 99.5% ethanol. Total genomic DNA was extracted from muscle, liver, and/or fin clips using a DNeasy tissue kit (Qiagen) or a DNAzol Reagent (Invitrogen), following manufacturer protocols. The mtDNA of each species was amplified using a long-PCR technique with LA-Taq (Takara). Seven fish-versatile primers for long PCR (S-LA-16S-L, L2508-16S, L12321-Leu, H12293-Leu, H15149-CYB, H1065-12S, and S-LA-16S-H [21-26]) and the two cichlid-specific primers cichlid-LA-16SH (5'-TTGCGCTACCTTTGCACGGTCAAAATACCG-3') and cichlid-LA-16SL (5'-CGGAGTAATCCAGGTCAGTTTCTATCTATG-3') were used in various combinations to amplify regions covering the entire mtDNA in one or two reactions. The long-PCR products were used as templates for subsequent short PCR.

Over 100 fish-versatile PCR primers [21-27] and 18 taxon-specific primers (Additional file 2) were used in various combinations to amplify contiguous, overlapping segments of the entire mtDNA for each of the six new cichlid species. The long PCR and subsequent short PCRs were performed as described previously [21,28]. The short-PCR reactions were performed using the GeneAmp PCR System 9700 (Applied Biosystems) and Ex Taq DNA polymerase (Takara).

Additional File 2. Cichlid-specific primers for PCR and sequencing. H and L indicate the orientation of the primers. The locations of the primers are shown with the names of the targeted genes.

Double-stranded PCR products, treated with ExoSAP-IT (USB) to inactivate remaining primers and dNTPs, were directly used for the cycle sequencing reaction, using dye-labeled terminators (Applied Biosystems) with amplification primers and appropriate internal primers. Labeled fragments were analyzed on Model 3100 and Model 377 DNA sequencers (Applied Biosystems).

Sequence manipulation

The DNA sequences obtained were edited and analyzed using EditView 1.0.1, AutoAssembler 2.1 (Applied Biosystems), and DNASIS 3.2 (Hitachi Software Engineering Co. Ltd.). Individual gene sequences were identified and aligned with their counterparts in 48 previously published mitogenomes. Amino acid sequences were used to align protein-coding genes, and standard secondary structure models for vertebrate mitochondrial tRNAs [29] were consulted for the alignment of tRNA genes. The 12S and 16S rRNA sequences were initially aligned using ClustalX v. 1.83 [30] with default gap penalties and subsequently adjusted by eye using MacClade 4.08 [31].

The ND6 gene was excluded from the phylogenetic analyses because of its heterogeneous base composition and consistently poor phylogenetic performance [22]. The control region was also excluded because positional homology could not be confidently established among such distantly related species. The third codon positions of protein genes were excluded because their extremely accelerated rates of change may cause high levels of homoplasy. After the exclusion of unalignable parts in the loop regions of tRNA genes, as well as the 5' and/or 3' end regions of protein genes, all gene sequences were concatenated to produce an alignment of 10,034 sites (6962, 1402, and 1670 positions for protein-coding, tRNA, and rRNA genes, respectively) for phylogenetic analyses.
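As a minimal sketch of the bookkeeping described above, the following Python snippet removes third codon positions from an in-frame coding sequence and checks that the reported partition sizes sum to the stated total. The function names and the end-to-end block order (protein-coding, then tRNA, then rRNA) are assumptions made for illustration; the text does not state how the concatenated matrix was laid out.

```python
def drop_third_codon_positions(cds: str) -> str:
    """Keep codon positions 1 and 2 of an in-frame coding sequence."""
    return "".join(base for i, base in enumerate(cds) if i % 3 != 2)


# Sites retained per gene class, as reported in the text.
PARTITION_SIZES = {"protein_pos12": 6962, "tRNA": 1402, "rRNA": 1670}


def partition_ranges(sizes):
    """Return 1-based inclusive (start, end) ranges for blocks laid end to end.

    The block order is hypothetical; it simply follows dictionary order.
    """
    ranges, start = {}, 1
    for name, n_sites in sizes.items():
        ranges[name] = (start, start + n_sites - 1)
        start += n_sites
    return ranges


ranges = partition_ranges(PARTITION_SIZES)
total = sum(PARTITION_SIZES.values())

print(drop_third_codon_positions("ATGGCCTAA"))  # ATGCTA
print(total)                                    # 10034
print(ranges)
```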

Phylogenetic analyses

Phylogenetic trees were reconstructed using partitioned Bayesian and maximum likelihood analyses. Partitioned Bayesian phylogenetic analyses were performed using MrBayes 3.1.2 [32]. We set four partitions (first codon, second codon, tRNA, and rRNA positions). The general time-reversible model, with some sites assumed to be invariable and variable sites assumed to follow a discrete gamma distribution (GTR + I + Γ; [33]), was selected as the best-fit model of nucleotide substitution by MrModeltest 2.2 [34]. The Markov chain Monte Carlo (MCMC) process was set so that four chains (three heated and one cold) ran simultaneously. We ran 3,000,000 Metropolis-coupled MCMC generations for each analysis, sampling trees every 100 generations and discarding the first 10,000 sampled trees as burn-in.
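The sampling settings above imply a fixed number of retained trees, which the short Python sketch below works out. The variable names are illustrative only, and the count ignores any starting state that the program may also write at generation 0.

```python
generations = 3_000_000   # MCMC generations per analysis (from the text)
sample_every = 100        # tree sampling interval (from the text)
burnin_trees = 10_000     # sampled trees set aside as burn-in (from the text)

# Trees written during the run, not counting a possible generation-0 sample.
sampled_trees = generations // sample_every
# Trees left for summarizing the posterior once the burn-in is removed.
retained_trees = sampled_trees - burnin_trees
burnin_fraction = burnin_trees / sampled_trees

print(sampled_trees, retained_trees, round(burnin_fraction, 3))
```

Under these settings roughly one third of the sampled trees fall in the burn-in, and the remainder contribute to the posterior summary.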

Partitioned maximum likelihood (ML) analyses were performed with RAxML ver. 7.0.3 [35], a program implementing a novel, rapid-hill-climbing algorithm. For each dataset, a rapid bootstrap analysis and search for the best-scoring ML tree were conducted in one single program run, with the GTR + I + Γ nucleotide substitution model. The rapid bootstrap analyses were conducted with 1000 replications, with four threads running in parallel.
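For orientation, a combined rapid-bootstrap and best-tree search of this kind can be launched roughly as sketched below. This is an assumption-laden illustration, not the authors' invocation: the executable name, file names, and random seed are hypothetical, and the flags follow the general RAxML 7.x/8 command-line conventions (`-f a` for the combined analysis, `-m GTRGAMMAI` for GTR + I + Γ, `-q` for a partition file), which may differ in detail from version 7.0.3.

```python
import shlex


def raxml_rapid_bootstrap_cmd(alignment, partitions, run_name,
                              replicates=1000, threads=4, seed=12345):
    """Build an argument list for a one-run rapid bootstrap + ML search.

    All file names and the seed are placeholders; check the flags against
    the manual of the installed RAxML version before use.
    """
    return [
        "raxmlHPC-PTHREADS",     # assumed name of the Pthreads build
        "-f", "a",               # rapid bootstrap plus best-scoring ML tree
        "-m", "GTRGAMMAI",       # GTR + I + gamma
        "-s", alignment,         # concatenated alignment (PHYLIP format)
        "-q", partitions,        # partition definitions
        "-x", str(seed),         # seed for rapid bootstrapping
        "-p", str(seed),         # seed for parsimony starting trees
        "-#", str(replicates),   # number of bootstrap replicates
        "-T", str(threads),      # number of threads
        "-n", run_name,          # suffix for output files
    ]


cmd = raxml_rapid_bootstrap_cmd("cichlids.phy", "partitions.txt", "mitogenome")
print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so it can be passed directly to `subprocess.run` without shell quoting issues.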

Statistical evaluation of alternative phylogenetic hypotheses was done using TREE-PUZZLE 5.2 [36], using the two-sided Kishino and Hasegawa (KH) [37] test, the Shimodaira and Hasegawa (SH) [38] test, and Bayes factors [39,40]. We used the GTR + I + Γ model and its parameters optimized by MrModeltest 2.2.
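To illustrate the idea behind the two-sided KH test, the sketch below compares two topologies from their per-site log-likelihoods using a normal approximation. It is a simplified stand-in written for this explanation, not the TREE-PUZZLE implementation, and the function name and input values are hypothetical.

```python
import math


def kh_test(site_lnl_a, site_lnl_b):
    """Two-sided KH-style test from per-site log-likelihoods (normal approx.).

    Returns (delta, z, p), where delta is the total log-likelihood
    difference between tree A and tree B.
    """
    n = len(site_lnl_a)
    if n != len(site_lnl_b) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 sites")
    diffs = [a - b for a, b in zip(site_lnl_a, site_lnl_b)]
    delta = sum(diffs)
    mean = delta / n
    # Variance of the summed difference, estimated from site-wise scatter.
    var_sum = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    if var_sum == 0:
        raise ValueError("site-wise differences have zero variance")
    z = delta / math.sqrt(var_sum)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return delta, z, p


# Hypothetical per-site log-likelihoods for two candidate trees.
delta, z, p = kh_test([-1.0, -2.0, -3.0, -4.0], [-2.0, -1.0, -4.0, -3.0])
print(delta, z, p)
```

A small p-value would indicate that the two topologies differ significantly in likelihood; in the toy input the site-wise differences cancel, so the trees are indistinguishable.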

Divergence time estimation

For the divergence time estimation, we used the multidistribute program package [41], assuming the tree topology obtained above but not a molecular clock (i.e., allowing heterogeneity in the molecular evolutionary rate along branches). Upper and/or lower time constraints at selected nodes were set for the Bayesian MCMC processes to estimate divergence times (including means and 95% credibility ranges) and relative rates at ingroup nodes. We set the partitioning as described above and first used PAML [42] to optimize the parameters of the F84 model and of the gamma distribution, with eight categories to account for among-site rate heterogeneity. The estbranches and multidivtime programs were then used to estimate divergence times. We used 21 fossil-based time constraints assignable to diverse teleostean lineages (Table 2).

Table 2. Maximum (U) and minimum (L) time constraints (MYA) used for dating at nodes in Fig. 2
