Zebrafish strains and maintenance
All zebrafish used in this experiment were obtained from matings of
the Ab/Tü strain (Oregon, AB/Tübingen, Tü). Breeding and raising of
zebrafish followed standard protocols [49].
Data mining
All annotated fish and mammalian S100 sequences were extracted
either from the National Center for Biotechnology Information (NCBI) or
Ensembl database resources [50]
and used in the subsequent phylogenetic analysis, if they conformed to
the inclusion criteria detailed below. Representative fish genes from
each subclade and, additionally, 19 human S100 genes (full set excluding
close relatives [38]) were used as queries in tBLASTN searches of the NCBI nucleotide databases nr/nt and EST [51] and of the Ensembl database [50]. For elephant shark (Callorhinchus milii) only WGS traces were available, and for salmon (Salmo salar)
only the EST database was used. The analysis was repeated using newly
found subclades as queries. To be considered as validated S100 genes, the
candidates needed to fulfill the following inclusion criteria: a)
position within the S100 clade in the phylogenetic analysis; b)
return of annotated S100 genes or other S100 candidates as the first
hits in a BLASTP search against the NCBI non-redundant database;
c) presence of four helices separated by one S100 EF hand, one
hinge and one canonical S100 hand (region assignment according to [1])
within 80 to 100 consecutive amino acids, which is the extent observed
for this composite motif for all mouse and human S100 genes.
For accession numbers, see Additional File 2.
Phylogenetic analysis
MAFFT, version 5.8 [52], was employed for multiple protein alignments using the E-INS-i strategy
with the default parameters. To estimate the phylogenetic relationships
of the sequences we performed distance-based, maximum parsimony, and
maximum likelihood analyses using the Neighbour Joining (NJ), Protpars
(MP) and Proml (ML) programs as implemented in the ClustalX [53] (NJ) and PHYLIP [22] (MP, ML) packages. For the NJ method we performed bootstrapping with 10,000 repetitions using ClustalX [53]
and for the MP and ML methods we performed bootstrapping with 100
repetitions using the program SEQBOOT from the PHYLIP package [22]. The three methods gave similar clustering. Consensus trees were obtained using the CONSENSE program of the PHYLIP package [22].
Subclades within the teleost S100 gene family were determined from
the tree as the largest clades that fulfilled two criteria: the clade
had >80% bootstrap support in the NJ analysis (S100Q was the only
exception with 71%) and was supported in both the MP and ML analyses.
Fourteen such subclades were identified, which correspond to groups of
orthologous genes.
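
The subclade selection rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Clade` record, its field names and the toy values are hypothetical and are not taken from the original analysis, and the manually accepted S100Q exception (71% NJ support) is deliberately not encoded.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Clade:
    """Hypothetical record for one clade of the consensus tree."""
    name: str
    leaves: FrozenSet[str]   # sequence names contained in the clade
    nj_bootstrap: float      # NJ bootstrap support in percent
    in_mp: bool              # clade also recovered in the MP analysis
    in_ml: bool              # clade also recovered in the ML analysis


def is_supported(clade: Clade, nj_threshold: float = 80.0) -> bool:
    """Criterion from the text: >80% NJ support and present in MP and ML."""
    return clade.nj_bootstrap > nj_threshold and clade.in_mp and clade.in_ml


def largest_supported_clades(clades: List[Clade]) -> List[Clade]:
    """Keep supported clades that are not nested inside another supported clade."""
    supported = [c for c in clades if is_supported(c)]
    return [
        c for c in supported
        if not any(c.leaves < other.leaves for other in supported)
    ]


# Toy input (invented values, for illustration only).
example = [
    Clade("A", frozenset({"a1", "a2", "a3"}), 95.0, True, True),
    Clade("A_inner", frozenset({"a1", "a2"}), 99.0, True, True),
    Clade("B", frozenset({"b1", "b2"}), 71.0, True, True),
    Clade("AB", frozenset({"a1", "a2", "a3", "b1", "b2"}), 40.0, False, True),
]
selected = [c.name for c in largest_supported_clades(example)]
```

In this toy input only clade `A` is selected: `A_inner` is nested within `A`, `B` falls below the NJ threshold, and `AB` lacks NJ and MP support.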
Evolutionary distances
The evolutionary distances between amino acid sequences were calculated using MEGA4 [54].
All results were based on the pairwise analysis of the given number of
sequences per ortholog or paralog group. Analyses were conducted using
the Poisson correction method in MEGA4 [54,55].
All positions containing alignment gaps and missing data were
eliminated only in pairwise sequence comparisons (Pairwise deletion
option).
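
For illustration, the Poisson-corrected distance with pairwise deletion can be written as d = -ln(1 - p), where p is the proportion of differing residues among the positions compared. The short Python sketch below is not the MEGA4 implementation; the function name and the choice of `-` and `?` as gap and missing-data symbols are assumptions made for this example.

```python
import math

# Assumed symbols for alignment gaps and missing data in this sketch.
SKIP_SYMBOLS = {"-", "?"}


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected amino acid distance with pairwise deletion.

    Positions where either sequence has a gap or missing-data symbol are
    ignored for this pair only; all other positions are compared.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")

    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a in SKIP_SYMBOLS or b in SKIP_SYMBOLS:
            continue  # pairwise deletion
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no comparable positions")

    p = differing / compared  # uncorrected p-distance
    if p >= 1.0:
        return math.inf       # correction is undefined when p = 1
    return -math.log(1.0 - p)


# Toy pair: 7 comparable positions, 1 difference, so p = 1/7.
d = poisson_distance("ACDEFG-K", "ACDQFGHK")
```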
Sequence logos
Sequence logos were generated using a web-based program, WebLogo, version 2.8.2, developed by Crooks [56] and Schneider and Stevens [57,58].
A logo was generated with 85 teleost and cartilaginous fish S100 amino
acid sequences. Sequence alignments were manually edited using MEGA4 [54]
and highly divergent segments between the start codon and the beginning
of helix 1 were trimmed to avoid N-terminal length heterogeneity. This
trimming did not affect any significantly conserved residues. Gap positions present
in more than 85% of the sequences were deleted completely.
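
The gap-column filter described above can be sketched in a few lines of Python. The function name, the `-` gap symbol and the toy alignment are assumptions for this example; the original editing was done manually.

```python
from typing import List


def drop_gappy_columns(alignment: List[str],
                       max_gap_fraction: float = 0.85,
                       gap: str = "-") -> List[str]:
    """Remove alignment columns in which more than `max_gap_fraction`
    of the sequences carry a gap."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")

    n_seqs = len(alignment)
    keep = [
        col for col in range(length)
        if sum(seq[col] == gap for seq in alignment) / n_seqs <= max_gap_fraction
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]


# Toy alignment of 10 sequences: column 2 is 90% gaps (removed),
# column 3 is 80% gaps (kept).
toy = ["M--"] * 8 + ["M-K", "MAK"]
filtered = drop_gappy_columns(toy)
```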
dN/dS Analysis
The global dN/dS ratios for the full length S100 coding sequences of
all five teleost species for which full genomic information is
available were determined using the HyPhy package on the Datamonkey
server [59], which implements a previously published method [60]. The nucleotide alignment was manually edited and gap positions present in more than 85% of the sequences were removed.
To make inferences about selective pressure (positive and negative
selection) on individual codons (sites) within the S100 coding
sequences, the Single Likelihood Ancestor Counting (SLAC) package [61] was used, which implements the Suzuki-Gojobori method [60].
The algorithm is briefly outlined here. First, a best-fitting nucleotide
substitution model was automatically selected by fitting several such
substitution models to both the data and a neighbor-joining tree
generated from the alignment described above. Taking the obtained
substitution rates and branch lengths as constant, a codon model was
fitted to the data and a global dN/dS ratio was calculated.
Then a codon by codon reconstruction of the ancestral sequences was
performed using maximum likelihood. Afterwards, the normalized expected
numbers of synonymous (ES) and non-synonymous (EN) sites and the observed
numbers of synonymous (NS) and non-synonymous (NN) substitutions were
calculated for each non-constant site. dN = NN/EN and dS = NS/ES were then computed, and if dN < dS (negative
selection) or dN > dS (positive selection), a p-value derived from a
two-tailed extended binomial distribution was used to assess
significance. Tests on simulated data (S.L.K. Pond and S.D.W. Frost,
methods available at [61])
show that p values equal to or smaller than 0.1 identify nearly all true
positives with a false positive rate generally below the nominal p
value; for actual data, the number of true positives at a given false
positive rate is lower. In the present study, two thresholds for
significance (0.1 and 0.2) were taken into account.
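
The per-site step of this procedure can be illustrated with a simplified Python sketch. It is not the SLAC implementation: a plain binomial test on integer substitution counts stands in for the extended binomial distribution, the null probability of a synonymous change is assumed to be ES/(ES + EN), and the function name and toy values are hypothetical.

```python
from math import comb
from typing import Tuple


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def site_test(es: float, en: float, ns: int, nn: int) -> Tuple[float, float, str, float]:
    """Return (dN, dS, call, p_value) for one codon site.

    es, en: normalized expected numbers of synonymous / non-synonymous sites
    ns, nn: observed synonymous / non-synonymous substitutions (integers here)
    """
    if es <= 0 or en <= 0:
        raise ValueError("expected site counts must be positive")

    dn = nn / en
    ds = ns / es
    total = ns + nn
    p_syn = es / (es + en)  # assumed neutral probability of a synonymous change

    if dn > ds:
        # Positive selection: as few or fewer synonymous changes than observed.
        p_value = sum(binom_pmf(k, total, p_syn) for k in range(0, ns + 1))
        call = "positive"
    elif dn < ds:
        # Negative selection: as many or more synonymous changes than observed.
        p_value = sum(binom_pmf(k, total, p_syn) for k in range(ns, total + 1))
        call = "negative"
    else:
        p_value, call = 1.0, "neutral"
    return dn, ds, call, p_value


# Toy site: 6 synonymous and 0 non-synonymous substitutions.
dn, ds, call, p_value = site_test(es=1.0, en=2.0, ns=6, nn=0)
significant_at = [t for t in (0.1, 0.2) if p_value <= t]
```

In this toy case the site is called as negatively selected at both thresholds considered in the study.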
RT-PCR
Ten zebrafish (mix of male and female Danio rerio, strain
Ab/Tü) were dissected and several tissues were pooled for each RNA
extraction: barbels and lips, bone, brain, eyes, genitourinary, gills,
heart, liver, muscle, olfactory bulb, olfactory epithelium, skin. cDNA
was generated by using Superscript III reverse transcriptase
(Invitrogen) with an anchored oligo(dT)18 reverse primer. PCR
amplifications were performed by using the following primer pairs, all
of them (except S100B) intron-spanning:
Dr actin (forward, CCCCATTGAGCACGGTATT; reverse, TCATGGAAGTCCACATGGCAGAAG), Dr S100A1 (forward, CTTCAAGGGGAACTCAGTGA; reverse, AAAACTCATTGCATGCCACA), Dr S100A10b (forward, CGCAGGACATTCACATCATT; reverse, TTTTCCCCTCATGTTTGGTC), Dr S100A10a (forward, ATTTCACTCAGTCGCCCAAA; reverse, ATGGACAAACCCAAGACCAA), Dr S100A11 (forward, TCAAGGCTTATGCTGGGAAG; reverse, TGCAACATTGCCAATCAGA), Dr S100B (forward, GAAAGTTTGGACACCGATGG; reverse, TGGCCATGTCTTGAAACAAA), Dr S100I.1 (forward, AGAACCACCATGGCTACGTC; reverse, TGCAAAGCATTGTGATACAGG), Dr S100I.2 (forward, TCATTGCAACCTTCCACAAA; reverse, ACAGGCGATCAATGTGATGT), Dr S100S (forward, TGCAGATGCTCATCAAGACC; reverse, GTCCAGGAAGAAGTCGTTGC), Dr S100T (forward, TGGGAATGAGGGTGACAAAT; reverse, TCATTCGCTGGTCATGTGTT), Dr S100Z (forward, TAAACTGGAGGGAGCAATGG; reverse, TCCAGCACTCAGTTTACGAT). The
following conditions were used: 2 min at 96°C, followed by 35 cycles of
30 sec at 96°C, 30 sec at 60°C, and 60 sec at 72°C, and a final
extension of 10 min at 72°C. Regions chosen for PCR primers did not
exhibit any appreciable sequence identity to each other (with the exception
of the S100I.2 primers, which may additionally recognize S100I.1, but not vice
versa), thereby excluding cross-amplification. All PCR products were
cloned and sequenced using standard protocols. For sequences, see
Additional File 6.
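
As a worked check of the cycling conditions given above, the following Python sketch expands the program into individual steps and sums the hold times. The step labels in the comments (denaturation, annealing, extension) are the conventional interpretation and the variable names are hypothetical; ramp times between temperatures are not included.

```python
from typing import List, Tuple

Step = Tuple[float, int]  # (temperature in °C, hold time in seconds)

INITIAL_STEP: Step = (96.0, 120)          # 2 min at 96°C
CYCLE: List[Step] = [
    (96.0, 30),                           # denaturation (interpretation)
    (60.0, 30),                           # annealing (interpretation)
    (72.0, 60),                           # extension (interpretation)
]
N_CYCLES = 35
FINAL_EXTENSION: Step = (72.0, 600)       # 10 min at 72°C


def expand_program() -> List[Step]:
    """Return the full list of hold steps in run order."""
    return [INITIAL_STEP] + CYCLE * N_CYCLES + [FINAL_EXTENSION]


def total_hold_seconds(program: List[Step]) -> int:
    """Sum of all hold times, ignoring temperature ramps."""
    return sum(seconds for _, seconds in program)


program = expand_program()
total_minutes = total_hold_seconds(program) / 60
```

The hold times add up to 4920 s, that is, 82 min per run before ramping is taken into account.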
In Situ Hybridization
The templates for the probes were amplified from cloned fragments
obtained by RT-PCR using the previously described primers with the T3
promoter site (TATTAACCCTCACTAAAGGGAA) attached to their 5' end.
Digoxigenin (DIG) probes were synthesized according to the supplier's
protocol for the DIG RNA labeling kit (Roche Molecular Biochemicals).
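
The primer tailing described above amounts to prepending the T3 promoter site to the 5' end of a gene-specific primer. A minimal Python sketch follows; the function name is hypothetical, and the text does not state which primer of each pair carried the tail, so the S100B reverse primer is used here purely as an example.

```python
# T3 promoter site as given in the text.
T3_PROMOTER = "TATTAACCCTCACTAAAGGGAA"


def add_t3_tail(primer: str) -> str:
    """Return the primer with the T3 promoter site attached to its 5' end."""
    primer = primer.upper()
    if not primer or set(primer) - set("ACGT"):
        raise ValueError("primer must be a non-empty DNA sequence (A, C, G, T)")
    return T3_PROMOTER + primer


# Example only: the choice of primer is an assumption of this sketch.
s100b_reverse = "TGGCCATGTCTTGAAACAAA"
tailed = add_t3_tail(s100b_reverse)
```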
RNA in situ hybridization of S100 genes was carried out following the method of Thisse et al. [62] as modified in [63].
Hybridizations were performed overnight at 62°C on larvae at 5 days post-fertilization (dpf).
Anti-DIG primary antibody coupled to alkaline phosphatase (Roche
Molecular Biochemicals) and NBT-BCIP (Roche Molecular Biochemicals)
were used for signal detection. Whole-mount images were documented with a Nikon
CoolPix 950 digital camera attached to a Nikon SMZ-U binocular microscope.
Cryosections of hybridized embryos, obtained with a
Leica CM1900 cryostat, were documented on a Zeiss AxioVert microscope
with an attached Diagnostic Instruments Spot-RT camera.