In general, the construction of trees is based on sequence alignments. This …
Figures - Comparing sequences without using alignments: application to HIV/SIV subtyping
Figure 1
The neighbor-joining tree obtained from 70 HIV/SIV nucleotide sequences (distance matrix calculated using the N-local decoding method with N = 15).
Each sequence name consists of the GenBank accession number followed by
the nomenclature name [9–11, 14, 15]. These sequences can be retrieved
from the Los Alamos HIV sequence database [24]. Bootstrap values
(≥ 90%) are indicated.
Figure 2
The neighbor-joining tree obtained from 43 HIV/SIV non-coding parts of LTR nucleotide sequences (distance matrix calculated for N = 11).
M15390 corresponds to the HIV-2-A ROD isolate, as X05291 does in Figure
1. Sequence names follow the same rule as in Figure 1. Bootstrap values
(≥ 50%) are indicated.
Figure 3
The neighbor-joining tree calculated from the multiple
alignment of the same 43 sequences as in Figure 2, produced by the
CLUSTAL-W program. Sequence names follow the same rule as in Figure 1.
Figure 4
The neighbor-joining tree calculated from the multiple
alignment of the same 43 sequences as in Figure 2, produced by the
DIALIGN-2 program. Sequence names follow the same rule as in Figure 1.
Figure 5
Figure 6
The «local decoding of order N» computing strategy.
Figure 6a (top): four (N = 5)-related sites, each containing the letter T, are taken from three input nucleotide sequences seq1, seq2 and seq3. Each of the four boxed sectors (2N - 1 = 9 letters long) has T at its center (in bold face type) and is identified by the sequence in which it lies and the position of T in that sequence (that is, seq1,11, seq2,5, seq3,5 and seq3,12; see Figure 6a bottom).
Figure 6a (bottom): each 9-letter segment (identified by the corresponding site containing a T in bold face type) is displayed with the set of overlapping (step 1) words of length N = 5 underneath the corresponding boxed site. The four sites are 5-related: seq1,11, seq2,5 and seq3,12 are directly 5-related by TGGAC (in bold face type) at position 1; seq1,11 and seq3,12 are also directly 5-related by CTGGA at position 2; seq3,5 is directly 5-related only with seq2,5, by CACTT at position 5, so it is connected to the other two sites through seq2,5.
Figure 6b: the symbols that identify each class containing at least two sites are shown together with the segments covered by the overlapping 5-words that lie over the boxed letter.
Figure 6c: the re-written sequences generated by the program. Identifiers of classes containing only one site are represented simply by the corresponding letter of the input sequence, because such classes cannot contribute to the similarities between pairwise compared re-written sequences.
Figure 6d: the double-entry table for constructing a pairwise distance matrix between the three sequences (re-written in Figure 6c). Each class identifier with at least two sites is indicated in the corresponding row. For each row, and for each of the three sequences that label the columns, the table gives the number of sites of this N-class that appear in the sequence.
Figure 6e: similarity matrix and the corresponding normalized dissimilarity matrix (see text) for the three sequences.
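The grouping described for Figures 6a to 6d can be illustrated with a minimal Python sketch. It assumes that two sites are directly N-related when they are covered, at the same offset, by two occurrences of the same N-word, and that N-classes are the connected components of that relation. The function names `local_decoding_classes` and `class_count_table` are hypothetical, positions are 0-based (the figure uses 1-based positions), and the sketch is not the authors' program. The similarity and normalized dissimilarity of Figure 6e are left out because the caption defers their definition to the main text.

```python
from collections import defaultdict


def local_decoding_classes(seqs, n):
    """Group sites (sequence index, 0-based position) into N-classes.

    Assumption: sites are directly N-related when two occurrences of the
    same word of length n cover them at the same offset; classes are the
    connected components, found here with a union-find structure.
    """
    parent = {(s, i): (s, i) for s, seq in enumerate(seqs) for i in range(len(seq))}

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Record every occurrence (sequence, start) of each overlapping n-word.
    occurrences = defaultdict(list)
    for s, seq in enumerate(seqs):
        for p in range(len(seq) - n + 1):
            occurrences[seq[p:p + n]].append((s, p))

    # Sites at the same offset of two occurrences of a word are merged.
    for occ in occurrences.values():
        s0, p0 = occ[0]
        for s, p in occ[1:]:
            for k in range(n):
                union((s0, p0 + k), (s, p + k))

    classes = defaultdict(list)
    for site in parent:
        classes[find(site)].append(site)
    return [sorted(members) for members in classes.values()]


def class_count_table(seqs, n):
    """One row per class with at least two sites: site counts per sequence."""
    table = []
    for members in local_decoding_classes(seqs, n):
        if len(members) < 2:
            continue  # single-site classes cannot contribute to similarities
        counts = [0] * len(seqs)
        for s, _ in members:
            counts[s] += 1
        table.append(counts)
    return table
```

For instance, with the toy input `["XTA", "BTA", "BTC"]` and `n = 2`, the T of the first sequence shares a class with the T of the third only through the T of the second, which mirrors how seq3,5 is connected to the other sites through seq2,5 in Figure 6a.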
Updated on: 12 Aug 2009