In general, the construction of trees is based on sequence alignments. This …

Figures
 Comparing sequences without using alignments: application to HIV/SIV subtyping

Figure 1
The neighbor-joining tree obtained from 70 HIV/SIV nucleotide sequences (distance matrix calculated using the N-local-decoding method with N = 15).
Sequence names are written as follows: the GenBank accession
number, followed by the nomenclature name [9–11, 14, 15]. These
sequences can be retrieved from the Los Alamos HIV sequence database
[24]. Bootstrap values (≥ 90%) are indicated.


Figure 2
The neighbor-joining tree obtained from the noncoding parts of 43 HIV/SIV LTR nucleotide sequences (distance matrix calculated for N = 11).
M15390 corresponds to the HIV-2A ROD isolate, as X05291 does in Figure
1. Sequence names follow the same rule as in Figure 1. Bootstrap values
(≥ 50%) are indicated.


Figure 3
The neighbor-joining tree calculated from the multiple
alignment of the same 43 sequences as in Figure 2, produced by the
CLUSTALW program. Sequence names follow the same rule as in Figure 1.


Figure 4
The neighbor-joining tree calculated from the multiple
alignment of the same 43 sequences as in Figure 2, produced by the
DIALIGN2 program. Sequence names follow the same rule as in Figure 1.


Figure 5


Figure 6
The «local decoding of order N» computing strategy.
Figure 6a (top): four (N = 5)-related sites (containing the letter T) are taken from three input
nucleotide sequences seq1, seq2 and seq3. Each of the four boxed
segments (2N − 1 = 9 letters in length) has T at its center (in
bold face type) and is identified by the sequence in which it is located
and the position of T in that sequence (that is, seq1,11, seq2,5, seq3,5
and seq3,12; see Figure 6a bottom).
Figure 6a (bottom): each 9-letter-long segment (identified by the corresponding site
containing a T in bold face type) is displayed with the set of
corresponding overlapping (step 1) words of length N = 5
underneath the corresponding site (boxed). The four sites are
5-related: seq1,11, seq2,5 and seq3,12 are directly 5-related by TGGAC
(in bold face type) at position 1; seq1,11 and seq3,12 are also
directly 5-related by CTGGA at position 2; seq3,5 is directly
5-related only with seq2,5, by CACTT at position 5, so that it is
connected to the other two sites through seq2,5.
Figure 6b: the symbols that identify each class containing at least two sites are
shown together with the segments covered by the overlapping 5-words
that lie over the letter (boxed).
Figure 6c: the rewritten sequences generated by the program. Classes
containing only one site are represented simply
by the corresponding letter of the input sequence, since they
cannot contribute to the similarities between pairwise
compared rewritten sequences.
Figure 6d: the double-entry table for constructing a pairwise distance matrix between the three sequences (rewritten in Figure 6c).
Each class identifier with at least two sites is indicated in the
corresponding row. For each row and for each of the three sequences
that label the three columns, the table gives the number of sites of
this N-class that appear in the sequence.
Figure 6e: similarity matrix and the corresponding normalized dissimilarity matrix (see text) for the three sequences.
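The grouping of sites described in the Figure 6 caption can be illustrated with a small sketch. It is not the authors' program: the function names `local_decoding_classes` and `class_count_table`, the union-find bookkeeping, the 0-based positions and the toy sequences are all assumptions made for illustration. The sketch treats two sites as directly N-related when the same N-word covers both at the same offset, takes the transitive closure to form classes, and then counts, for each class with at least two sites, how many of its sites fall in each sequence (the kind of table Figure 6d describes). The similarity and normalized dissimilarity of Figure 6e are defined in the article text and are not reproduced here.

```python
from collections import defaultdict


def local_decoding_classes(seqs, n):
    """Group sites (seq_index, position) into classes of N-related sites.

    Hypothetical helper: positions are 0-based and only classes with at
    least two sites are returned, each as a sorted list of sites.
    """
    # Every site starts in its own class (union-find forest).
    parent = {(s, p): (s, p) for s, seq in enumerate(seqs) for p in range(len(seq))}

    def find(site):
        # Follow parent links to the class representative (path halving).
        while parent[site] != site:
            parent[site] = parent[parent[site]]
            site = parent[site]
        return site

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smallest site as representative for a stable order.
            parent[max(ra, rb)] = min(ra, rb)

    first_seen = {}  # N-word -> (seq_index, start) of its first occurrence
    for s, seq in enumerate(seqs):
        for start in range(len(seq) - n + 1):
            word = seq[start:start + n]
            if word in first_seen:
                s0, start0 = first_seen[word]
                # Same word, same offset k: the two sites are directly related.
                for k in range(n):
                    union((s, start + k), (s0, start0 + k))
            else:
                first_seen[word] = (s, start)

    classes = defaultdict(list)
    for site in parent:
        classes[find(site)].append(site)
    return [sorted(sites) for sites in classes.values() if len(sites) >= 2]


def class_count_table(classes, num_seqs):
    """One row per class, one column per sequence: number of sites of the class."""
    return [[sum(1 for s, _ in sites if s == j) for j in range(num_seqs)]
            for sites in classes]


if __name__ == "__main__":
    toy = ["ACGT", "CGTAA", "GTAAC"]  # toy input, not the figure's sequences
    found = local_decoding_classes(toy, 3)
    for sites, row in zip(found, class_count_table(found, len(toy))):
        print(sites, row)
```

In this toy input with N = 3, the G at site (0, 2) and the G at site (2, 0) share no 3-word directly, yet they land in the same class because both are directly related to site (1, 1). This mirrors the way seq3,5 joins the other sites through seq2,5 in Figure 6a.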

updated on: 12 Aug 2009
