
In general, the construction of trees is based on sequence alignments. This …



Figures
- Comparing sequences without using alignments: application to HIV/SIV subtyping

Figure 1 The neighbor-joining tree obtained from 70 HIV/SIV nucleotide sequences (distance matrix calculated using the N-local decoding method with N = 15). Sequence names are written as the GenBank accession number followed by the nomenclature name [9–11, 14, 15]. These sequences can be retrieved from the Los Alamos HIV sequence database [24]. Bootstrap values (≥ 90%) are indicated.


Figure 2 The neighbor-joining tree obtained from the non-coding parts of 43 HIV/SIV LTR nucleotide sequences (distance matrix calculated for N = 11). M15390 corresponds to the HIV-2-A ROD isolate, as X05291 does in Figure 1. Sequence names follow the same rule as in Figure 1. Bootstrap values (≥ 50%) are indicated.


Figure 3 The neighbor-joining tree calculated from the multiple alignment of the same 43 sequences as in Figure 2, produced by the CLUSTAL-W program. Sequence names follow the same rule as in Figure 1.


Figure 4 The neighbor-joining tree calculated from the multiple alignment of the same 43 sequences as in Figure 2, produced by the DIALIGN-2 program. Sequence names follow the same rule as in Figure 1.


  Figure 5  


Figure 6 The «local decoding of order N» computing strategy.
Figure 6a (top): Four 5-related sites (N = 5), each containing the letter T, are taken from three input nucleotide sequences seq1, seq2 and seq3. Each of the four boxed segments (2N - 1 = 9 letters long) has T at its center (in bold) and is identified by the sequence in which it lies and the position of T in that sequence (seq1,11; seq2,5; seq3,5; and seq3,12; see Figure 6a bottom).
Figure 6a (bottom): Each 9-letter segment (identified by its site, with T in bold) is displayed with its set of overlapping words (step 1) of length N = 5 underneath the corresponding boxed site. The four sites are 5-related: seq1,11, seq2,5 and seq3,12 are directly 5-related by TGGAC (in bold) at position 1; seq1,11 and seq3,12 are also directly 5-related by CTGGA at position 2; seq3,5 is directly 5-related only to seq2,5, by CACTT at position 5, so it is connected to the other two sites through seq2,5.
Figure 6b: The symbols identifying each class that contains at least two sites are shown together with the segments covered by the overlapping 5-words that lie over the boxed letter.
Figure 6c: The re-written sequences generated by the program. Classes containing only one site are represented simply by the corresponding letter of the input sequence, because they cannot contribute to the similarities between pairwise compared re-written sequences.
Figure 6d: The double-entry table used to construct a pairwise distance matrix between the three sequences (re-written in Figure 6c). Each class identifier with at least two sites is given its own row. For each row and for each of the three sequences that label the columns, the table gives the number of sites of this N-class that appear in the sequence.
Figure 6e: The similarity matrix and the corresponding normalized dissimilarity matrix (see text) for the three sequences.
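
The grouping of sites described in the Figure 6 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: it assumes that two sites are directly N-related when the same word of length N covers both of them at the same offset, and that N-classes are the connected components of that relation, computed here with a union-find structure. The function names `local_decoding_classes` and `class_count_table` and the toy sequences are hypothetical, and the similarity and dissimilarity formulas of Figure 6e are not reproduced because the caption does not give them.

```python
from collections import defaultdict


def local_decoding_classes(seqs, n):
    """Group sites (seq_index, position) into N-classes.

    Assumption: two sites are directly N-related when an identical word of
    length n covers both at the same offset; classes are the transitive
    closure of that relation. Only classes with at least two sites are
    returned, as in Figure 6b.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Record every occurrence (sequence index, start) of every n-word.
    occurrences = defaultdict(list)
    for i, s in enumerate(seqs):
        for start in range(len(s) - n + 1):
            occurrences[s[start:start + n]].append((i, start))

    # Occurrences of the same word link their sites offset by offset.
    for occ in occurrences.values():
        i0, s0 = occ[0]
        for i, s in occ[1:]:
            for k in range(n):
                union((i0, s0 + k), (i, s + k))

    # Collect the sites of each class, in order of first appearance.
    classes = defaultdict(list)
    for i, s in enumerate(seqs):
        for p in range(len(s)):
            classes[find((i, p))].append((i, p))
    return [sorted(c) for c in classes.values() if len(c) >= 2]


def class_count_table(seqs, classes):
    """One row per class, one column per sequence (the layout of Figure 6d):
    the number of sites of the class that fall in each sequence."""
    return [
        [sum(1 for (i, _) in c if i == j) for j in range(len(seqs))]
        for c in classes
    ]


if __name__ == "__main__":
    # Hypothetical toy input, not the sequences shown in Figure 6.
    toy = ["ACGTA", "CGTAC"]
    found = local_decoding_classes(toy, 3)
    for cls, row in zip(found, class_count_table(toy, found)):
        print(cls, row)
```

Positions in this sketch are 0-based, whereas the caption counts positions from 1. Classes with a single site are dropped because, as the caption notes for Figure 6c, they cannot contribute to the similarity between two sequences.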


 


Updated on: 12 Aug 2009
