
The MPI Bioinformatics Toolkit is an interactive web service which offers access …


Tool sections
- The MPI Bioinformatics Toolkit for protein sequence analysis

The search section contains popular search tools, such as NucleotideBLAST, ProteinBLAST (11), PSI-BLAST (12) and HMMER (13), as well as our in-house developments such as HHpred, HHsenser and PatternSearch. In comparison with the NCBI server, our BLAST tools offer greater flexibility and functionality: searches can be run against uploaded personal databases or selectable sets of genomes (updated weekly from NCBI and ENSEMBL), databases can be switched between PSI-BLAST runs, alignments can be extracted, viewed online or forwarded to other tools, and two graphs show matched regions and E-value distributions. The fastHMMER tool performs HMMER searches of all standard sequence databases in ~10% of the time by first reducing the database with one iteration of PSI-BLAST at a cut-off E-value of 10 000. PatternSearch identifies sequences containing a user-defined Prosite pattern or regular expression. HHpred is a new server for protein structure and function prediction (5). It takes a query sequence as input and searches user-selected databases for homologs with a new and very sensitive method based on pairwise comparison of hidden Markov models (HMMs). Available databases include, among others, InterPro, CDD and an alignment database that we build from Protein Data Bank (PDB) sequences and that can be used for 3D structure prediction. HHsenser is a transitive search method based on HMM-HMM comparison (7). It takes a sequence as input and builds an alignment with as many near or remote homologs as possible, often covering the whole protein superfamily.
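To illustrate how a Prosite pattern relates to a regular expression, the following is a minimal Python sketch that translates the common Prosite syntax elements (`x` for any residue, `[...]` for allowed residues, `{...}` for forbidden residues, `(n)` or `(n,m)` for repetition, `<` and `>` for terminal anchors) into a regular expression. The function name `prosite_to_regex`, the example pattern and the example sequence are assumptions made for illustration; this is not the PatternSearch implementation, and rarer syntax such as `>` inside square brackets is not handled.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a Prosite-style pattern into a Python regular expression.

    Hypothetical helper for illustration; handles only the common syntax.
    """
    pattern = pattern.strip().rstrip(".")  # Prosite patterns end with a period
    parts = []
    for element in pattern.split("-"):
        anchored_start = element.startswith("<")  # '<' pins the N-terminus
        anchored_end = element.endswith(">")      # '>' pins the C-terminus
        element = element.strip("<>")

        # Separate the residue specification from an optional (n) or (n,m).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()

        if core.lower() == "x":
            regex = "."                        # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"    # any residue except those listed
        else:
            regex = core                       # single residue or [ABC] class

        if low is not None:
            regex += "{" + low + ("," + high if high else "") + "}"

        parts.append(("^" if anchored_start else "") + regex
                     + ("$" if anchored_end else ""))
    return "".join(parts)


# Illustrative pattern and made-up sequence (assumptions, not toolkit data).
example_pattern = "C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H."
example_regex = prosite_to_regex(example_pattern)
example_sequence = "MK" + "C" + "AA" + "C" + "AAA" + "L" + "A" * 8 + "H" + "AAA" + "H" + "GG"
hit = re.search(example_regex, example_sequence)
```

With such a translation, a pattern search over a database reduces to running `re.search` on every sequence and reporting the matching spans.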

The alignment section includes the well-known, popular multiple alignment program ClustalW (14), together with the more recently developed multiple alignment methods ProbCons (15), MUSCLE (16) and MAFFT (17). Also in this section is Blammer (10), which converts BLAST or PSI-BLAST output to a multiple alignment by realigning gapped regions using ClustalW and removing local inconsistencies through comparison with an HMM. HHalign aligns two alignments with each other by pairwise comparison of HMMs and displays similarities in a profile–profile dotplot.

In the sequence analysis section, we have grouped tools for repeat identification and analysis of periodic regions in proteins. HHrep is a server for de novo repeat detection that is very sensitive in finding proteins with strongly diverged repeats, such as TIM barrels and β-propellers (6). REPPER (8) analyzes regions with short gapless repeats in protein sequences. It finds periodicities by Fourier transform and internal sequence similarity. The output is complemented by coiled-coil prediction and secondary structure prediction using PSIPRED (18). Aln2Plot shows a graphical overview of average hydrophobicity and side-chain volume in a multiple alignment.
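The idea of finding periodicities by Fourier transform can be sketched as follows. This is a simplified illustration, not REPPER's actual procedure: it assumes a binary hydrophobic/non-hydrophobic encoding (the residue set `HYDROPHOBIC` is an arbitrary choice made here), evaluates the Fourier power only at a hypothetical grid of candidate periods, and uses a made-up heptad repeat as input.

```python
import cmath

# Assumed set of hydrophobic residues; real tools typically use a
# hydrophobicity scale rather than a binary encoding.
HYDROPHOBIC = set("AILMFVWC")


def periodicity_spectrum(sequence, periods):
    """Return {period: Fourier power} for a binary hydrophobicity signal."""
    signal = [1.0 if aa in HYDROPHOBIC else 0.0 for aa in sequence.upper()]
    mean = sum(signal) / len(signal)
    centered = [value - mean for value in signal]  # remove the constant offset
    spectrum = {}
    for period in periods:
        frequency = 1.0 / period
        coefficient = sum(
            value * cmath.exp(-2j * cmath.pi * frequency * position)
            for position, value in enumerate(centered)
        )
        spectrum[period] = abs(coefficient) ** 2 / len(centered)
    return spectrum


def strongest_period(sequence, periods=None):
    """Return the candidate period with the highest Fourier power."""
    if periods is None:
        periods = [2.0 + 0.5 * i for i in range(17)]  # 2.0, 2.5, ..., 10.0
    spectrum = periodicity_spectrum(sequence, periods)
    return max(spectrum, key=spectrum.get)


# Made-up heptad repeat with hydrophobic residues at positions a and d,
# the spacing typical of coiled coils.
heptad_repeat = "LKELEEK" * 10
best = strongest_period(heptad_repeat)
```

For a heptad pattern with hydrophobic residues at positions a and d, the dominant component lies at 7/2 = 3.5 residues, which is why a strong signal near 3.5 is commonly read as a coiled-coil indicator.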

In the secondary structure section, Quick2D integrates the results of various secondary structure prediction programs, such as PSIPRED (18), JNET (19) and PROFKing (20), the transmembrane predictions of MEMSAT2 (21) and HMMTOP (22) and the disorder prediction of DISOPRED (23) into a single colored view. The AlignmentViewer clusters sequences by a sequence identity criterion, annotates groups of sequences using PSIPRED and MEMSAT2 predictions of a multiple alignment and graphically displays the results in an interactive Java applet.
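Clustering by a sequence identity criterion can be sketched as below. The text does not specify the criterion or algorithm used by the AlignmentViewer, so this is only one plausible reading: a greedy scheme in which each aligned sequence joins the first cluster whose representative it matches at or above a threshold. The function names, the threshold value and the toy alignment are all assumptions.

```python
def pairwise_identity(seq_a, seq_b):
    """Fraction of identical residues over columns where neither row has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(seq_a, seq_b):
        if residue_a == "-" or residue_b == "-":
            continue  # skip gapped columns
        compared += 1
        if residue_a == residue_b:
            matches += 1
    return matches / compared if compared else 0.0


def cluster_by_identity(alignment, threshold=0.9):
    """Greedy clustering; the first member of each cluster is its representative."""
    clusters = []
    for name, sequence in alignment.items():
        for cluster in clusters:
            representative = alignment[cluster[0]]
            if pairwise_identity(sequence, representative) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])  # no cluster was close enough
    return clusters


# Toy alignment (made-up sequences of equal aligned length).
toy_alignment = {
    "seq_a": "MKTAYIAK",
    "seq_b": "MKTAYIAR",
    "seq_c": "MQ-LLVGG",
}
groups = cluster_by_identity(toy_alignment, threshold=0.8)
```

A greedy scheme depends on input order; a hierarchical method would avoid that at higher cost.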

The tertiary structure section contains Modeller (24) and HHpred (5). Modeller is a very popular program for comparative modeling. It generates a 3D structural model from a sequence alignment of a protein sequence with one or more structural templates. In contrast to the standalone version of Modeller, the input format does not need to be PIR but can also be FASTA or most other standard multiple alignment formats. Modeller is tightly integrated with HHpred, allowing selected hits from HHpred results to be used as templates for subsequent comparative modeling. On the results page, models can be evaluated using a browser-embedded 3D viewer, and charts with output from several model quality assessment programs are provided. This allows fast interactive refinement cycles of the underlying multiple sequence alignment. The page also provides a link to the iMolTalk server, which offers several additional tools for the detailed analysis of structures and models (25,26).

In the classification section, we offer modules of the widely used phylogenetic analysis suite PHYLIP (27), the ANCESCON package (28) for distance-based phylogenetic analysis and CLANS (9). CLANS clusters user-provided sequences based on BLAST pairwise similarities (29). The results can be analysed with a CLANS Java applet or can be exported to CLANS format.

Finally, the utilities section contains a collection of tools that help with simple tasks users frequently face. It includes a sequence reformatting utility, a six-frame translation tool for nucleotide sequences, Extract_gis for the extraction of gi-numbers from BLAST files, the RetrieveSeq tool for identifier-based sequence retrieval from the non-redundant protein or nucleotide databases at NCBI, gi2Promotor for the extraction of nucleotide sequences upstream of genes identified by the gi-numbers of their encoded proteins and a backtranslation tool.
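As an illustration of what a six-frame translation computes, the sketch below translates a DNA sequence in the three forward frames and the three frames of its reverse complement, using the standard genetic code. It is a minimal example with hypothetical function names, not the toolkit's implementation; it assumes unambiguous A, C, G, T input and marks any other codon as `X`.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from the first base; '*' marks a stop codon."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # 'X' for codons outside the table
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return {frame: protein} for frames +1, +2, +3, -1, -2, -3."""
    frames = {}
    reverse = reverse_complement(dna)
    for offset in range(3):
        frames["+%d" % (offset + 1)] = translate(dna[offset:])
        frames["-%d" % (offset + 1)] = translate(reverse[offset:])
    return frames


# Made-up nine-base example sequence.
frames = six_frame_translation("ATGGCCTAA")
```

Trailing bases that do not form a complete codon are ignored in each frame.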

updated on: 28 Oct 2008
