Genome annotation and curation
For each gene in the PBRC database, we provide an automated, computer-driven annotation that gathers as much basic descriptive information as possible about the gene, a basic analysis of its nucleotide or amino acid sequence, and the results of sequence similarity searches that look for common patterns or features that might be characteristic of its function. The annotation process starts with the GenBank record and includes the descriptive information, literature references and any other information provided in that record. This information populates the initial descriptive fields of our database. Following this automated annotation process, a manual, human-directed curation of each gene record is undertaken. During this curation process, a researcher reviews the annotation record, all available literature references and any available unpublished information. This collection of empirically derived properties for the protein in question provides what might be considered a mini-review of the biology of the gene being studied. The broad types of information provided during the curation process include protein properties such as molecular weight and pI; post-translational processing; the availability of custom reagents such as clones, antibodies and mutants; functional descriptions [including Gene Ontology designations (16)]; and literature summaries. Evidence codes are provided that explicitly state the nature and source of each piece of information, along with the appropriate literature references. A series of web forms assists in this process by providing a distinct set of informational fields to be filled in and by enforcing the use of a controlled vocabulary to fully describe each gene. The results of the curation process are stored in our SQL Server database and form a Poxvirus Knowledge Database that is available and searchable from the PBRC website.
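To make the curation constraints concrete, the following is a minimal sketch of how a curated record might pair each piece of information with an evidence code and a literature reference while rejecting terms outside a controlled vocabulary. It is not the PBRC implementation: the class names, field names, gene identifier and vocabulary lists are hypothetical, and the evidence codes shown are a small subset of the Gene Ontology evidence codes, used here only as an assumed example.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: a small subset of Gene Ontology evidence codes stands in for
# whatever evidence codes the curation forms actually use.
EVIDENCE_CODES = {"IDA", "IMP", "ISS", "IEA", "TAS"}

# Hypothetical controlled vocabulary; the field name and terms are illustrative.
CONTROLLED_TERMS = {
    "reagent_type": {"clone", "antibody", "mutant"},
}


@dataclass(frozen=True)
class Annotation:
    """One curated statement plus the evidence that supports it."""
    field_name: str
    value: str
    evidence_code: str
    reference: str  # citation identifier or a note such as "unpublished"


@dataclass
class GeneRecord:
    """A curated gene record holding evidence-tagged annotations."""
    gene_id: str
    annotations: List[Annotation] = field(default_factory=list)

    def add(self, field_name: str, value: str,
            evidence_code: str, reference: str) -> Annotation:
        # Reject evidence codes that are not in the allowed set.
        if evidence_code not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {evidence_code!r}")
        # Enforce the controlled vocabulary for fields that define one;
        # free-text fields (no vocabulary entry) are accepted as-is.
        allowed = CONTROLLED_TERMS.get(field_name)
        if allowed is not None and value not in allowed:
            raise ValueError(
                f"{value!r} is not an allowed term for {field_name!r}")
        annotation = Annotation(field_name, value, evidence_code, reference)
        self.annotations.append(annotation)
        return annotation


# Usage with a hypothetical gene identifier and reference label.
record = GeneRecord("GENE-0001")
record.add("reagent_type", "antibody", "IDA", "ref-1")
```

Validating at the point of entry mirrors the role the web forms play in the text: a curator cannot save a value that falls outside the vocabulary, so every stored record can be searched with the same set of terms.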

