An information management system for experimental data must record data provenance and experimental conditions, maintain data integrity as various numerical transformations are performed, describe data in terms of a standardized terminology, promote data reuse and facilitate data sharing. The most common way to meet these requirements is via a relational database management system (RDBMS; see SBEAMS, http://www.sbeams.org, or Bioinformatics Resource Manager for relevant examples; Shah et al., 2006). Databases in biology resemble those previously developed for business and have proven spectacularly successful in managing data on DNA and protein sequences. In a relational database, the subdivision of information and its subsequent storage into cross-indexed tables follows a precise, predefined schema. The granularity and stability of the schema allow an RDBMS to identify and maintain links between disparate pieces of information, even in the face of frequent read–write operations. However, this power comes at a considerable cost in terms of inflexibility. It is difficult for a relational database to accommodate frequent changes in the formats of data or metadata, and to incorporate unstructured information.
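The trade-off between integrity and flexibility can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The two-table schema, the column names and the sample values (HeLa, DMEM, the numerical readings) are hypothetical and chosen only for illustration; they are not taken from SBEAMS or any other system cited here.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a small, fixed (hypothetical) schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE sample (
            sample_id INTEGER PRIMARY KEY,
            cell_type TEXT NOT NULL
        );
        CREATE TABLE measurement (
            measurement_id INTEGER PRIMARY KEY,
            sample_id INTEGER NOT NULL REFERENCES sample(sample_id),
            assay TEXT NOT NULL,
            value REAL NOT NULL
        );
        """
    )
    return conn


def try_insert(conn, sql, params):
    """Run one INSERT; return 'ok' or the name of the sqlite3 error raised."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.Error as exc:
        return type(exc).__name__


conn = build_db()
conn.execute("INSERT INTO sample (sample_id, cell_type) VALUES (1, 'HeLa')")

# A measurement linked to an existing sample is accepted.
linked = try_insert(
    conn,
    "INSERT INTO measurement (sample_id, assay, value) VALUES (?, ?, ?)",
    (1, "ERK activity", 0.82),
)

# A measurement pointing at a non-existent sample is rejected:
# the schema protects the link between the two tables.
orphan = try_insert(
    conn,
    "INSERT INTO measurement (sample_id, assay, value) VALUES (?, ?, ?)",
    (99, "ERK activity", 0.50),
)

# A new kind of metadata (growth medium) cannot be stored until the
# schema itself is altered: the cost of that same rigidity.
new_field = try_insert(
    conn,
    "INSERT INTO measurement (sample_id, assay, value, growth_medium) "
    "VALUES (?, ?, ?, ?)",
    (1, "ERK activity", 0.70, "DMEM"),
)

print(linked, orphan, new_field)
```

In this sketch the same predefined schema that rejects the orphaned measurement also rejects the previously unseen metadata field, which is the inflexibility described above in miniature.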
Whereas the sequence of a human gene represents valuable information independent of how sequencing was performed or of the individual from whom the DNA was obtained (a statement that remains true despite the value of characterizing sequence variations), such is not the case for measures of protein activity or cellular state. Such biochemical and physiological data are highly context dependent. Data on ERK kinase activity, for example, are uninformative in the absence of information on cell type, growth conditions, etc. Moreover, a wide range of techniques are used to make biochemical and physiological measurements, and both the assays and the data they generate change over time, as new methods are developed (e.g. in imaging; see Swedlow et al., 2003). Context dependence and rapidly changing data formats pose fundamental problems for databases because RDBMS schemas are not easily modified.
Moreover, even if effective metadata standards are developed to describe the context dependence of experimental findings, data from different experiments cannot be reconciled simply by storing them in a single database. Subtle distinctions must be made about different types of data, and biological insight must be brought to bear. Currently this is performed implicitly in the minds of individual investigators, but we envision a future in which the unique ability of mathematical models to formalize hypotheses and manage contingent information makes them the primary repositories of biological knowledge. As we work towards a model-centric future, it is our contention that information systems based solely on relational databases are unnecessarily limiting; rarely do we modify a difficult experiment simply to conform to a pre-existing database schema (whereas conformity to uniform, even arbitrary, standards is a strength for a business database). New approaches to data management that reconcile competing requirements for flexibility and structure are required.