The possibilities and proposals for computational processing of genome
data discussed here may appear mind-boggling at this stage, but
ultimately they will empower scientists to swiftly interpret their own
experimental results within the context of other published research
findings in a more interactive and collaborative way. These advances
underscore the need for important biological information such as genome
sequences and microarray data sets to be made freely available and the
literature describing the data interpretation to be available through
Open Access platforms such as PLoS ONE. Because PLoS ONE publishes
research in Extensible Markup Language (XML), experimental results and
their interpretations can be exchanged quickly across different
platforms. This in turn simplifies the use and processing of genomic
information contained in research publications, so that details such
as newly deciphered pathways or evolutionary relationships can be
discussed globally and interpreted through community genomics
environments.
To this end, the ‘PLoS
ONE prokaryotic genomes collection’ represents a novel initiative to
compile a permanent archive of all important articles describing the
whole-genome-sequence-based biology of prokaryotic organisms. This
collection of articles will facilitate understanding of the biology
and lifestyle of the underlying organisms not only through the main
content of the articles
but also via information from external sources that discuss and
link to the results, such as citations from PubMed Central, Google
Scholar and Scopus; evaluations and ratings at Faculty of 1000;
bookmarks from social networking sites such as CiteULike and Connotea;
and blog posts from experts and readers in the field. As with other
PLoS content, users (human or machine) will be able to work with
individual articles interactively, harnessing elements of research
(annotation tables, phylogenetic trees, evolutionary hierarchies, gene
expression data, graphs, text, etc.) and associated content such as
relevant discussions (and raw data posted in response to a
discussion). This content can be processed in a
variety of computational formats such as graphs or networks that can be
inspected visually, curated manually or mined computationally. Linking
this secondary content and Science 2.0-based enhancements to the
published information, and then harnessing them through different
knowledge platforms, is therefore likely to underpin the formation of
new ideas and insights in a more holistic and interdisciplinary manner.
Such novel theses, in the form of alternative or even more provocative
interpretations, could ultimately be linked back to the original genome
sequences, thus completing a cycle of information sharing through Open
Access.
Acknowledgments
I am thankful to Professor Seyed E. Hasnain for his guidance and support.