The Perl scripts presented here are designed to visualize, in silico, the
features of nucleotide strings. In the present endeavour, a few exercises are
conceived to unravel the inherent features of DNA using Perl scripts. Perl can
be downloaded from the internet and installed on a UNIX (Kubuntu) platform. In
the present booklet, the exercises begin with elementary arithmetic
calculations and are then extended to the DNA string. Incidentally, the DNA
string is one of the best biological materials on which to apply Perl, since
the vast array of information it carries has been deciphered over the years
through various biochemical tools. The language of DNA is written with four
nucleotides; a unique, species-specific combination of them constitutes genomic
DNA, which is the source of proteins, including disease-causing malformed
proteins. Hence, the central dogma, i.e., transcription of genes and
translation of their transcripts, is the point at which computer languages
percolate into biological systems. Moreover, the number of nucleotides in the
genome of any species is very large, and the variations (polymorphisms) within
the genes, together with the footprints of the transcription factors and the
enzymatic machinery, all tend to increase the complexity of the function of
genomic DNA in vivo.
In such a scenario, a programming language such as Perl has come to the
rescue of biologists to unravel the mysteries of creation. The present booklet
deals with a few exercises, viz., in silico evaluation of DNA properties such
as the complementary strand, transcription into RNA, identification of start
and stop signals, finding the percent GC and the length of each strand,
concatenating two strings, joining two strings, chopping the terminals of RNA,
and translation of the DNA genetic code into a protein string using the syntax
of Perl.

Perl is a scripting language, developed by Larry Wall in
1987, who designed it for the UNIX environment. Perl is often expanded as
Practical Extraction and Report Language, although this is a backronym coined
after the name was chosen. In the jargon of computer science, scripting
languages are often called interpreted languages. Perl, however, is both
compiled and interpreted: each time a script is run, it is first compiled to an
internal form and then executed, so a script can be modified and rerun
immediately without a separate compilation step. Perl is a widely available and
powerful language that supports structured programming, advanced data
structures and object-oriented programming.
A Unix (Kubuntu) system is one of the best choices for writing Perl scripts.
The first line of the script is called the ‘shebang’; it begins with a hash and
an exclamation mark (#!/usr/bin/perl -w). Elsewhere, the symbol ‘#’ at the
start of a line indicates that the line is a comment. Names prefixed with $ and
@ denote scalar variables and array variables respectively. Two strings of
random nucleotide sequences are written in a text file as shown in the
Appendix. The executable script files, i.e., the individual exercises, are
named ‘filename.pl’.
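
The layout described above can be sketched as a minimal script. The variable names and the nucleotide strings are made up for illustration; they are not the ones used in the Appendix.

```perl
#!/usr/bin/perl -w
# A line that starts with '#' is a comment; the first line is the shebang,
# and its -w switch turns on warnings.
use strict;

# '$' marks a scalar variable: here, a single (made-up) nucleotide string.
my $dna = 'ATGCGTAC';

# '@' marks an array variable: here, a list of (made-up) nucleotide strings.
my @strands = ('ATGCGTAC', 'GGCTTAAC');

# An array evaluated in scalar context gives its number of elements.
my $count = scalar @strands;

print "Scalar holds: $dna\n";
print "Array holds $count strings: @strands\n";
```

Saved as, say, ‘sketch.pl’ (a hypothetical name), the script can be run with `perl sketch.pl`.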
The strings of nucleotides are retrieved from the text file into scalar and
array variables; the ‘chomp’ function then removes the trailing newlines and
the ‘join’ function turns the array of strings into a single string.
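
A minimal sketch of this retrieval step follows. To keep it self-contained, it reads from an in-memory string that stands in for the sequence file; the two sequences are made up, and with a real file the file name would be passed to `open` instead.

```perl
#!/usr/bin/perl -w
use strict;

# Stand-in for the contents of a sequence text file (made-up sequences).
my $file_text = "ATGCGTACGTT\nTGACCGTTAA\n";

# Open a read handle on the string; for a real file, give its name here.
open(my $fh, '<', \$file_text) or die "Cannot open sequence data: $!";

my $first = <$fh>;    # scalar context: reads one line
my @rest  = <$fh>;    # list context: reads all remaining lines
close($fh);

chomp($first);        # remove the trailing newline from the scalar
chomp(@rest);         # ... and from every element of the array

# Join the pieces into one continuous string, with nothing between them.
my $joined = join('', $first, @rest);
print "$joined\n";
```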
The exercises conceived here are designed to work with a single string and
also with an array of strings, which represent scalar data and list data
respectively.
The simplest example chosen here is two strings of nucleotides written in a
text file, "krupa.seq". A few short steps in Perl are designed to retrieve one
string to begin with, and later both strings, using a scalar (‘$’) and an array
(‘@’) variable respectively. The two strings are stripped of their line endings
and joined into one continuous line using the Perl functions ‘chomp’ and ‘join’
respectively. Considering the joined string as a template, the following
parameters of DNA are derived in silico, viz., length, complementary strand,
transcribed strand, substitution of ‘start’ and ‘stop’ codons, total count of
nucleotides in the transcribed string, GC count, percent GC, and the effect of
the chop function. The last exercise deals with retrieving the nucleotide
sequence of a gene of our choice from the internet using Web sources. These
exercises should give beginners an impetus to practise Perl scripting and to
visualize the features of DNA in silico.
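
The parameters listed above can be sketched in a few lines of Perl. This is only an illustration under stated assumptions, not the booklet's own exercise code: the template string is made up, the complement is written base for base without reversing it, transcription is simplified to replacing T with U in the given strand, and codons are read in frame from the first nucleotide.

```perl
#!/usr/bin/perl -w
use strict;

my $dna = 'ATGCGTACGTTTGACCGTTAA';    # made-up template string

# Length of the string.
my $length = length($dna);

# Complementary strand: swap A<->T and G<->C (not reversed).
(my $complement = $dna) =~ tr/ATGC/TACG/;

# Simplified transcription: replace every T with U.
(my $rna = $dna) =~ tr/T/U/;
my $rna_length = length($rna);

# tr/// with an empty replacement only counts the matching characters.
my $gc         = ($dna =~ tr/GC//);
my $percent_gc = 100 * $gc / $length;

# Split the RNA into triplets and label start and stop codons.
# Every in-frame AUG is labelled 'start' in this simple version.
my @codons = unpack('(A3)*', $rna);
my %stop   = map { $_ => 1 } qw(UAA UAG UGA);
my $marked = join('-',
    map { $_ eq 'AUG' ? 'start' : $stop{$_} ? 'stop' : $_ } @codons);

# chop removes the last character of a string.
my $chopped = $rna;
chop($chopped);

print "Length:      $length\n";
print "Complement:  $complement\n";
print "RNA:         $rna ($rna_length nucleotides)\n";
printf "GC count:    %d (%.1f%%)\n", $gc, $percent_gc;
print "Codons:      $marked\n";
print "After chop:  $chopped\n";
```

Labelling codons triplet by triplet, rather than substituting every occurrence of a stop pattern, avoids marking out-of-frame matches such as the UGA that spans two codons in this example.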