
III. Syntax
- Introductory Workbook on Perl for Biology Students

A scalar variable stores a single (scalar) value. Perl scalar names are prefixed with a dollar sign ($), so $x, $y, $z, $username, and $url are all examples of scalar variable names.


An array variable can store more than one value. Array names are prefixed with an ‘@’ sign, for example @x, @y, @z. The values assigned to an array are written in parentheses and separated by commas, or, if the list is prefixed with “qw”, by spaces alone. You can also specify ranges, such as ("a".."z") or (1..20). Note that a range uses exactly 2 dots.


@y = ("Quarter", "Dime", "Nickel");   # Strings (characters) take quotes; numbers do not.

$y[1]                                 # To retrieve one value of the array, address it by its
                                      # position (counting starts at 0) and place a $, not an @,
                                      # in front, because a single element holds only one value.
                                      # Here the result is "Dime".

@z = qw(value1 value2);               # With qw the values are separated by spaces, without quotes.

@a = (0..10);                         # Ranges can be specified too. Here the range is from 0 to 10.


Hash variable. A Perl hash is similar to an ordinary array, but instead of using integer indexes, a hash uses "keys". A key can be any scalar value, usually a string or a number, and Perl stores it as a string. In place of the @ prefix, hashes (also called associative arrays) use the % symbol, and rather than square brackets [], as in $myarray[0], hash elements are referenced using curly brackets {}, as in $myhash{"george"}.
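As a minimal sketch of the hash notation described above: the hash name %codon and its keys and values are illustrative assumptions, not taken from the workbook's programs.

```perl
use strict;
use warnings;

# A hash is declared with % and maps keys to values.
my %codon = (
    "ATG" => "Met",
    "TGG" => "Trp",
);

# A single element is a scalar, so it is read with $ and curly brackets.
my $amino = $codon{"ATG"};
print "ATG codes for $amino\n";

# Adding a new key/value pair uses the same notation.
$codon{"TTT"} = "Phe";

# keys() returns all keys; sorting them gives a predictable order.
my @codons = sort keys %codon;
print "Known codons: @codons\n";
```

A hash does not keep its keys in insertion order, which is why the keys are sorted before printing.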


A subroutine is a user-defined function that performs a particular task or set of tasks. A subroutine can be placed anywhere in the program body, but it is usually best to put subroutines either at the beginning or at the end of the program.
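A short sketch of defining and calling a subroutine may help here. The subroutine name sequence_length and its argument are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical subroutine: returns the length of a DNA string.
sub sequence_length {
    my ($dna) = @_;          # arguments arrive in the special array @_
    return length($dna);
}

# Call the subroutine by name and keep the value it returns.
my $len = sequence_length("ATGCGT");
print "Length is $len\n";
```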


The syntax of the subroutine is defined as (as seen in line 11 of program 2):

sub <name_of_the_subroutine>{

      <body of the subroutine>

}



Substitution is a method by which you can replace a word with a new word. If the new word is empty, the word is replaced by nothing; in other words, it is deleted. The syntax for substitution is:

s/<word>/<new word>/<options> as in line 13 of program 3

The options are:

g – Global substitution. That is, the old word is replaced with the new word everywhere in the value of the variable.

i – Ignore case. For example, you can substitute the word “bIO” with “bioinfo”, and also any occurrences of the same word in a different letter case, such as “BIO” or “bio”.

If none of the options is specified, the command substitutes only the first occurrence of the word in the variable.
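The effect of the options can be sketched as follows. The variable names and the sample string are assumptions made for this illustration.

```perl
use strict;
use warnings;

my $first = "bio bio BIO";
my $all   = $first;
my $any   = $first;

$first =~ s/bio/bioinfo/;     # no options: only the first match changes
$all   =~ s/bio/bioinfo/g;    # g: every match, but letter case still matters
$any   =~ s/bio/bioinfo/gi;   # g and i: every match, in any letter case

# An empty new word deletes the match.
my $deleted = "biology";
$deleted =~ s/bio//;

print "$first\n$all\n$any\n$deleted\n";
```

The =~ operator binds the substitution to the variable on its left, so that variable is changed in place.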


Translation is a method by which you translate individual characters to new characters, rather than substituting whole words. If you give a word, each letter of the word is translated to the letter in the same position of the new word. If the new word is empty, the letters are left unchanged and tr simply counts them; to delete letters, leave the new word empty and add the d option, as in tr/<letter>//d. The syntax for translation is:

tr/<letter>/<new letter>/ as in lines 6 to 9 of program 3

You can also specify a range, such as tr/a-z/A-Z/, which changes lowercase letters to uppercase. Square brackets are not needed around the ranges.
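To make translation concrete, here is a small sketch on a DNA string. The sequence and variable names are made up for the example and do not come from program 3.

```perl
use strict;
use warnings;

my $dna = "ATGCCGTA";

# Complement: each letter in the first list maps to the letter in the
# same position of the second list.
(my $complement = $dna) =~ tr/ACGT/TGCA/;

# Uppercase conversion using ranges.
(my $upper = "atgc") =~ tr/a-z/A-Z/;

# With an empty second list, tr changes nothing and returns the number
# of matching characters, so it can be used as a counter.
my $gc_count = ($dna =~ tr/GC//);

# The d option deletes characters that have no replacement.
(my $no_a = $dna) =~ tr/A//d;

print "$complement $upper $gc_count $no_a\n";
```

The form (my $copy = $original) =~ tr/.../.../ copies the string first, so the original is left untouched.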


While loop. The while loop performs an operation as long as the condition provided is true; in other words, the loop is executed until the condition turns false. The syntax is:

while (<condition>) {

      <statements>

}
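A minimal while loop might look like the sketch below. The counter $count and the array @seen are hypothetical names used only for illustration.

```perl
use strict;
use warnings;

my $count = 0;
my @seen;

# The block repeats as long as the condition is true.
while ($count < 3) {
    push(@seen, $count);   # remember the value for this pass
    $count++;              # without this line the loop would never end
}

print "Visited: @seen\n";   # prints "Visited: 0 1 2"
```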







Do while loop. To create a loop that executes statements first, and then tests an expression, you need to combine while with a preceding do {} statement. For example:


   do {

            $calc += ($fact*$ivalue);

   } while ($calc < 100);


In this case, the code block is executed first, and the conditional expression is only evaluated at the end of each loop iteration. Here we multiply $fact and $ivalue, add the product to $calc, and save the result in $calc. The condition is then checked at the end: the loop repeats as long as $calc is less than 100. Once $calc is equal to or greater than 100, the loop ends.


Until loop. Its syntax is:

until (<condition>) {

      <statements>

}

The loop keeps executing until the condition is met. Once the condition is met, the loop exits. It is the inverse of the while loop.
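As an illustration of the until loop, here is a short countdown. The names $countdown and @steps are assumptions for this sketch.

```perl
use strict;
use warnings;

my $countdown = 3;
my @steps;

# The block runs while the condition is false and stops once it is true.
until ($countdown == 0) {
    push(@steps, $countdown);
    $countdown--;
}

print "Steps: @steps\n";   # prints "Steps: 3 2 1"
```

The same loop could be written as while ($countdown != 0), which is why until is described as the inverse of while.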


Do until. Here the statements are first executed and then the condition is checked. The syntax is:

do {

      <statements>

} until (<condition>);

For example:

do {

  $calc += ($fact*$ivalue);

} until ($calc >= 100);


Here we multiply $fact and $ivalue, add the product to $calc, and save the result in $calc. The until condition then checks whether $calc is greater than or equal to 100. If not, the loop is executed again; it terminates only when the condition is satisfied.
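The do loops above use $fact, $ivalue and $calc without showing their starting values. The sketch below picks values (5, 4 and 0) purely as assumptions so that the loop can be run, and it also shows that a do block always runs at least once.

```perl
use strict;
use warnings;

my ($fact, $ivalue) = (5, 4);   # assumed values; each pass adds 5*4 = 20

my $calc   = 0;
my $passes = 0;
do {
    $calc += ($fact * $ivalue);
    $passes++;
} until ($calc >= 100);
print "Reached $calc after $passes passes\n";

# The condition is tested only after the block, so the block runs once
# even when the condition is already satisfied at the start.
my $late = 500;
do {
    $late += ($fact * $ivalue);
} until ($late >= 100);
print "Started at 500, ended at $late\n";
```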


For loop. Its syntax is:


for(<initialization>; <condition>; <increment>){

      <statements>

}

For example:

for($i=0; $i<10; $i++){

      print "the value of i is $i\n";

}



In the above example we initialize the value of $i to 0 and then print the value of $i as long as the condition holds, that is, while the value of $i is less than 10. Each time the loop is executed the value of $i is incremented by 1 by the $i++ element. If you want to increment it by 2, write $i += 2 instead. $i++ is a shorter way of writing $i = $i + 1.
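A for loop that steps by 2 could be sketched like this. The array @evens is a hypothetical name used to collect the values.

```perl
use strict;
use warnings;

my @evens;

# $i += 2 adds 2 on every pass, so only even values are visited.
for (my $i = 0; $i < 10; $i += 2) {
    push(@evens, $i);
}

print "@evens\n";   # prints "0 2 4 6 8"
```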


Foreach loop. The foreach loop is used with arrays so that each value of the array undergoes the desired operation, one at a time, through an iterator variable. The syntax is:

foreach <iterator> (<array>) {

      <statements>

}
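A minimal foreach sketch follows, with $base as the iterator. The array contents and names are illustrative assumptions.

```perl
use strict;
use warnings;

my @bases = ("A", "C", "G", "T");
my @labels;

# $base takes the value of each array element in turn.
foreach my $base (@bases) {
    push(@labels, "base_$base");
}

print "@labels\n";   # prints "base_A base_C base_G base_T"
```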






Pop function. The pop function removes the last value of an array and returns it. The syntax is:

pop(@array);



Push function. The push function adds a value to the last position of an array. The syntax is:

push(@array, 'value');


Sort function. The sort function returns the values of the array sorted alphabetically; the array itself is not changed. The syntax is:

@sorted = sort(@array);
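The pop, push and sort functions described above can be tried together in one short sketch. The array contents are illustrative, and the numeric sort at the end is an addition that the text above does not cover.

```perl
use strict;
use warnings;

my @coins = ("Quarter", "Dime", "Nickel");

my $last = pop(@coins);        # removes and returns "Nickel"
push(@coins, "Penny");         # @coins is now Quarter, Dime, Penny

# sort returns a new sorted list; @coins itself keeps its order.
my @sorted = sort(@coins);     # Dime, Penny, Quarter

# Numbers need a numeric comparison, otherwise 10 sorts before 9.
my @numbers = sort { $a <=> $b } (10, 9, 100);

print "$last | @coins | @sorted | @numbers\n";
```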



Open. The open function uses a filehandle to open a file for reading or writing. Every open should be paired with a later “close”. The syntax is:


open(<filehandle>, <name of the file>);

<statements that read from or write to the filehandle>

close <filehandle>;

The example of the command is shown in program 8.
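Independently of program 8, a self-contained sketch of writing and then reading a file is given below. It assumes the three-argument form of open with lexical filehandles ($out, $in), and uses the core File::Temp module to create a scratch file; all names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file so the example does not depend on an existing one.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
close($tmp_fh);

# Open for writing (">"), write two lines, then close.
open(my $out, ">", $filename) or die "Cannot open $filename: $!";
print $out "ATGC\n";
print $out "GGCC\n";
close($out);

# Open for reading ("<") and collect each line without its newline.
open(my $in, "<", $filename) or die "Cannot open $filename: $!";
my @lines;
while (my $line = <$in>) {
    chomp($line);
    push(@lines, $line);
}
close($in);

print "Read: @lines\n";
```

Checking the result of open with "or die" reports a missing or unreadable file immediately instead of failing silently later.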


Chomp. It removes the end character of a specified string ONLY if that character is a RETURN (Enter). The return character usually comes from input, such as a line typed by the user or read from a file, or from the code itself. Either way, to remove that character, chomp is the command to use. It will not affect any other characters.

Note: The RETURN character is the same as the ENTER character, which is also known as a NEWLINE character. It is symbolized as \n. The syntax is:


chomp(<variable>); example:- chomp($dna); or chomp(@dna);


Chop. The next command is chop. It is very similar to chomp, but it takes off the ending character of a string no matter what that character is.


chop(<variable>) example:- chop($dna); or chop(@dna);
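The difference between chomp and chop shows up when the string has no trailing newline, as in this sketch. The variable names and strings are made up for the example.

```perl
use strict;
use warnings;

my $with_newline    = "ATGC\n";
my $without_newline = "ATGC";

chomp($with_newline);      # newline removed: "ATGC"
chomp($without_newline);   # nothing to remove: still "ATGC"

my $chopped = "ATGC";
chop($chopped);            # last character removed regardless: "ATG"

print "$with_newline $without_newline $chopped\n";
```

For cleaning up input lines chomp is usually the safer choice, because chop can silently remove a meaningful character.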


Join. It takes two arguments: a scalar value to use as a separator (a plain string, not a regular expression) and a list. It returns a string that contains the elements of the list separated by the given separator. The syntax is:

join(EXPR, LIST);

This LIST could be a set of scalar variables or just an array:

join(EXPR, $a,$b,$c); or join(EXPR, @array);

example:-@array=("one", "two");

 print join("_", @array);

result:     one_two


Split. It breaks up a string according to a separator. This is useful for tab-separated, colon-separated, whitespace-separated, or anything-separated data. If you can specify the separator with a regular expression, you can use split. The syntax is:


@array_variable = split(/separator/, string);

Example:-my $str = "one:two:three:";

@fields = split(/:/, $str);

print "@fields";     (the quotes make print put a space between the values)

result:- one two three


Example2:- my $str = "one two three";

@fields = split(/ /, $str);     that is, each of the three words has become a value in the array, so the array now has three values and they are no longer a sentence.

print "@fields";

result:- one two three
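Split and join are often used as a pair, for example to turn a tab-separated line into a comma-separated one. In the sketch below the record and its values are invented for illustration only.

```perl
use strict;
use warnings;

# Hypothetical tab-separated record: name, organism, length.
my $record = "geneX\tHomo sapiens\t1200";

# Split on the tab character; spaces inside a field are left alone.
my @fields = split(/\t/, $record);
my $count  = scalar(@fields);

# Join the same fields again with a different separator.
my $csv = join(",", @fields);

print "$count fields: $csv\n";
```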



If elsif else. The syntax is:

if (boolean expression) {

    statements;

}



If the boolean expression evaluates to true, the statements in the curly brackets will be executed. The curly brackets are mandatory. The boolean expression can contain any of the comparison operators covered in the next section.

Multiple conditions can be checked together using the boolean expression operators:

  • && - logical and, C style; used for most conditionals
  • and - logical and, but with a lower precedence; used for flow control
  • || - logical or, C style; used for most conditionals
  • or - logical or, but with a lower precedence; used for flow control
  • ! - logical not, C style
  • not - logical not, but with a lower precedence


if ( ($x == 20) || ( ($x > 0) && ($x < 10) && ($x != 5) ) ){

    print "x is either equal to 20, or between 0 and 10 but not 5.\n";

}


Conditional statements can also be extended with the elsif and else structures:

if (boolean expression 1) {

    statement 1;

}

elsif (boolean expression 2) {

    statement 2;

}

else {

    statement 3;

}


Note that an if statement is followed by any number (including zero) of elsif statements, and finally an optional else statement. The statements of an elsif will be executed if its boolean expression is true, and no preceding (els)if statement's boolean expression is true. The trailing else (if present) is executed if none of the preceding statements' boolean expressions are true.
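The order in which the branches are tested can be seen in a small sketch. The subroutine classify and its return strings are hypothetical, written only to exercise each branch.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a label depending on which branch runs.
sub classify {
    my ($x) = @_;
    if ($x == 20) {
        return "twenty";
    }
    elsif ($x > 0 && $x < 10 && $x != 5) {
        return "between 0 and 10, not 5";
    }
    else {
        return "something else";
    }
}

# Each call ends up in a different branch.
print classify(20), "\n";
print classify(7), "\n";
print classify(5), "\n";
```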


Unless. The syntax is:

unless (Boolean expression) {

    statements;

}

Unless the Boolean expression is true, the statements within the curly brackets are executed; in other words, they run only when the expression is false. “unless” is a conditional statement, not a loop.
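A typical use of unless is to handle the case where a check fails. The subroutine check_sequence and the pattern it tests are assumptions made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical check: a DNA string should contain only A, C, G and T.
sub check_sequence {
    my ($dna) = @_;
    # The block runs only when the match fails.
    unless ($dna =~ /^[ACGT]+$/) {
        return "unexpected characters";
    }
    return "ok";
}

print check_sequence("ATGC"), "\n";
print check_sequence("ATGX"), "\n";
```

The same test could be written as if (!($dna =~ /^[ACGT]+$/)); unless simply reads more naturally when the interesting case is the failure.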


Substr. It returns a substring of EXPR, starting at OFFSET within the string; the first character is at offset 0. If OFFSET is negative, it starts that many characters from the end of the string. If LEN is specified, it returns that number of characters; if not, it returns all characters up to the end of the string. If LEN is negative, it leaves that many characters off the end of the string.

If REPLACEMENT is specified, it replaces the substring with the REPLACEMENT string.

If you specify a substring that passes beyond the end of the string, it returns only the valid part of the original string.

The syntax is:

substr(EXPR, OFFSET, LEN, REPLACEMENT);

where LEN and REPLACEMENT are optional.






$temp = substr("okay", 2);

print "Substring value is $temp\n";


$temp = substr("okay", 1,2);

print "Substring value is $temp\n";


result:- Substring value is ay

Substring value is ka
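The negative OFFSET, negative LEN and REPLACEMENT forms described above are not covered by the example, so here is a brief sketch. The DNA string and variable names are illustrative assumptions.

```perl
use strict;
use warnings;

my $dna = "ATGCCGTA";

my $tail   = substr($dna, -3);      # last three characters: "GTA"
my $middle = substr($dna, 2, 4);    # four characters from offset 2: "GCCG"
my $trim   = substr($dna, 1, -1);   # without first and last character: "TGCCGT"

# Four-argument form: replace two characters at offset 0 in a copy.
my $edited = $dna;
substr($edited, 0, 2, "NN");        # $edited is now "NNGCCGTA"

print "$tail $middle $trim $edited\n";
```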

updated on: 30 Jan 2009