Perl basics

So far, we have been looking at shell programming for performing fairly simple tasks. Now let's extend the idea of shell programming to cover more complex tasks like systems programming and network communications. Perl is a language which was designed to retain the immediateness of shell languages, but at the same time capture some of the flexibility of C. Perl is an acronym for Practical Extraction and Report Language. In this chapter, we shall not aim to teach Perl from scratch; the best way to learn it is to use it! Rather, we shall concentrate on demonstrating some principles.

Sed, awk, cut and paste

One of the reasons for using Perl is that it is extremely good at text file handling, one of the most important things for UNIX users, and particularly useful in connection with CGI script processing on the World Wide Web. It has simple built-in constructs for searching and replacing text, storing information in arrays and retrieving it in sorted form. All of these things have previously been possible using the UNIX shell commands

             sed

             awk

             cut

             paste

But these commands were designed to work primarily in the Bourne shell and are a bit `awk'ward to use for all but the simplest applications.

         `sed'                    is a stream editor. It takes command line instructions, reads input from the stream stdin and produces output on stdout according to those instructions. `sed' works line by line from the start of a text file.

         `awk'                    is a pattern matching and processing language. It takes a text file and reads it line by line, matching regular expressions and acting on them. `awk' is powerful enough to have conditional instructions like `if..then..else' and uses C's `printf' construction for output.

         `cut'                    takes a line of input and cuts it into fields, separated by some character. For instance, a normal line of text is a string of words separated by spaces. Each word is a different field. `cut' can be used, for instance, to pick out the third column in a table. Any character can be specified as the separator.

         `paste'                  is the logical opposite of `cut'. It concatenates files side by side, so that each file becomes a column of a table. For instance, `paste one two three' would make a table in which the first column consisted of all lines in `one', the second of all lines in `two' and the third of all lines in `three'. If one file is longer than the others, then some columns have blank spaces.
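To give a feeling for how such a task looks in Perl, here is a minimal sketch of what `cut' does when it picks out the third column of a table. The sample table, the colon separator and the variable names are invented for illustration; the constructs used (`split', `foreach') are explained later in this chapter.

```perl
use strict;
use warnings;

# Hypothetical sample table: three lines, fields separated by a colon.
my @table = ("a:b:c:d", "1:2:3:4", "w:x:y:z");

my @third_column;
foreach my $row (@table)
{
    my @fields = split(/:/, $row);      # cut the line into fields
    push(@third_column, $fields[2]);    # fields are numbered from zero
}

print join("\n", @third_column), "\n";
```

On a real file, the shell command `cut -d: -f3' does roughly the same job.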

Program Structure

To summarize Perl, we need to know about the structure of a Perl program, the conditional constructs it has, its loops and its variables. In the latest versions of Perl (Perl 5), you can write object oriented programs of great complexity. We shall not go into this depth, for the simple reason that Perl's strength is not as a general programming language but as a specialized language for text file handling. The syntax of Perl is in many ways like the C programming language, but there are important differences:

• Variables do not have types. They are interpreted in a context sensitive way. The operators which act upon variables determine whether a variable is to be considered as a string, as an integer, etc.

• Although there are no types, Perl defines variables of different kinds. There are three kinds, labelled by the symbols `$' (scalars), `@' (arrays) and `%' (associative arrays).

• Perl keeps a number of standard variables with special names, e.g. `$_' and `@ARGV'. Special attention should be paid to these. They are very important!

• The shell reverse apostrophe notation `command` can be used to execute UNIX programs and get the result into a Perl variable.
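The first and last of these points can be sketched in a few lines. This is only an illustration: the variable names are invented, and `echo' is used as the UNIX command simply because it exists on every UNIX system.

```perl
use strict;
use warnings;

my $value = "10";              # a scalar holding the string "10"

my $sum    = $value + 5;       # + is a numeric operator, so $value is a number
my $joined = $value . 5;       # . is the string operator, so $value is a string

# Reverse apostrophes run a UNIX command and capture what it prints.
my $output = `echo hello`;

print "sum=$sum joined=$joined output=$output";
```

The same scalar is treated as the number 10 in the first expression and as the string `10' in the second; only the operator decides.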

Here is a simple `structured hello world' program in Perl. Notice that subroutines (Perl functions) are called using the `&' symbol. There is no special way of marking the main program; it is simply that part of the program which starts at line 1.

             #!/local/bin/perl
             # Comments
             &Hello();
             &World;
             # end of main
             sub Hello
              {
                  print "Hello";
              }
             sub World
              {
                  print "World\n";
              }

The parentheses on subroutine calls are optional if there are no parameters passed. Notice that each statement must end in a semicolon.
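When parameters are passed, the parentheses are needed. As a minimal sketch (the subroutine name `Greet' and its arguments are invented for illustration), the parameters arrive inside the subroutine in the special array `@_':

```perl
use strict;
use warnings;

# Parameters passed to a subroutine arrive in the special array @_.
sub Greet
{
    my ($greeting, $name) = @_;     # copy the parameters into named scalars
    return "$greeting $name\n";
}

my $message = &Greet("Hello", "World");
print $message;
```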

Scalar variables

In Perl, variables do not have to be declared before they are used. Whenever you use a new symbol, Perl automatically adds the symbol to its symbol table and initializes the variable to the empty string. It is important to understand that there is no practical difference between zero and the empty string in Perl, except in the way that you, the user, choose to use it. Perl makes no distinction between strings and integers or any other types of data, except when it wants to interpret them. For instance, to compare two variables as strings is not the same as comparing them as integers, even if the string contains a textual representation of an integer. Take a look at the following program.

               #!/local/bin/perl
               # Nothing!
               print "Nothing== $nothing\n";
               print "Nothing is zero!\n" if ($nothing == 0);
               if ($nothing eq "")
               {
                         print STDERR "Nothing is really nothing!\n";
               }

               $nothing = 0;
               print "Nothing is now $nothing\n";

The output from this program is

               Nothing==
               Nothing is zero!
               Nothing is really nothing!
               Nothing is now 0

There are several important things to note here. First of all, we never declare the variable `nothing'. When we try to write its value, Perl creates the name and associates a NULL value to it, i.e. the empty string. There is no error. Perl knows it is a variable because of the `$' symbol in front of it. All scalar variables are identified by using the dollar symbol. Next, we compare the value of `$nothing' to the integer `0' using the integer comparison symbol `==', and then we compare it to the empty string using the string comparison symbol `eq'. Both tests are true! That means that the empty string is interpreted as having a numerical value of zero. In fact any string which does not begin with a valid number has a numerical value of zero. Finally we set `$nothing' explicitly to a valid integer string zero, which would now pass the first test, but fail the second.
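The difference between the two kinds of comparison can be seen in a short sketch. The variable names are invented for illustration, and warnings are switched off because Perl 5 would otherwise complain about treating a word as a number.

```perl
use strict;
no warnings;    # Perl 5 would warn about using a non-numeric string as a number

my $ten           = "10";
my $ten_point_oh  = "10.0";

# Compared as numbers the two are equal; compared as strings they are not.
my $same_number  = ($ten == $ten_point_oh) ? "yes" : "no";
my $same_string  = ($ten eq $ten_point_oh) ? "yes" : "no";

# A string which does not begin with a number counts as zero.
my $word_is_zero = ("piggy" == 0) ? "yes" : "no";

print "same number: $same_number\n";
print "same string: $same_string\n";
print "word is zero: $word_is_zero\n";
```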

Perl array (vector) Variables

The complement of scalar variables is arrays. An array in Perl is identified by the `@' symbol and, like scalar variables, is allocated and initialized dynamically. An individual element is a scalar, so it is written with the `$' symbol.

          $array[0] = "This little piggy went to market";
          $array[2] = "This little piggy stayed at home";
          # $array[1] was never assigned, so it prints as the empty string
          print "$array[0] $array[1] $array[2]";

The index of an array is always understood to be a number, not a string, so if you use a non-numerical string to refer to an array element, you will always get the zeroth element, since a non-numerical string has an integer value of zero. An important array which every program defines is

         @ARGV

This is the argument vector array, and contains the command line arguments, by analogy with the C-shell variable `$argv[]'. Given an array, we can find the last element by using the `$#' operator, which gives the index of the last element. For example,

                                   

            $last_element=$ARGV[$#ARGV];

Notice that each element in an array is a scalar variable. The `$#' value cannot be interpreted directly as the number of elements in the array, as it can in the C-shell. It is the index of the last element and, since indices start at zero, the number of elements is `$#ARGV + 1'. You should experiment with the value of this quantity; it is often necessary to add 1 to its value in order to get the behavior one is used to in the C-shell. Perl does not support multiple-dimension arrays directly, but it is possible to simulate them yourself. (See the Perl book.)
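A short sketch makes the relationship between the last index and the number of elements concrete. The array `@piggies' and its contents are invented for illustration.

```perl
use strict;
use warnings;

my @piggies = ("market", "home", "roast beef");

my $last_index = $#piggies;             # index of the last element
my $count      = scalar(@piggies);      # number of elements, i.e. $#piggies + 1
my $last       = $piggies[$#piggies];   # the last element itself

print "last index $last_index, count $count, last element $last\n";
```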

Special Array Commands

The 'shift' command acts on arrays and returns and removes the first element of the array. Afterwards, all of the elements are shifted down one place. So one way to read the elements of an array in order is to repeatedly call 'shift'.            

            $next_element = shift(@myarray);

Note that, if the array argument is omitted, then `shift' works on `@ARGV' by default (outside of a subroutine).
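Reading a whole array in order with `shift' might look like the following sketch. The array `@queue' and its contents are invented for illustration.

```perl
use strict;
use warnings;

my @queue = ("first", "second", "third");
my @seen;

# An array used as a condition is true as long as it has elements left.
while (@queue)
{
    my $next_element = shift(@queue);   # remove and return the first element
    push(@seen, $next_element);
    print "$next_element\n";
}
```

Testing the array itself, rather than the value returned by `shift', means that the loop does not stop early if one of the elements happens to be `0' or the empty string.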

Another useful function is `split', which takes a string and turns it into an array of strings. `split' works by choosing a character (usually a space) to delimit the array elements, so a string containing a sentence separated by spaces would be turned into an array of words.

The syntax is

        @array = split;                                  # works with spaces on $_
        @array = split(pattern, string);                 # breaks on pattern
        ($v1, $v2, ...) = split(pattern, string);        # names array elements with scalars

In the first of these cases, it is assumed that the variable `$_' is to be split on whitespace characters. In the second case, we decide on what character the split is to take place and on what string the function is to act. In the third case, the elements are assigned directly to named scalar variables. For instance

     @new_array = split(":", "name:passwd:uid:gid:gcos:home:shell");

The result is a seven element array called `@new_array', where `$new_array[0]' is `name' etc.
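The form which names the elements with scalars is convenient for records with a fixed layout. Here is a minimal sketch; the entry is made up, in the format of a line from the password file.

```perl
use strict;
use warnings;

# A made-up entry in the format of the password file.
my $entry = "mark:x:1001:100:Mark Example:/home/mark:/bin/sh";

# Each field lands in its own named scalar.
my ($name, $passwd, $uid, $gid, $gcos, $home, $shell) = split(/:/, $entry);

print "$name has uid $uid and home directory $home\n";
```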

Associative Arrays

One of the very nice features of Perl is the ability to use one string as an index to another string in an array. For example, we can make a short encyclopedia of zoo animals by constructing an associative array in which the keys (or indices) of the array are the names of animals, and the contents of the array are the information about them.

     $animals{"Penguin"} = "A suspicious animal, good with cheese crackers...";
     $animals{"dog"} = "Plays stupid, but could be a cover...";
     if ($index eq "fish")
     {
       $animals{$index} = "Often comes in square boxes. Very cold.";
     }
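To get the information back out, we can ask for the list of keys and look each one up in turn. The following is only a sketch; it repeats two of the entries above so that it can be run on its own.

```perl
use strict;
use warnings;

my %animals;
$animals{"Penguin"} = "A suspicious animal, good with cheese crackers...";
$animals{"dog"}     = "Plays stupid, but could be a cover...";

# keys() returns the keys in no particular order, so we sort them first.
my @names = sort keys(%animals);

foreach my $name (@names)
{
    print "$name: $animals{$name}\n";
}
```

Note that the default sort compares character codes, so upper case names come before lower case ones.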

Array Example Program

Here is an example which prints out a list of files in a specified directory, in order of their UNIX protection bits. The least protected files come first.

         #!/local/bin/perl
         #
         # Demonstration of arrays and associative arrays.
         # Print out a list of files, sorted by protection,
         # so that the least secure files come first.
         #
         # e.g.       arrays <list of words>
         #            arrays *.0
         #
         print "You typed in ",$#ARGV+1," arguments to command\n";
         if ($#ARGV < 1)
         {
            print "That's not enough to do anything with!\n";
         }
         while ($next_arg = shift(@ARGV))
         {
            # Skip anything which is neither a plain file nor a directory
            if ( ! ( -f $next_arg || -d $next_arg))
            {
               print "No such file: $next_arg\n";
               next;
            }
            ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size) = stat($next_arg);
            # Keep only the permission bits, written in octal
            $octalmode = sprintf("%o",$mode & 0777);
            # Files with the same mode are collected under the same key
            $assoc_array{$octalmode} .= $next_arg.":size(".$size."),mode(".$octalmode.")\n";
         }
         print "In order: LEAST secure first!\n\n";
         foreach $i (reverse sort keys(%assoc_array))
         {
            print $assoc_array{$i};
         }

Loops and Conditionals

Here are some of the most commonly used decision-making constructions and loops in Perl. The following is not a comprehensive list; for that, you will have to look in the Perl bible: Programming Perl by Larry Wall and Randal Schwartz. The basic pattern follows the C programming language quite closely. In the case of the `for' loop, Perl has both the C-like version, called `for', and a `foreach' command which is like the C-shell implementation.

Perl for loop

The for loop is exactly like that in C or C++ and is used to iterate over a numerical index, like this:

           for ($i = 0; $i < 10; $i++)
           {
              print $i, "\n";
           }

Perl foreach loop

One of the main uses for `for' type loops is to iterate over successive values in an array. This can be done in two ways which show the essential difference between `for' and `foreach'. If we want to fetch each value in an array in turn, without caring about numerical indices, then it is simplest to use the `foreach' loop.

           @array = split (" ","a b c d e f g");
           foreach $var (@array )
           {
                 print $var, "\n";
           }
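If, on the other hand, we do care about the numerical indices, the `for' loop is the natural choice. A minimal sketch of this second way, using the same array (the variable `$listing' is invented for illustration):

```perl
use strict;
use warnings;

my @array = split(" ", "a b c d e f g");
my $listing = "";

# $#array is the index of the last element, so the loop covers every index.
for (my $i = 0; $i <= $#array; $i++)
{
    $listing .= "$i=$array[$i]\n";
}

print $listing;
```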

Iterating Over lines in a file

Since Perl is about file handling, we are very interested in reading files. Unlike C and C++, Perl likes to read files line by line. The angle brackets are used for this (see the section `Files in perl'). Assuming that we have some file handle `<file>', for instance `<STDIN>', we can always read the file line by line with a while-loop like this.

        while ($line = <file>)
        {
           chop $line;
           print "line=($line)\n";
        }

Note that `$line' includes the end of line character on the end of each line. If you want to remove it, you should add a `chop' command, as in the loop above.
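To try this loop without preparing a file, one possibility in Perl 5 is to open a file handle on a string held in memory. The following is only a sketch under that assumption; the text and the names `$text', `$file' and `@lines' are invented for illustration.

```perl
use strict;
use warnings;

# An in-memory "file" stands in for a real one, so the example runs anywhere.
my $text = "first line\nsecond line\nthird line\n";
open(my $file, '<', \$text) or die "Cannot open in-memory file: $!";

my @lines;
while (my $line = <$file>)
{
    chop $line;                 # remove the end of line character
    push(@lines, $line);
    print "line=($line)\n";
}
close($file);
```

Every line here ends in an end of line character, so `chop' removes exactly that. On a file whose last line has no end of line character, `chop' would remove the last real character instead.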