What Is a Module?

Modules are an important and powerful part of the Perl programming language. A module is a named container for a group of variables and subroutines which can be loaded into your program. By naming this collection of behaviors and storing it outside of the main program, you can use it from multiple programs and solve problems in manageable chunks.

Modular programs are more easily tested and maintained because you avoid repeating code, so you only have to change it in one place. Perl modules may also contain documentation, so they can be used by multiple programmers without each programmer needing to read all of the code. Modules are the foundation of the CPAN, which contains thousands of ready-to-use modules, many of which you will likely use on a regular basis.

Exploring Existing Modules

Before you begin writing your own modules, you need to know how to use modules and how perl finds modules to load. The Perl core distribution comes with many useful modules, so we'll take one called Digest::MD5 as an example and see how to use it.

Using your terminal, run the command `perldoc Digest::MD5`. You should see something like the picture. What you see there is the documentation for the Digest::MD5 module. Notice the example:

  use Digest::MD5 'md5_hex';
  ...
  $digest = md5_hex($data);

About the MD5 Digest

The MD5 algorithm is a cryptographic hash function, meaning that it reliably generates a fixed-length "fingerprint" given arbitrarily large data as input. Ideally, it should also be impractical to find some other input which generates the same fingerprint, which makes such a fingerprint very useful in login systems. MD5 itself no longer meets that standard for security-critical work (practical collision attacks exist, and real systems store passwords with a salted, deliberately slow hash), but it ships with Perl and keeps this example simple.

When you implement a login system (such as for a website or a multi-user computer), you want the user to know their password, but it's best if that password isn't left lying around somewhere on the computer where others might find it. A cryptographic hash like the MD5 digest is a way to verify that the user's password is correct without storing it. You store a digest of Jim's password. When Jim tries to log in, compare a digest of the password he types to the one you have stored. Because the digest is not meant to be reversible, anyone who gets access to your password file or database will not be able to simply read Jim's password and log in as Jim.

Aside: This algorithm is also a good example of why modules are important. When your code uses the Digest::MD5 module, you know that it will always create MD5 digests in the same way. Without modules, different programs might contain duplicate code or various re-implementations of the algorithm.

Loading Modules

You may now need to hit the 'q' key to exit the perldoc viewer. Note that the documentation contains a description of the module, examples of how to use it, and an overview of what it does.

Most module documentation will include an example section like this which shows a typical use statement. The use command tells perl to find and load the module named Digest::MD5 and then to invoke Digest::MD5->import('md5_hex'). That import() installs the function Digest::MD5::md5_hex into your current program. Open your text editor and create the following md5sum.pl file.
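As a rough illustration of what that use line does, the following sketch spells out the two steps by hand. It assumes only the core Digest::MD5 module; the BEGIN block is there because use does its work at compile time, before the rest of the program runs.

```perl
#!/usr/bin/env perl

use v5.10.0;
use warnings;
use strict;

# Roughly equivalent to: use Digest::MD5 'md5_hex';
BEGIN {
  require Digest::MD5;              # find and load Digest/MD5.pm
  Digest::MD5->import('md5_hex');   # install md5_hex() into this package
}

# The imported name and the fully-qualified name reach the same function,
# so both lines print the same digest.
say md5_hex("somethingsomething");
say Digest::MD5::md5_hex("somethingsomething");
```

In everyday code you would simply write the use line; the longer form is only meant to show what happens behind it.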

  #!/usr/bin/env perl

  use v5.10.0;
  use warnings;
  use strict;

  use Digest::MD5 'md5_hex';

  say md5_hex("somethingsomething");

Now try running it. On a unix-like system, you should also have a binary named 'md5sum' available. Notice how the outputs of our md5sum.pl and the system command are the same when md5sum is given the same string on its standard input without a trailing newline (except that the system md5sum also outputs a '-' representing the "filename" of the piped-in data).

Behind the Curtain

When you use a module, perl follows a built-in plan for how to find and load the correct package. The first thing it does is to search for a .pm ("Perl Module") file in the directories listed in the global @INC array. Let's print that array and see what it contains. (Yours will likely be different from mine; since Perl 5.26, for instance, the current directory '.' is no longer included by default.)

  $ perl -E 'say for @INC'
  /usr/local/lib/perl/5.10.0
  /usr/local/share/perl/5.10.0
  /usr/lib/perl5
  /usr/share/perl5
  /usr/lib/perl/5.10
  /usr/share/perl/5.10
  /usr/local/lib/site_perl
  .

The @INC array is set when perl is compiled, but can be altered by the program, with the -I command-line switch, or by the PERL5LIB environment variable.

  $ PERL5LIB=also perl -Ilib -E 'say for @INC'
  lib
  also
  /usr/local/lib/perl/5.10.0
  /usr/local/share/perl/5.10.0
  /usr/lib/perl5
  /usr/share/perl5
  /usr/lib/perl/5.10
  /usr/share/perl/5.10
  /usr/local/lib/site_perl
  .
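The remaining option, altering @INC from within the program, is usually done with the core lib pragma rather than by editing the array directly. A minimal sketch, assuming a directory named lib relative to wherever the program is run:

```perl
#!/usr/bin/env perl

use v5.10.0;
use warnings;
use strict;

# Prepend the relative directory 'lib' to @INC at compile time,
# so that later use statements search it first.
use lib 'lib';

# 'lib' now appears ahead of the compiled-in directories.
say for @INC;
```

Because use lib runs at compile time, any use statements that follow it in the file will already see the extra directory.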

When we say use Digest::MD5, perl starts at the first @INC entry and searches for a Digest/MD5.pm file. Note that module names are always written with '::' separators in code, but on your filesystem this becomes the directory separator (e.g. '/').

If there are more Digest/MD5.pm files later in @INC, perl will simply ignore them. This masking effect allows you to install a newer version of a module which will take precedence over the one shipped with Perl.
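The search can be approximated in a few lines of Perl. This is only a sketch of the idea, not what perl runs internally; module_to_path() and find_module() are names made up for this illustration.

```perl
#!/usr/bin/env perl

use v5.10.0;
use warnings;
use strict;

# Hypothetical helper: turn 'Digest::MD5' into 'Digest/MD5.pm'.
sub module_to_path {
  my ($module) = @_;
  (my $path = $module) =~ s{::}{/}g;
  return "$path.pm";
}

# Hypothetical helper: return the first matching file in @INC,
# ignoring any later copies (the masking effect described above).
sub find_module {
  my ($module) = @_;
  my $relative = module_to_path($module);
  for my $dir (@INC) {
    next if ref $dir;    # @INC may also hold code hooks; skip those
    my $file = "$dir/$relative";
    return $file if -f $file;
  }
  return;
}

say module_to_path('Digest::MD5');
say find_module('Digest::MD5') // 'not found';
```

The first line printed is the relative path Digest/MD5.pm; the second depends on where your perl keeps its modules.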

Once perl finds and loads your module, it stores the full filename for the loaded module in the global %INC hash. You can look in this hash to see which file was loaded (very handy when you want to read the module's source code).

  $ perl -MDigest::MD5 -E 'say $INC{"Digest/MD5.pm"}'
  /usr/lib/perl/5.10/Digest/MD5.pm

Creating a Login System

If we're building a login system, one thing we'll need is a password data file which contains a login and hashed password for each valid user, one 'login:digest' pair per line. Use your editor to create the file 'passwords' with a 'jim' user and our md5_hex output for the password 'somethingsomething'.
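If you would rather not copy the digest by hand, a short script can write the file for you. This is a sketch which assumes the 'login:digest' layout that the login code reads back with split(/:/, $_, 2); the %plain hash is a name chosen for this example, and the script overwrites any existing 'passwords' file in the current directory.

```perl
#!/usr/bin/env perl

use v5.10.0;
use warnings;
use strict;

use Digest::MD5 'md5_hex';

# Plain-text passwords appear only here, never in the output file.
my %plain = (jim => 'somethingsomething');

open(my $fh, '>', 'passwords') or die "cannot write passwords file: $!";

# One "login:digest" pair per line.
say {$fh} join(':', $_, md5_hex($plain{$_})) for sort keys %plain;

close($fh) or die "cannot close passwords file: $!";
```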

Now we need some code to challenge a user to log in and check whether they 1) are a valid user and 2) know the correct password. Save the following as login.pl.

  #!/usr/bin/env perl

  use v5.10.0;
  use warnings;
  use strict;

  use Digest::MD5 'md5_hex';

  open(my $fh, '<', 'passwords') or die "cannot open passwords file $!";
  my %passwords = map({chomp; split(/:/, $_, 2)} <$fh>);

  my $user;
  while (1) {
    print "Username: ";
    chomp($user = <>);
    print "Password: ";
    chomp(my $pass = <>);

    # they must be a valid user and
    # their digested password must match the stored digest
    last if(
      $passwords{$user} and
      md5_hex($pass) eq $passwords{$user}
    );

    # otherwise, we're stuck in the loop
    say "Sorry!";
  }

  say "Congratulations $user!";

If the open, map, chomp, split, and <> constructs are new to you, review the Text Processing Tutorial. The important thing to note here is that we have created a loop which will continue prompting until the user enters a correct login.
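To see what the map line is doing on its own, here is a small sketch that runs it over an in-memory list instead of a file. The logins and "digests" in @lines are placeholders made up for this example, not real MD5 output.

```perl
#!/usr/bin/env perl

use v5.10.0;
use warnings;
use strict;

# Placeholder lines, standing in for what <$fh> would return.
my @lines = ("jim:0123abcd\n", "ann:dead:beef\n");

# chomp strips the newline; split with a limit of 2 cuts only at the
# first ':' so each line yields exactly one key and one value.
my %passwords = map({chomp; split(/:/, $_, 2)} @lines);

# Prints:
#   ann => dead:beef
#   jim => 0123abcd
say "$_ => $passwords{$_}" for sort keys %passwords;
```

The limit of 2 matters only if a stored value could itself contain a ':'; without it, the second line would produce three list items and scramble the hash.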

Now, we have created some behavior which will be useful to reuse: loading the password file and verifying a username/password. We want to package this whole idea into a module so it can be used whenever we need to check a user's credentials.

First, your module needs somewhere to live. Create a directory named lib/TestSite.

  $ mkdir -p lib/TestSite

Then use your editor to create the file lib/TestSite/Login.pm.

  package TestSite::Login;
  our $VERSION = v0.0.1;

  use v5.10.0;
  use warnings;
  use strict;
  use Carp;

  use Digest::MD5 'md5_hex';

  open(my $fh, '<', 'passwords') or croak("cannot open passwords file: $!");
  my %passwords = map({chomp; split(/:/, $_, 2)} <$fh>);

  sub check_password {
    my ($user, $pass) = @_;

    return(
      $passwords{$user} and
      md5_hex($pass) eq $passwords{$user}
    );
  }

  1;

The module needs a few more lines of declaration than a script does. First, we have the package statement, which declares the namespace of this module. The package name should match the file path (with the '/' exchanged for '::' and the '.pm' extension dropped). Second, we have a $VERSION number, which every package should have (this helps when it comes time to deploy your code into the world). After the version, we have the same set of use lines as our scripts, but we also include the Carp module, which helps with debugging. Jumping to the very end, every module must end in a true value (this satisfies a sanity check when it is loaded).

Aside from the declarations, we've also migrated some code from the main program and changed the 'die' statement to croak() (this comes from the Carp module, and reports the error from the perspective of the code which called into the module). The only other change is to put our password-checking code into the check_password() subroutine.
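The difference between die and croak() is easiest to see side by side. In this sketch, Demo::Checker is a hypothetical package standing in for a module such as TestSite::Login; compare the line numbers in the two messages when you run it.

```perl
#!/usr/bin/env perl

use v5.10.0;
use warnings;
use strict;

# Hypothetical package, standing in for a real module.
package Demo::Checker;
use Carp;

sub fail_with_die   { die   "bad input" }   # reports this line
sub fail_with_croak { croak "bad input" }   # reports the caller's line

package main;

# Catch each error so we can print both messages.
eval { Demo::Checker::fail_with_die() };
my $die_message = $@;

eval { Demo::Checker::fail_with_croak() };
my $croak_message = $@;

print "die:   $die_message";
print "croak: $croak_message";
```

The die message points inside Demo::Checker, while the croak message points at the line in the main program which made the call, which is usually where the person debugging needs to look.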

Now we need to make some changes to our login.pl code to use this new module. We'll make a modification and name it login2.pl.

  #!/usr/bin/env perl

  use v5.10.0;
  use warnings;
  use strict;

  use TestSite::Login;

  my $user;
  while (1) {
    print "Username: ";
    chomp($user = <>);
    print "Password: ";
    chomp(my $pass = <>);

    last if(TestSite::Login::check_password($user, $pass));

    # otherwise, we're stuck in the loop
    say "Sorry!";
  }

  say "Congratulations $user!";

Notice that we've removed the use Digest::MD5 ... line, because our new TestSite::Login module takes care of all of that.

When we run the new login2.pl, we need to tell perl to also look for modules in our lib directory so it will find lib/TestSite/Login.pm. You can install your module in one of the global @INC directories, but it is more convenient to use the -I switch when developing modules (for example, perl -Ilib login2.pl).
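As an alternative to passing -I on every run, a script can locate its own lib directory. The following is a sketch using the core FindBin and lib modules, assuming lib sits in the same directory as the script; the use TestSite::Login line is commented out so the example runs even without the module in place.

```perl
#!/usr/bin/env perl

use v5.10.0;
use warnings;
use strict;

# $Bin is the directory containing this script.
use FindBin qw($Bin);

# Search "<script directory>/lib" ahead of the standard @INC entries.
use lib "$Bin/lib";

# With lib/TestSite/Login.pm in place, this would now load without -I:
# use TestSite::Login;

say $INC[0];
```

Unlike a plain relative 'lib', this keeps working when the script is started from some other directory.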

Now you have the same behavior as the original login.pl program, but the login2.pl program is shorter and focuses on the prompting, while the TestSite::Login module handles the details of how to validate a username and password. Even though we only use the TestSite::Login module in one program, it clarifies the code by separating the ideas and makes it easier to maintain. As you take on bigger programming challenges, modularizing behaviors helps to organize and simplify your code, while also allowing you to reuse it as needed.