PRAISE THE AUTOLOAD
This article is about the AUTOLOAD subroutine of Perl, which is rarely used in applications. You may wonder why I am only praising a seldom-used AUTOLOAD subroutine of Perl instead of Perl itself. First of all, I am not an authority to praise Perl or any other language for that matter. Am I an authority to praise the AUTOLOAD subroutine? Of course not. I am,however, praising AUTOLOAD, because of the things we’ve accomplished by using it in our projects.
For readers who are not familiar with AUTOLOAD, I am going to explain what it is and what it does.
PROBLEM
Our clients access the database through the Internet. This requires a lot of simple SQL statements; occasionally complex ones to be executed in Perl script. Since the majority of these SQL statements are simple SELECT statements, which are used to get certain fields from database, it doesn’t require much skill to write them. The complexity is to have many SQL statements in a program, which are written by different programmers with different programming styles at different times. As a result, the maintenance of some Perl scripts becomes unmanageable. If we could encapsulate this tedious work, the programs would become simpler. There were attempts to put frequently used SQL statements into functions, and the functions into Perl scripts to be used in other scripts. Even though this was a movement in the right direction, it created function proliferation. Slightly different SQL statements required having new functions.
We develop a lot of web applications. Our primary concern is not the efficiency of the programs we develop, but maintainability and fast development time. However, achieving these is a difficult task in practice for two reasons: first there is more than one way to program something in Perl; second, each programmer has his/her own style. We wanted to take the tedious work from the programs and encapsulate it into modules.
The object-oriented features of Perl were used to resolve the issue. To achieve this, we developed a module called Table, which encapsulates all the tags necessary to create HTML tables. Then we wrote a wrapper around the CGI.pm module and converted all Form Elements to objects with added functionality. Third we created Perl modules representing our database tables and used them to access the database. In this article I will talk about these database modules.
I will describe the concept by using a very simple database with two tables. An owner table and a car table. The owner table contains information about the car owners, and the car table contains information about cars. The relationships between these tables are assumed to be one to many. This means that one owner can have multiple cars however, a car cannot belong to multiple owners. Listing-1 shows these two tables.
To make programming tasks easier we made some assumptions. All primary keys are named as table name appended with "_id". For example the owner table’s primary key is owner_id and the car table’s primary key is car_id. All the foreign keys are named with the same convention if it is possible; If not possible, it will require extra coding in the modules that are not shown in this article. To apply this concept to an existing database that does not follow this convention, we created a hash with a table name as a key and primary key name as a corresponding value. We generally make slight modifications in the new method we are going to describe to accomplish this.
What we are going to describe here is actually applicable to large databases as well. The databases we are using contain many tables and we applied these concepts.
THE FIRST ITERATION
To demonstrate the concept for our simple database we are going to develop three modules called Business.pm, Owner.pm and Car.pm. Codes for these modules are given in Listing-2. Here Business.pm is the main module which has a constructor called new and this method gets information from the database and stores the result in the $self->{key_value} hash reference. ("keys" are field names from the database tables and "values" contains corresponding data from the database.) The other two modules inherit everything from the Business.pm module as shown in the @ISA array defined in Owner.pm and Car.pm. For the readers who are not familiar with inheritance in Perl, here is very simple explanation. Just create an @ISA array and put module names that you want to inherit from. If you put more than one name into an @ISA array, what you are doing is multiple inheritance.
Now it is easy to get information about an owner or a car by using these modules. The following code exemplifies a simple program accessing Owner information
use Business; use Owner; # these values might come from a cookie, from another object # or param method of CGI module my $owner_id = 1; my $owner = Owner->new($owner_id); print $owner->first_name; print $owner->last_name;
This is simple example that demonstrates all the major steps for getting data from database by using business objects. As you can see we did not write any SQL for this simple database access. We construct a business object by giving a primary key value as an argument to a new constructor. The constructor then generates an SQL statement that will get all the fields related to that primary key. The following is the SQL generated in the "new" method given in Listing-2.
my $sql = "SELECT * FROM $name WHERE $table_id = ‘$id’";
Where * shows that we want to get all the fields from a table stored in $name variable with the primary key given by $id. After this, $sql is used to construct a Query object called $qobj, which accesses the database and executes the given SQL statement. Notice that a key_value hash reference of $qobj is assigned to the key_value hash reference of business object. This hash reference contains all the field names and their values of the row given by the primary key. Note that no listing is given for Query.pm module. The reader can use the DBI module and do the following assignment
$dbh->prepare($sql);
$sth = $dbh->execute;
$self->{key_value} = $sth->fetchrow_hashref;
to get the same result. Once the object is constructed, the values of the given fields can be taken from this key_value hash reference by using methods given in Owner.pm and Car.pm modules shown in Listing-2. This is how
$owner->first_name and $owner->last_name
returns the data. Notice also that we used the line
use lib "./PM";
at the beginning of program. This is because we put our business modules in a directory called PM, Perl Module, under the directory where our test program is located. You should give the correct path for your modules in order for your program to access the modules you define.
There are many advantages of accessing a database by using business modules. First, the code is simple; there is no SQL statement. Second, the syntax is simple, you know which data element of which table you are accessing. Code becomes almost standard, instead of accessing the database using the following statements
my $sql = "SELECT first_name, last_name FROM owner WHERE owner_id = $owner_id"; my $sth = $dbh->prepare($sql); $sth->execute; my @row = $sth->fetchrow; print $row[0]; print $row[1];
and accessing data elements as $row[0] and $row[1], you always access it as
$owner->first_name; $owner->last_name;
provided that you called your object name with the corresponding table name. There is less possibility that you will make errors. The objects have warning mechanisms, if you try to access a wrong field name, it will warn you that the field you are trying to access does not exist in the database. This warning mechanism is implemented in AUTOLOAD which will be described later.
At the beginning we said that the codes we generated must be easy to maintain. Does what we do to access a database by using business objects satisfy this criteria? Of course not. Even though, for the user, this is simple way of accessing the data, it is nightmare for the module developer. First of all, he or she should enter all the field names in the tables as methods in modules, as shown below for owner first_name field.
sub first_name { my $self = shift; return $self->{key_value}->{FIRST_NAME}};
For large databases we have to create hundreds of method to access the data stored in the objects. We could write simple code generators, which will do this automatically, however that will have its own set of problems. Secondly, any change in tables requires changes in modules. Adding a new field in a table requires a method addition to the modules. This is a maintenance problem and not acceptable in practice.
THE SECOND ITERATION
Here comes AUTOLOAD to the rescue. For readers who don’t know what AUTOLOAD is and what it does, here is a brief explanation from the "Perl Cookbook".
If you call an undefined method, instead of automatically raising an exception, you can trap the call. If the method’s package has a subroutine named AUTOLOAD, then this method is called in its place, with the special package global $AUTOLOAD set to the fully qualified method name. The AUTOLOAD subroutine can then do whatever that method would do.
In plain English, if you call first_name method of the Owner module, and that method does not exist, the AUTOLOAD subroutine of that module, Owner.pm, will be called. If that module does not have an AUTOLOAD method, then its superclass, Business.pm, ATUOLOAD will be called. How useful can this be? Very. Imagine all the methods which accesses data elements in Owner.pm and Car.pm being removed. In this case our program given above will not work. Because these methods do not exist in any of our modules, including Business.pm. Since AUTOLOAD might trap these method calls, we can put AUTOLOAD methods into the Owner.pm and Car.pm modules. This is not acceptable because large databases contain many tables and there would be a lot of code repetition. The best place would be to put AUTOLOAD into superclass Business.pm. Having said that, we could rewrite our Business.pm, Owner.pm, and Car.pm modules one more time. This time we are going to put the AUTOLOAD method into our Business.pm module and delete field methods from the Owner.pm and Car.pm modules. Codes for these modules are given in Listing-3. As you can see there are no methods in Owner.pm and there is only one method in Car.pm (called Owner) and we will talk about that later. Now AUTOLOAD, defined in Business.pm module, handles all those methods. As far as users of our module are concerned nothing has changed. They can still use their programs. But life becomes much simpler for the person who maintains the modules. If a new field has been added to a table, there is no code change in the modules. When a field is removed from a table, all the programs that use this field will issue a warning. When a new table is added to the database, we will create a new module with that table name and inherit it from Business.pm, which is only three lines of code.
Is there a problem with these modules? In fact there is, namely efficiency, since Perl does extra work for a method name that does not exist. First it looks in the AUTOLOAD method of that module, and if it is not there, than it looks to the super-class of that module and so on. By comparison, the first iteration is more efficient than the second, however, it has maintenance problems. What can we do so that we will have an efficient solution and a maintainable code at the same time?
THE THIRD ITERATION
Luckily, Perl provides direct run-time access to each module’s symbol table. This enables us to extend the AUTOLOAD method given in Business.pm module. If we add the following line to our AUTOLOAD method, as shown in Listing-4
*{$AUTOLOAD} = sub {my $self = shift; return $self->{key_value}->{$key}};
the efficiency will be increased substantially. This line registers the given methods to the module's symbol table. Because of this, the AUTOLOAD method will be used the first time to access to the given method. The second time the subroutine defined above will be called directly, without additional lookup. As Damian Conway, the author of the book "Object Oriented PERL", stated, this line teaches its module a new method whenever a new unknown method is called. I strongly suggest anyone who wants to use Perl in an object-oriented way, to buy and read this book. Page 93 discusses "Reducing the cost of autoloading". With the addition of a simple line, the code becomes as efficient as it defined in the first iteration, which uses hard coded methods to access data elements.
EXPANDING BUSINESS MODULES
Remember, we had said that we would talk about the Owner method described in Car.pm.
Here is a brief explanation for when and where we need methods like this.
If there is one to many relationships between tables, as it is the case between the Owner and the Car tables, you should put a foreign key into many tables, which point to one’s table.
In our case the Car table contains a pointer into the Owner table (owner_id).
Therefore, we have to be able to create an owner object for a given car object.
What the Owner method given in Car.pm does is to create an owner object.
These methods are named with the Module name of the type of the object they are returning.
When you see the Owner method you realize that this is a method which returns an owner object.
For given car objects using the Owner method we can find the owner of the cars easily.
Here is a simple program giving information about a car and its owner.
use lib "./PM"; use Business; use Car; use Owner; my $car = Car->new(2); my $owner = $car->Owner; print $car->make,"\n"; print $car->model,"\n"; print $owner->first_name," ",$owner->last_name,"\n";
As can be seen from this program, we have not written any SQL statements, but we got all the information about the owner of a car. The Owner method uses the getObject method of the Business.pm module, which in turn creates a new object by using the known owner_id in the $car object. Here is a word of caution; whenever a new object is created, the information is extracted from the database. You might be tempted to get the owner’s name as
print $car->Owner->first_name," ",$car->Owner->last_name,"\n";
You can do this and get the same result, but it is not advisable. The reasoning is, as follows: in this case you are running the SQL statement twice for each Owner method, while in the original case you are doing it once. As is done in the sample program, you should create the owner object once and use it whenever there is a need for it.
Methods similar to the Owner method described here are used to find the "one" in one to many relations. There are many car objects for a given owner object. If we have a car object, we can find its owner object by using the Owner method. How can we find the car objects for a given owner object? This can be done by using the objects method, given in Listing-5, of the Business.pm module. This method returns a single object or an array of objects depending on the context where it is used.
Assuming that we have the $owner object, we can find all the cars belong to this owner as,
my @cars = Car->objects(owner_id => $owner->ID);
Where the @cars array contains all the car objects. In our design, the ID method always returns the primary key of a given object. The "objects" method is much more useful than this. For example the following code will return all the cars objects from the database.
my @cars = Car->objects;
While
my @cars = Car->objects(make => ‘Pontiac’, year => ‘2000’);
returns the entire array of car objects made by Pontiac and manufactured in the year 2000.
Here is a simple program that prints the names of owners of all Pontiacs made in 2000.
use lib "./PM";
use Business;
use Car;
use Owner;
my @cars = Car->objects(make => ‘Pontiac’, year => ‘2000’);
foreach my $car = ( @cars ) {
my $owner = $car->Owner;
print $owner->first_name, " ", $owner->last_name,"\n";
}
If your database contains as many table as ours, it becomes tedious to write use phrases for all the Modules. What we did was to put all these use statements into an Include.pm module as shown in Listing-6. Now the program given above can be written as shown below.
use lib "./PM";
use Include;
my @cars = Car->objects(make => ‘Pontiac’, year => ‘2000’);
foreach my $car = ( @cars ) {
my $owner = $car->Owner;
print $owner->first_name, " ", $owner->last_name,"\n";
}
CONCLUSION
Our applications are real world applications and by combining business modules with the Table and FormElement objects we eliminated most of the SQL statements and almost all HTML tags from our scripts. The business modules we are using are a bit more complicated than what we represented here. However, what is represented here will be useful in many cases. Our programs are now simpler and easier to maintain. Is it efficient? It’s true, we lost a fraction of a second here and there but it is worth it for speed of development and maintenance.
We used AUTOLOAD for creating FormElement objects. How and why we did it will be the topic of the next article.
I don’t know whether other languages have mechanisms similar to AUTOLOAD. We have benefited from it and I hope you will find some use for this wonderful subroutine of Perl.
Listing - 1 OWNER table owner_id NUMBER(10) NOT NULL, first_name VARCHAR2(20), last_name VARCHAR2(20), phone VARCHAR2(12) CAR table car_id NUMBER(10) NOT NULL, owner_id NUMBER(10) NOT NULL, make VARCHAR2(20), model VARCHAR2(20), year VARCHAR2(12), purchase_date DATE
Listing - 2
package Business;
use strict;
sub new {
my $package = shift;
my $self = {};
bless $self,ref $package || $package;
my $id = shift;
my $name = ref($self);
my $table_id = "\U$name"."_ID";
# Query object is used to access database, it uses DBI module
my $sql = "SELECT * FROM $name WHERE $table_id = '$id'";
my $qobj = Query->new_hash($sql);
# key_value is a hash reference containing key value pairs,
# keys are table field names
# values comes from database
$self->{key_value} = $qobj->{key_value};
# Query object is not defined here,
# instead you can use DBI fetchrow_hashref method
# you should provide $dbh and prepare the statement
# $dbh-prepare($sql);
# my $sth = $dbh-execute;
# $self->{key_value} = $sth->fetchrow_hashref;
$self->{ID} = $id;
return $self;
}
sub getObject
{
my $self = shift;
my $name = shift;
my $table_id = $name."_id";
my $id = $self->$table_id();
$id = 0 if not $id;
return $name->new($id);
}
1;
-----------------------
package Owner;
@ISA = ("Business");
sub owner_id { my $self = shift; return $self->{key_value}->{OWNER_ID}};
sub first_name { my $self = shift; return $self->{key_value}->{FIRST_NAME}};
sub last_name { my $self = shift; return $self->{key_value}->{LAST_NAME}};
sub phone { my $self = shift; return $self->{key_value}->{PHONE}};
1;
-----------------------
package Car;
@ISA = ("Business");
sub car_id { my $self = shift; return $self->{key_value}->{CAR_ID}};
sub owner_id { my $self = shift; return $self->{key_value}->{OWNER_ID}};
sub make { my $self = shift; return $self->{key_value}->{MAKE}};
sub model { my $self = shift; return $self->{key_value}->{MODEL}};
sub year { my $self = shift; return $self->{key_value}->{YEAR}};
sub prc_date { my $self = shift; return $self->{key_value}->{PURCHASE_DATE}};
sub Owner { my $self = shift; $self->getObject("Owner"); }
1;
Listing - 3
package Business;
use strict;
sub new {
my $package = shift;
my $self = {};
bless $self,ref $package || $package;
my $id = shift;
my $name = ref($self);
my $table_id = "\U$name"."_ID";
# Query object is used to access database, it uses DBI module
my $qobj = Query->new_hash("select * from $name where $table_id = '$id'");
# key_value is a hash containing key value pairs, keys are table field names
# values comes from databae
$self->{key_value} = $qobj->{key_value};
$self->{ID} = $id;
return $self;
}
sub AUTOLOAD {
my $self = shift;
my $attr = $Business::AUTOLOAD;
$attr =~ s/.*:://;
my $uppercase = "\U$attr";
return if $attr eq 'DESTROY';
if(exists($self->{key_value}->{$uppercase})) {
return $self->{key_value}->{$uppercase};
} else {
if($self->{ID} == 0) {
return $self->{na_string};
} else {
# warn the user for methods which do not exists
my ($package, $filename, $line) = caller(0);
print ref($self);
print ": Field $uppercase is not known in $filename at line $line \n";
}
}
}
sub getObject
{
my $self = shift;
my $name = shift;
my $table_id = $name."_id";
my $id = $self->$table_id();
$id = 0 if not $id;
return $name->new($id);
}
1;
package Owner;
@ISA = ("Business");
1;
package Car;
@ISA = ("Business");
sub Owner { my $self = shift; $self->getObject("Owner"); }
1;
Listing - 4
sub AUTOLOAD {
my $self = shift;
my $attr = $Business::AUTOLOAD;
$attr =~ s/.*:://;
my $uppercase = "\U$attr";
return if $attr eq 'DESTROY';
if(exists($self->{key_value}->{$uppercase})) {
# here method is registred to the symbol table
# to access same method faster second time
# Look at Object Oriented Perl by Damian Conway pp 95
*{AUTOLOAD} = sub { my $self = shift; return $self->{key_value}->{$uppercase} };
return $self->{key_value}->{$uppercase};
} else {
if($self->{ID} == 0) {
return $self->{na_string};
} else {
# warn the user for methods which do not exists
my ($package, $filename, $line) = caller(0);
print ref($self);
print ": Field $uppercase is not known in $filename at line $line \n";
}
}
}
Listing - 5
# this method is a special constractor which returns array of objects
sub objects {
my $P = shift;
my %hsh = ( @_ );
my $name = $P;
my $table_id = "\l$name"."_id";
my $sql = "
SELECT *
FROM $name
";
# if user suplies and sql use it
if( @_ ) {
if($hsh{sql}) {
$sql = $hsh{sql};
} else {
if($hsh{where}) {
$sql .= " WHERE ".$hsh{where};
} else {
$sql .= " WHERE ";
my @temp;
my @ttt = keys %hsh;
foreach $key ( @ttt ) {
if($hsh{order_by}) { next; }
push(@temp," $key = '$hsh{$key}' ");
}
$sql .= join("AND ", @temp);
}
if($hsh{order_by}) {
$sql .= "
ORDER BY $hsh{order_by}
";
}
}
}
my $dbh = Query->Dbh;
my $sth = $dbh->prepare($sql);
$sth->execute;
my @objs;
my $hash_ref;
my $i;
while($hash_ref = $sth->fetchrow_hashref) {
$objs[$i] = $name->init($hash_ref);
$i++;
}
if(wantarray) {
return @objs;
} else {
return $objs[0];
}
Listing - 6 Include.pm module use Business; use Owner; use Car;