

Writing Apache Modules with Perl and C
By:   Lincoln Stein and Doug MacEachern
Published:   O'Reilly & Associates, Inc.  - March 1999

Copyright © 1999 by O'Reilly & Associates, Inc.


 



Chapter 7 - Other Request Phases / The URI Translation Phase
Using a Translation Handler to Change the URI

Instead of completely translating a URI into a filename, a translation handler can modify the URI itself and let other handlers do the work of completing the translation into a physical path. This is very useful because it allows the handler to interoperate with other URI translation directives such as Alias and UserDir.

To change the URI, your translation handler should set it with the Apache request object's uri() method instead of (or in addition to) the filename() method:

$r->uri($new_uri);

After changing the URI, your handler should then return DECLINED, not OK. This may seem counter-intuitive. However, by returning DECLINED, your translation handler is telling Apache that it has declined to do the actual work of matching the URI to a filename and is asking Apache to pass the modified request on to other registered translation handlers.
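The effect of returning DECLINED can be seen in a toy model that runs outside Apache. The sketch below is only an illustration: the hash standing in for the request record, the two handler subroutines, and run_translation() are all hypothetical names, and the loop is a simplification of how Apache walks its translation handlers, not its actual implementation.

```perl
use strict;
use warnings;

# Numeric values match Apache's OK and DECLINED status codes.
use constant { OK => 0, DECLINED => -1 };

# Toy stand-in for a translation handler that only edits the URI.
sub rewrite_uri_handler {
    my $r = shift;
    $r->{uri} =~ s!^(.+)\.cksm$!/checksums$1!;
    return DECLINED;    # let later handlers finish the translation
}

# Toy stand-in for an Alias-style handler that maps a URI to a filename.
sub alias_handler {
    my $r = shift;
    return DECLINED unless $r->{uri} =~ m!^/checksums/(.*)$!;
    $r->{filename} = "/home/www/checksums/$1";
    return OK;          # translation is complete
}

# Run handlers in order until one returns something other than DECLINED.
sub run_translation {
    my ($r, @handlers) = @_;
    for my $handler (@handlers) {
        my $status = $handler->($r);
        return $status unless $status == DECLINED;
    }
    return DECLINED;
}

my %request = (uri => '/archive/index.html.cksm', filename => undef);
my $status  = run_translation(\%request, \&rewrite_uri_handler, \&alias_handler);
print "status=$status uri=$request{uri} filename=$request{filename}\n";
```

Had rewrite_uri_handler() returned OK in this model, the loop would have stopped before alias_handler() ran, and the filename would never have been filled in.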

Example 7-3 shows a reworked version of the checksum translation handler that alters the URI rather than setting the filename directly. The code is nearly identical to the first version of this module, but instead of retrieving a physical directory path from a PerlSetVar configuration variable named ChecksumDir, the handler looks for a variable named ChecksumPath, which is expected to contain the virtual (URI space) directory in which the checksums can be found. If the variable isn't defined, then /checksums is assumed. We perform the string substitution on the requested URI as before. If the substitution succeeds, we write the modified URI back into the request record by calling the request object's uri() method. We then return DECLINED so that Apache will pass the altered request on to other translation handlers.

Example 7-3. A Translation Handler That Changes the URI

package Apache::Checksum2;
# file: Apache/Checksum2.pm
use strict;
use Apache::Constants qw(:common);
use constant DEFAULT_CHECKSUM_PATH => '/checksums';
sub handler {
  my $r = shift;
  my $uri = $r->uri;
  my $cksumpath = $r->dir_config('ChecksumPath') || DEFAULT_CHECKSUM_PATH;
  return DECLINED unless $uri =~ s!^(.+)\.cksm$!$cksumpath$1!;
  $r->uri($uri);
  return DECLINED;
}
1;
__END__
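The substitution at the heart of the handler can be tried on plain strings without a running server. This is a minimal sketch, assuming a hypothetical helper named rewrite_uri() that is not part of Apache::Checksum2; it applies the same pattern and the same /checksums default to a few sample URIs.

```perl
use strict;
use warnings;

# Hypothetical helper: the same substitution the handler performs,
# applied to an ordinary string instead of a request object.
sub rewrite_uri {
    my ($uri, $cksumpath) = @_;
    $cksumpath = '/checksums' unless defined $cksumpath && length $cksumpath;
    return unless $uri =~ s!^(.+)\.cksm$!$cksumpath$1!;
    return $uri;
}

for my $uri ('/archive/software/index.html.cksm',
             '/archive/software/.cksm',
             '/index.html') {
    my $new = rewrite_uri($uri);
    printf "%-36s => %s\n", $uri,
        defined $new ? $new : '(no match, handler would decline)';
}
```

The second sample shows that a URI ending in /.cksm is rewritten to a URI ending in a slash, which is what lets the usual directory index lookup take over.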

The configuration file entries needed to work with Apache::Checksum2 are shown below. Instead of passing the translation handler a physical path in the ChecksumDir variable, we use ChecksumPath to pass a virtual URI path. The actual translation from a URI to a physical path is done by the standard mod_alias module from information provided by an Alias directive. Another point to notice is that because the translation handler changed the URI, we can now use a <Location> section to force the type of the checksum files to text/plain.

PerlTransHandler  Apache::Checksum2
PerlSetVar        ChecksumPath /checksums
Alias             /checksums/ /home/www/checksums/
<Location /checksums>
 ForceType text/plain
</Location>

In addition to interoperating well with other translation directives, this version of the checksum translation handler deals correctly with the implicit retrieval of index.html files when the URI ends in a directory name. For example, retrieving the partial URI /archive/software/.cksm will be correctly transformed into a request for /home/www/checksums/archive/software/index.html.

On the downside, this version of the translation module may issue potentially confusing error messages if a checksum file is missing. For example, if the user requests URI /archive/software/index.html.cksm and the checksum file is not present, Apache's default "Not Found" error message will read, "The requested URL /checksums/archive/software/index.html was not found on this server." The user may be confused to see an error message that refers to a URI other than the one they requested.

Another example of altering the URI on the fly can be found in Chapter 5, where we used a translation handler to manage session IDs embedded in URIs. This handler copies the session ID from the URI into an environment variable for later use by the content handler, then strips the session ID from the URI and writes it back into the request record.

Installing a Custom Response Handler in the URI Translation Phase


In addition to its official use as the place to modify the URI and filename of the requested document, the translation phase is also a convenient place to set up custom content handlers for particular URIs. To continue with our checksum example, let's generate the checksum from the requested file on the fly rather than using a precomputed value. This eliminates the need to maintain a parallel directory of checksum files but adds the cost of additional CPU cycles every time a checksum is requested.

Example 7-4 shows Apache::Checksum3. It's a little longer than the previous examples, so we'll step through it a chunk at a time.

package Apache::Checksum3;
# file: Apache/Checksum3.pm
use strict;
use Apache::Constants qw(:common);
use Apache::File ();
use MD5 ();
my $MD5 = MD5->new;

Because this module is going to produce the MD5 checksum itself, we bring in the Apache::File and MD5 modules. We then create a file-scoped lexical MD5 object that will be used within the package to generate the MD5 checksums of requested files.

sub handler {
   my $r = shift;
   my $uri = $r->uri;
   return DECLINED unless $uri =~ s/\.cksm$//;
   $r->uri($uri);

We define two subroutines. The first, named handler(), is responsible for the translation phase of the request. Like its predecessors, this subroutine recovers the URI from the request object and looks for the telltale .cksm extension. However, instead of constructing a new path that points into the checksums directory, we simply strip off the extension and write the modified path back into the request record.

   $r->handler("perl-script");
   $r->push_handlers(PerlHandler => \&checksum_handler);
   return DECLINED;
}

Now the interesting part begins. We set the request's content handler to point to the second subroutine in the module, checksum_handler(). This is done in two steps. First we call $r->handler("perl-script") to tell Apache to invoke the Perl interpreter for the content phase of the request. Next we call push_handlers() to tell mod_perl to call our checksum_handler() subroutine when the time comes. Together, these calls have the same effect as the configuration directives SetHandler and PerlHandler. Our work done, we return a result code of DECLINED in order to let the other translation handlers do their job.

Apache will now proceed as usual through the authentication, authorization, MIME type checking, and fixup phases until it gets to the content phase, at which point Apache::Checksum3 will be reentered through the checksum_handler() routine:

sub checksum_handler {
   my $r = shift;
   my $file = $r->filename;
   my $sfile = Apache::File->new($file) || return DECLINED;
   $r->content_type('text/plain');
   $r->send_http_header;
   return OK if $r->header_only;
   $MD5->reset;
   $MD5->addfile($sfile);
   $r->print($MD5->hexdigest(),"\t",$r->uri,"\n");
   return OK;
}

Like the various content handlers we saw in Chapter 4, checksum_handler() calls the request object's filename() method to retrieve the physical filename and attempts to open it, returning DECLINED in case of an error. The subroutine sets the content type to text/plain and sends the HTTP header. If this is a HEAD request, we return. Otherwise, we invoke the MD5 module's reset() method to clear the checksum algorithm, call addfile() to process the contents of the file, and then hexdigest() to emit the checksum.
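The checksum step can also be exercised on its own. The sketch below is an assumption-laden stand-in rather than the module's code: it swaps in Digest::MD5, the successor to the MD5 module, and a plain filehandle in place of Apache::File, and the checksum_line() helper and the temporary test file are hypothetical.

```perl
use strict;
use warnings;
use Digest::MD5 ();
use File::Temp qw(tempfile);

# One digest object reused across calls, as in the module.
my $md5 = Digest::MD5->new;

# Hypothetical helper: returns "<hex digest>\t<label>\n" for a file,
# or nothing if the file cannot be opened.
sub checksum_line {
    my ($file, $label) = @_;
    open my $fh, '<', $file or return;
    binmode $fh;
    $md5->reset;            # clear any state left from a previous file
    $md5->addfile($fh);     # read the whole file into the digest
    close $fh;
    return $md5->hexdigest . "\t$label\n";
}

# Write a small temporary file so the sketch has something to digest.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "hello\n";
close $tmp;

print checksum_line($path, '/archive/software/index.html');
```

Returning nothing on a failed open plays the role that returning DECLINED plays in checksum_handler().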

Because this module is entirely self-contained, it has the simplest configuration of them all:

PerlTransHandler Apache::Checksum3

Like other PerlTransHandler directives, this one must be located in the main part of the configuration file or in a virtual host section.

Example 7-4. Calculating Checksums on the Fly

package Apache::Checksum3;
# file: Apache/Checksum3.pm
use strict;
use Apache::Constants qw(:common);
use Apache::File ();
use MD5 ();
my $MD5 = MD5->new;
sub handler {
   my $r = shift;
   my $uri = $r->uri;
   return DECLINED unless $uri =~ s/\.cksm$//;
   $r->uri($uri);
   $r->handler("perl-script");
   $r->push_handlers(PerlHandler => \&checksum_handler);
   return DECLINED;
}
sub checksum_handler {
   my $r = shift;
   my $file = $r->filename;
   my $sfile = Apache::File->new($file) || return DECLINED;
   $r->content_type('text/plain');
   $r->send_http_header;
   return OK if $r->header_only;
   $MD5->reset;
   $MD5->addfile($sfile);
   $r->print($MD5->hexdigest(),"\t",$r->uri,"\n");
   return OK;
}
1;
__END__

Don't think that you must always write a custom translation handler in order to gain control over the URI translation phase. The mod_rewrite module gives you great power to customize this phase. For example, by adding a mod_rewrite RewriteRule directive, you can define a substitution rule that transforms requests for .cksm URIs into requests for files in the checksum directory, doing in a single line what our first example of a translation handler did in 17 lines.
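As a rough illustration, a rule along the following lines could stand in for the URI-rewriting handler of Example 7-3. It is a sketch under assumptions rather than a tested configuration: it assumes mod_rewrite is compiled in, that the directives sit in the main server or virtual host configuration, and that the same /checksums Alias is in place. The PT (pass through) flag hands the rewritten URI on to mod_alias so that the Alias directive can complete the translation.

```apache
RewriteEngine On
# /archive/software/index.html.cksm -> /checksums/archive/software/index.html
RewriteRule   ^(/.+)\.cksm$  /checksums$1  [PT]
Alias         /checksums/    /home/www/checksums/
```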
