Sophie

Sophie

distrib > Mandriva > 2010.0 > i586 > media > contrib-release > by-pkgid > 1a3cbdc4584a84178934b536f4bcebeb > files > 7

perl-Apache-Filter-1.024-6mdv2010.0.noarch.rpm

NAME
    Apache::Filter - Alter the output of previous handlers

SYNOPSIS
      #### In httpd.conf:
      PerlModule Apache::Filter
      # That's it - this isn't a handler.
  
      <Files ~ "*\.blah">
       SetHandler perl-script
       PerlSetVar Filter On
       PerlHandler Filter1 Filter2 Filter3
      </Files>
  
      #### In Filter1, Filter2, and Filter3:
      $r = $r->filter_register();  # Required
      my $fh = $r->filter_input(); # Optional (you might not need the input FH)
      while (<$fh>) {
        s/ something / something else /;
        print;
      }
  
      #### or, alternatively:
      $r = $r->filter_register();
      my ($fh, $status) = $r->filter_input(); # Get status information
      return $status unless $status == OK;
      while (<$fh>) {
        s/ something / something else /;
        print;
      }

DESCRIPTION
    In basic operation, each of the handlers Filter1, Filter2, and Filter3 will
    make a call to $r->filter_input(), which will return a filehandle. For
    Filter1, the filehandle points to the requested file. For Filter2, the
    filehandle contains whatever Filter1 wrote to STDOUT. For Filter3, it
    contains whatever Filter2 wrote to STDOUT. The output of Filter3 goes
    directly to the browser.

    Note that the modules Filter1, Filter2, and Filter3 are listed in forward
    order, in contrast to the reverse-order listing of Apache::OutputChain.

    When you've got this module, you can use the same handler both as a
    stand-alone handler, and as an element in a chain. Just make sure that
    whenever you're chaining, all the handlers in the chain are "Filter-aware,"
    i.e. they each call $r->filter_register() exactly once, before they start
    printing to STDOUT. There should be almost no overhead for doing this when
    there's only one element in the chain.

    Currently the following public modules are Filter-aware. Please tell me of
    others you know about.

     Apache::Registry (using Apache::RegistryFilter, included here)
     Apache::SSI
     Apache::ASP
     HTML::Mason
     Apache::SimpleReplace
     Apache::HTML::ClassParser (part of HTML_Tree distribution)

METHODS
    Apache::Filter is a subclass of Apache, so all Apache methods are available.

    This module doesn't create an Apache handler class of its own - rather, it
    adds some methods to the Apache:: class. Thus, it's really a mix-in package
    that just adds functionality to the $r request object.

    * $r = $r->filter_register()
        Every Filter-aware module must call this method exactly once, so that
        Apache::filter can properly rotate its filters from previous handlers,
        and so it can know when the output should eventually go to the browser.

    * $r->filter_input()
        This method will give you a filehandle that contains either the file
        requested by the user ($r->filename), or the output of a previous
        filter. If called in a scalar context, that filehandle is all you'll get
        back. If called in a list context, you'll also get an Apache status code
        (OK, NOT_FOUND, or FORBIDDEN) that tells you whether $r->filename was
        successfully found and opened. If it was not, the filehandle returned
        will be undef.

    * $r->changed_since($time)
        Returns true or false based on whether the current input seems like it
        has changed since "$time". Currently the criteria to figure this out is
        this: if the file pointed to by "$r->finfo" hasn't changed since the
        time given, and if all previous filters in the chain are deterministic
        (see below), then we return false. Otherwise we return true.

        This method is meant to be useful in implementing caching schemes.

        A caution: always call the "changed_since()" and "deterministic()"
        methods AFTER the "filter_register()" method. This is because
        Apache::Filter uses a crude counting method to figure out which handler
        in the chain is currently executing, and calling these routines out of
        order messes up the counting.

    * $r->deterministic(1|0);
        As of version 0.07, the concept of a "deterministic" filter is
        supported. A deterministic filter is one whose output is entirely
        determined by the contents of its input file (whether the $r->filename
        file or the output of another filter), and doesn't depend at all on
        outside factors. For example, a filter that translates all its output to
        upper-case is deterministic, but a filter that adds a date stamp to a
        page, or looks things up in a database which may vary over time, is not.

        Why is this a big deal? Let's say you have the following setup:

         <Files ~ "\.boffo$">
          SetHandler perl-script
          PerlSetVar Filter On
          PerlHandler Apache::FormatNumbers Apache::DoBigCalculation
          # The above are fake modules, you get the idea
         </Files>

        Suppose the FormatNumbers module is deterministic, and the
        DoBigCalculation module takes a long time to run. The DoBigCalculation
        module can now cache its results, so that when an input file is
        unchanged on disk, its results will remain known when passed through the
        FormatNumbers module, and the DoBigCalculation module will be able to
        used cached results from a previous run.

        The guts of the modules would look something like this:

         sub Apache::FormatNumbers::handler {
            my $r = shift;
            $r->content_type("text/html");
            my ($fh, $status) = $r->filter_input();
            return $status unless $status == OK;
            $r->deterministic(1); # Set to true; default is false
    
            # ... do some formatting, print to STDOUT
            return OK;
         }
 
         sub Apache::DoBigCalculation::handler {
            my $r = shift;
            $r->content_type("text/html");
            my ($fh, $status) = $r->filter_input();
            return $status unless $status == OK;
    
            # This module implements a caching scheme by using the 
            # %cache_time and %cache_content hashes.
            my $time = $cache_time{$r->filename};
            my $output;
            if ($r->changed_since($time)) {
                # Read from <$fh>, perform a big calculation on it, and print to STDOUT
            } else {
                print $cache_content{$r->filename};
            }
    
            return OK;
         }

        A caution: always call the "changed_since()" and "deterministic()"
        methods AFTER the "filter_register()" method. This is because
        Apache::Filter uses a crude counting method to figure out which handler
        in the chain is currently executing, and calling these routines out of
        order messes up the counting.

HEADERS
    In previous releases of this module, it was dangerous to call
    $r->send_http_header(), because a previous/postvious filter might also try
    to send headers, and then you'd have duplicate headers getting sent. In
    current releases you can simply send the headers. If the current filter is
    the last filter, the headers will be sent as usual, and otherwise
    send_http_header() is a no-op.

NOTES
    You'll notice in the SYNOPSIS that I say ""PerlSetVar Filter On"". That
    information isn't actually used by this module, it's used by modules which
    are themselves filters (like Apache::SSI). I hereby suggest that filtering
    modules use this parameter, using it as the switch to detect whether they
    should call $r->filter_register. However, it's often not necessary - there
    is very little overhead in simply calling $r->filter_register even when you
    don't need to do any filtering, and $r->filter_input can be a handy way of
    opening the $r->filename file.

    VERY IMPORTANT: if one handler in a stacked handler chain uses
    "Apache::Filter", then THEY ALL MUST USE IT. This means they all must call
    $r->filter_register exactly once. Otherwise "Apache::Filter" couldn't
    capture the output of the handlers properly, and it wouldn't know when to
    release the output to the browser.

    The output of each filter (except the last) is accumulated in memory before
    it's passed to the next filter, so memory requirements are large for large
    pages. Apache::OutputChain only needs to keep one item from print()'s
    argument list in memory at a time, so it doesn't have this problem, but
    there are others (each chunk is filtered independently, so content spanning
    several chunks won't be properly parsed). In future versions I might find a
    way around this, or cache large pages to disk so memory requirements don't
    get out of hand. We'll see whether it's a problem.

    A couple examples of filters are provided with this distribution in the t/
    subdirectory: UC.pm converts all its input to upper-case, and Reverse.pm
    prints the lines of its input reversed.

    Finally, a caveat: in version 0.09 I started explicitly setting the
    Content-Length to undef. This prevents early filters from incorrectly
    setting the content length, which will almost certainly be wrong if there
    are any filters after it. This means that if you write any filters which set
    the content length, they should do it after the $r->filter_register call.

TO DO
    Add a buffered mode to the final output, so that we can send a proper
    Content-Length header. [gozer@hbesoftware.com (Philippe M. Chiasson)]

BUGS
    This uses some funny stuff to figure out when the currently executing
    handler is the last handler in the chain. As a result, code that manipulates
    the handler list at runtime (using push_handlers and the like) might produce
    mayhem. Poke around a bit in the code before you try anything. Let me know
    if you have a better idea.

    As of 0.07, Apache::Filter will automatically return DECLINED when
    $r->filename points to a directory. This is just because in most cases this
    is what you want to do (so that mod_dir can take care of the request), and
    because figuring out the "right" way to handle directories seems pretty
    tough - the right way would allow a directory indexing handler to be a
    filter, which isn't possible now. Also, you can't properly pass control to a
    non-mod_perl indexer like mod_autoindex. Suggestions are welcome.

    I haven't considered what will happen if you use this and you haven't turned
    on PERL_STACKED_HANDLERS. So don't do it.

AUTHOR
    Ken Williams (ken@forum.swarthmore.edu)

COPYRIGHT
    Copyright 1998,1999,2000 Ken Williams. All rights reserved.

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

SEE ALSO
    perl(1).