<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html lang="en">
<head>
 <title>mod_replace: Documentation</title>
 <meta name="description" content="mod_replace documentation">
 <meta name="keywords" content="mod_replace docs documentation">
 <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#ffffff">
<h1>mod_replace: Documentation</h1>

<h2>What is <i>mod_replace</i>?</h2>

<p>
<i>mod_replace</i> is a simple Apache 2.0.x filter module that was originally
developed from <i>mod_ext_filter</i>. Its initial purpose was to support an
Apache-based reverse proxy alongside <i>mod_rewrite</i>: <i>mod_rewrite</i>
cannot handle absolute URLs contained in the HTTP body, so the body content
previously had to be rewritten by a slow <i>mod_perl</i> solution.
</p>

<p>
The C-based <i>mod_replace</i> in its original version provided a much faster
approach to this problem. Since then, HTTP header replacement (e.g. for cookie
adjustments) has been added.
</p>

<p>
So far, <i>mod_replace</i> has only been used alongside <i>mod_proxy</i> to
provide an improved reverse proxy experience. It greatly helps to sanitize
ill-behaved web servers and applications, for example ones that emit absolute
links within web pages or absolute links in HTTP headers that are not handled
by <i>mod_proxy</i> (e.g. Set-Cookie).
</p>

<h2>How does it work?</h2>

<p>
There are currently three distinct mechanisms for pattern replacement within
<i>mod_replace</i>:

<ul>
 <li>HTTP response body replacements
  <ul>
   <li>The most powerful replacements, since subpatterns are supported</li>
   <li>The most commonly used feature of <i>mod_replace</i></li>
   <li>Based upon Apache's filter mechanism (output filter)</li>
  </ul>
 </li>
 <li>HTTP response header replacements
  <ul>
   <li>Suitable for rewriting HTTP header information of server responses</li>
   <li>No support for subpatterns (still to come)</li>
   <li>Based upon Apache's filter mechanism (output filter)</li>
  </ul>
 </li>
 <li>HTTP request header replacements
  <ul>
   <li>Suitable for rewriting HTTP header information of client requests before
       they reach the server (e.g. on a reverse proxy)</li>
   <li><b>Not</b> based upon Apache's filter mechanism</li>
  </ul>
 </li>
</ul>
</p>

<p>
When an HTTP response is routed through an HTTP body filter, there are a few
things you should be aware of:

<ul>
 <li>The filter has to assemble the whole response before it starts processing
     the content. Apache stores the data in parts called buckets. If each
     bucket were processed on its own, any pattern that extends across a
     bucket boundary would be missed. The filter therefore concatenates all
     buckets into a single data structure, which takes time and resources!</li>
 <li>Multiple patterns for the same filter definition are kept in a linked
     list and processed sequentially. The pattern defined first is the first
     one applied by the filter. Once that run is complete, the next pattern
     is applied to the already altered data. If you define multiple patterns,
     each page is therefore processed multiple times. Try to solve your
     problem with as few patterns as possible.</li>
 <li>Once all patterns have been processed, the data is passed to the next
     filter. If this is another HTTP body filter, the whole procedure is
     repeated.</li>
</ul>
</p>
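
<p>
As a hedged illustration of the last recommendation, two similar patterns can
often be merged into one by using a subpattern. The sketch below assumes two
hypothetical origin hosts, <tt>a.origin.server</tt> and
<tt>b.origin.server</tt>, and a hypothetical filter name <tt>merged</tt>; it
uses only the directives described in the Configuration section.
</p>

<pre>
  # Two passes over every page, one per pattern:
  #   ReplacePattern merged "http://a[.]origin[.]server/" "http://revproxy/a/"
  #   ReplacePattern merged "http://b[.]origin[.]server/" "http://revproxy/b/"

  # One pass: the host prefix is captured and reused as \1
  ReplaceFilterDefine merged intype=text/html
  ReplacePattern merged "http://(a|b)[.]origin[.]server/" "http://revproxy/\1/"
  SetOutputFilter merged
</pre>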

<p>
You can define multiple patterns for a single filter definition: simply add
another <i>ReplacePattern</i> with the same name as the previous one (see
examples below).
</p>

<p>
With an HTTP response header filter, the process is almost the same as above.
The patterns are matched sequentially against the data, and the replacements
for one pattern are made before the next pattern is processed.
</p>

<p>
One special feature of this filter is that it doesn't stop looking for
matching headers once it has found one. Stopping would seem reasonable,
since a given HTTP header normally occurs only once per response, but there
is one common case where the same header occurs several times, each with
different content: Set-Cookie.
</p>
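
<p>
For illustration, assume a pattern that rewrites the cookie domain
<tt>server.com</tt> to <tt>revproxy.com</tt>. Both names, the filter name
<tt>revproxy</tt>, and the cookie values are placeholders:
</p>

<pre>
  ReplaceFilterDefine revproxy CaseIgnore
  HeaderReplacePattern revproxy Set-Cookie \
    " domain=[.]?server.com" \
    " domain=revproxy.com"
  SetOutputFilter revproxy
</pre>

<p>
Because the filter keeps looking after the first match, a response carrying
two Set-Cookie headers would be expected to have both of them rewritten. The
first pair of lines shows the headers from the origin server, the second pair
the headers after the filter:
</p>

<pre>
  Set-Cookie: UID=0815; domain=server.com; path=/
  Set-Cookie: LANG=en; domain=server.com; path=/

  Set-Cookie: UID=0815; domain=revproxy.com; path=/
  Set-Cookie: LANG=en; domain=revproxy.com; path=/
</pre>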

<p>
The HTTP request header filter is quite different from the other filters,
because it doesn't use the same mechanism within Apache. If it used the
filter mechanism (e.g. as an input filter), any request that is also routed
through <i>mod_proxy</i> (using Apache as a reverse proxy) would first be
processed by <i>mod_proxy</i> and only then by <i>mod_replace</i>. By that
time <i>mod_proxy</i> has already created the request to the origin server,
so any modifications <i>mod_replace</i> made to the HTTP header would simply
be discarded.
</p>

<p>
The mechanism used by <i>mod_replace</i> for modifying the request header
allows you to alter the HTTP header before <i>mod_proxy</i> processes the
request. The same rules apply to the patterns: multiple patterns are kept in
a linked list and processed sequentially. Note: there is only one "filter"
for all request header patterns. You don't need to create a named definition
or set the output filter, but you also cannot specify additional parameters.
</p>
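
<p>
A minimal sketch of such a setup, assuming a reverse proxy that forwards
<tt>/app/</tt> to a placeholder host <tt>origin.server</tt>.
<tt>ProxyPass</tt> and <tt>ProxyPassReverse</tt> are standard
<i>mod_proxy</i> directives; the Referer rewrite and the host names are only
an illustrative use case, not something prescribed by <i>mod_replace</i>.
</p>

<pre>
  ProxyPass        /app/ http://origin.server/app/
  ProxyPassReverse /app/ http://origin.server/app/

  # Applied to the client request before mod_proxy builds the request to
  # the origin server; no ReplaceFilterDefine or SetOutputFilter is needed
  RequestHeaderPattern Referer "http://revproxy/" "http://origin.server/"
</pre>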

<h2>Configuration</h2>

<h3>Configuring an HTTP body filter</h3>

<h4>Syntax</h4>

<p>
<tt>ReplaceFilterDefine &lt;name&gt; [&lt;options&gt; ...]</tt>
</p>
<table border=1>
 <tr>
  <th>Option</th>
  <th>Description</th>
 </tr>
 <tr>
  <td><tt>&lt;name&gt;</tt></td>
  <td>The name of the filter definition. Used to distinguish multiple filters
      (not patterns) and to selectively activate filters.</td>
 </tr>
 <tr>
  <td><tt>&lt;options&gt;</tt></td>
  <td>Configuration options for this filter definition.<br><br>
      <table>
       <tr>
        <td><tt>CaseIgnore</tt></td>
        <td>Pattern matching is case insensitive. Don't set this option if you 
            want your patterns to be matched case sensitive!</td>
       </tr>
      <tr>
       <td><tt>intype=&lt;mime&gt;</tt></td>
       <td>Narrows the pattern matching to HTTP responses with the specified 
           MIME type (e.g. text/html). Be careful if you use this option with
           HTTP header patterns.</td>
      </tr>
     </table></td>
 </tr>
</table>

<p>
<tt>ReplacePattern &lt;name&gt; &lt;pattern&gt; &lt;string&gt;</tt>
</p>

<table border=1>
 <tr>
  <th>Option</th>
  <th>Description</th>
 </tr>
 <tr>
  <td><tt>&lt;name&gt;</tt></td>
  <td>The name of the filter definition to which this pattern is added. Be
      sure to define the filter with the ReplaceFilterDefine directive.</td>
 </tr>
  <tr>
  <td><tt>&lt;pattern&gt;</tt></td>
  <td>A PCRE (Perl-compatible regular expression) pattern. This pattern is 
      matched against the HTTP body coming from the server. You may use 
      subpatterns and reference them (up to 9) in the replacement string. See 
      the examples for more information.</td>
 </tr>
  <tr>
  <td><tt>&lt;string&gt;</tt></td>
  <td>The string that is inserted as a replacement if a pattern matches. You 
      may reference up to 9 subpatterns from the original pattern (\0 - \9).
      See the examples.</td>
 </tr>
</table>

<p>
<tt>SetOutputFilter &lt;name&gt;[;&lt;name&gt;]</tt>
</p>

<table border=1>
 <tr>
  <th>Option</th>
  <th>Description</th>
 </tr>
 <tr>
  <td><tt>&lt;name&gt;</tt></td>
  <td>The name of a filter definition that needs to be activated. If there are
      multiple definitions, you have to put semicolons between the names.</td>
 </tr>
</table>

<h4>Examples</h4>

<pre>
  ReplaceFilterDefine revproxy CaseIgnore intype=text/html
  ReplacePattern revproxy "(http|https)://origin.server/" "\1://revproxy/"
  SetOutputFilter revproxy
</pre>

<pre>
  ReplaceFilterDefine multiple CaseIgnore intype=text/html
  ReplacePattern multiple "(http|https)://origin.server/" "\1://revproxy/"
  ReplacePattern multiple "ftp://origin.server" "ftp://public.server/pub"
  SetOutputFilter multiple
</pre>
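
<p>
The following sketch shows two further cases under assumed names: a
replacement string that references two subpatterns, and two filter definitions
activated together. The filter names <tt>links</tt> and <tt>mail</tt>, the
host names, and the mail address are hypothetical.
</p>

<pre>
  # \1 is the scheme, \2 the host prefix captured by the pattern
  ReplaceFilterDefine links intype=text/html
  ReplacePattern links "(http|https)://(www|shop)[.]origin[.]server/" "\1://revproxy/\2/"

  ReplaceFilterDefine mail CaseIgnore intype=text/html
  ReplacePattern mail "webmaster@origin[.]server" "webmaster@revproxy"

  # Activate both definitions; names are separated by semicolons
  SetOutputFilter links;mail
</pre>

<p>
Since each body filter assembles and processes the whole response again,
putting both patterns into a single definition is likely to be cheaper when
they can share the same options.
</p>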

<h3>Configuring an HTTP header filter</h3>

<h4>Syntax</h4>

<pre>
ReplaceFilterDefine &lt;name&gt; [&lt;options&gt; ...]
</pre>

<table border=1>
 <tr>
  <th>Option</th>
  <th>Description</th>
 </tr>
 <tr>
  <td><tt>&lt;name&gt;</tt></td>
  <td>The name of the filter definition. Used to distinguish multiple filters 
      (not patterns) and to selectively activate filters.</td>
 </tr>
 <tr>
  <td><tt>&lt;options&gt;</tt></td>
  <td>Configuration options for this filter definition.
      <table>
       <tr>
        <td><tt>CaseIgnore</tt></td>
        <td>Pattern matching is case insensitive. Don't set this option if you
            want your patterns to be matched case sensitive!</td>
       </tr>
       <tr>
        <td><tt>intype=&lt;mime&gt;</tt></td>
        <td>Narrows the pattern matching to HTTP responses with the specified 
            MIME type (e.g. text/html). Be careful if you use this option with 
            HTTP header patterns.</td>
       </tr>
      </table></td>
 </tr>
</table>

<pre>
HeaderReplacePattern &lt;name&gt; &lt;header&gt; &lt;pattern&gt; &lt;string&gt;
</pre>

<table border=1>
 <tr>
  <th>Option</th>
  <th>Description</th>
 </tr>
 <tr>
  <td><tt>&lt;name&gt;</tt></td>
  <td>The name of the filter definition. Used to distinguish multiple filters
      (not patterns) and to selectively activate filters.</td>
 </tr>
 <tr>
  <td><tt>&lt;header&gt;</tt></td>
  <td>This is the HTTP header that is to be altered. Note: you cannot alter the
      header field name, only its content. E.g. you can alter the domain name
      of a Set-Cookie header, but not change an - obviously wrong - "SetKookie"
      to "Set-Cookie".</td>
 </tr>
 <tr>
  <td><tt>&lt;pattern&gt;</tt></td>
  <td>A PCRE (Perl-compatible regular expression) pattern. This pattern is 
      matched against the content of the specified HTTP response header. You
      can use subpatterns here, but you are not able to reference them in the 
      replacement string (not implemented).</td>
 </tr>
 <tr>
  <td><tt>&lt;string&gt;</tt></td>
  <td>The string that is inserted as a replacement if a pattern matches.</td>
 </tr>
</table>

<pre>
SetOutputFilter &lt;name&gt;[;&lt;name&gt;]
</pre>

<table border=1>
 <tr>
  <th>Option</th>
  <th>Description</th>
 </tr>
 <tr>
  <td><tt>&lt;name&gt;</tt></td>
  <td>The name of a filter definition that needs to be activated. If there are
      multiple definitions, you have to put semicolons between the names.</td>
 </tr>
</table>

<h4>Examples</h4>

<pre>
  ReplaceFilterDefine revproxy CaseIgnore
  HeaderReplacePattern revproxy Set-Cookie \
    " domain=[.]?server.com" \
    " domain=revproxy.com"
  SetOutputFilter revproxy
</pre>

<table bgcolor="#e0e0e0">
 <tr>
  <th>HTTP header before OutputFilter</th>
  <th>HTTP header after OutputFilter</th>
 </tr>
 <tr>
  <td>
   <pre>
    Date: Wed, 07 Apr 2004 13:08:01 GMT
    Server: Apache/1.3.29
    Vary: Accept-Encoding,User-agent
    Set-Cookie: UID=0815; domain=server.com; path=/
    Connection: close
    Content-Type: text/html; charset=iso-8859-1
   </pre>
  </td>
  <td>
   <pre>
    Date: Wed, 07 Apr 2004 13:08:01 GMT
    Server: Apache/1.3.29
    Vary: Accept-Encoding,User-agent
    Set-Cookie: UID=0815; domain=revproxy.com; path=/
    Connection: close
    Content-Type: text/html; charset=iso-8859-1
   </pre>
  </td>
 </tr>
</table>

<h3>Configuring an HTTP request header filter</h3>

<h4>Syntax</h4>

<pre>
RequestHeaderPattern &lt;header&gt; &lt;pattern&gt; &lt;string&gt;
</pre>

<table border=1>
 <tr>
  <th>Option</th>
  <th>Description</th>
 </tr>
 <tr>
  <td><tt>&lt;header&gt;</tt></td>
  <td>This is the HTTP request header that is to be altered. Note: you cannot
      alter the header field name, only its content. E.g. you can alter the
      value of a Cookie header, but you cannot rename the header itself.</td>
 </tr>
 <tr>
  <td><tt>&lt;pattern&gt;</tt></td>
  <td>A PCRE (Perl-compatible regular expression) pattern. This pattern is 
      matched against the content of the specified HTTP request header sent
      by the client. You can use subpatterns here, but you are not able to
      reference them in the replacement string (not implemented).</td>
 </tr>
 <tr>
  <td><tt>&lt;string&gt;</tt></td>
  <td>The string that is inserted as a replacement if a pattern matches.</td>
 </tr>
</table>

<h4>Examples</h4>

<pre>
  RequestHeaderPattern Cookie " UID=0815" " UID=007"
</pre>

</body>
</html>