# ----- Example 7 - Spider using "http" method ------- # # Please see the swish-e documentation for # information on configuration directives. # Documentation is included with the swish-e # distribution, and also can be found on-line # at http://swish-e.org # # # This example demonstrates how to use the # the "http" method of spidering. # # Indexing (spidering) is started with the following # command issued from the "conf" directory: # # swish-e -S http -c example7.config # # Note: You should have the current Bundle::LWP bundle # of perl modules installed. This was tested with: # libwww-perl-5.53 # # ** Do not spider a web server without permission ** # #--------------------------------------------------- # Include our site-wide configuration settings: IncludeConfigFile example4.config # Specify the URL (or URLs) to index: IndexDir http://swish-e.org # If a server goes by more than one name you can use this directive: EquivalentServer http://swish-e.org http://www.swish-e.org # This defines how many links the spider should # follow before stopping. A value of 0 configures the spider to # traverse all links. The default is 5 # The idea is to limit spidering, but seems of questionable use # since depth may not be related to anything useful. MaxDepth 10 # The number of seconds to wait between issuing # requests to a server. The default is 60 seconds. Delay 1 # (default /var/tmp) The location of a writeable temp directory # on your system. The HTTP access method tells the Perl helper to place # its files there. The default is defined in src/config.h and depends on # the current OS. TmpDir . # The "http" method uses a perl helper program to fetch each document # from the web called "swishspider" and is included in the src directory of # the swish-e distribution. SpiderDirectory ../src # end of example