Sophie

Sophie

distrib > Mandriva > 2010.0 > i586 > media > contrib-release > by-pkgid > 70ec89744a04da80369b4702b2c37256 > files > 217

ghc-doc-6.10.4-1mdv2010.0.i586.rpm

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!--Rendered using the Haskell Html Library v0.2-->
<HTML
><HEAD
><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8"
><TITLE
>Data.HashTable</TITLE
><LINK HREF="haddock.css" REL="stylesheet" TYPE="text/css"
><SCRIPT SRC="haddock-util.js" TYPE="text/javascript"
></SCRIPT
><SCRIPT TYPE="text/javascript"
>window.onload = function () {setSynopsis("mini_Data-HashTable.html")};</SCRIPT
></HEAD
><BODY
><TABLE CLASS="vanilla" CELLSPACING="0" CELLPADDING="0"
><TR
><TD CLASS="topbar"
><TABLE CLASS="vanilla" CELLSPACING="0" CELLPADDING="0"
><TR
><TD
><IMG SRC="haskell_icon.gif" WIDTH="16" HEIGHT="16" ALT=" "
></TD
><TD CLASS="title"
>base-4.1.0.0: Basic libraries</TD
><TD CLASS="topbut"
><A HREF="index.html"
>Contents</A
></TD
><TD CLASS="topbut"
><A HREF="doc-index.html"
>Index</A
></TD
></TR
></TABLE
></TD
></TR
><TR
><TD CLASS="modulebar"
><TABLE CLASS="vanilla" CELLSPACING="0" CELLPADDING="0"
><TR
><TD
><FONT SIZE="6"
>Data.HashTable</FONT
></TD
><TD ALIGN="right"
><TABLE CLASS="narrow" CELLSPACING="0" CELLPADDING="0"
><TR
><TD CLASS="infohead"
>Portability</TD
><TD CLASS="infoval"
>portable</TD
></TR
><TR
><TD CLASS="infohead"
>Stability</TD
><TD CLASS="infoval"
>provisional</TD
></TR
><TR
><TD CLASS="infohead"
>Maintainer</TD
><TD CLASS="infoval"
>libraries@haskell.org</TD
></TR
></TABLE
></TD
></TR
></TABLE
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD
><TABLE CLASS="vanilla" CELLSPACING="0" CELLPADDING="0"
><TR
><TD CLASS="section4"
><B
>Contents</B
></TD
></TR
><TR
><TD
><DL
><DT
><A HREF="#1"
>Basic hash table operations
</A
></DT
><DT
><A HREF="#2"
>Converting to and from lists
</A
></DT
><DT
><A HREF="#3"
>Hash functions
</A
></DT
><DT
><A HREF="#4"
>Diagnostics
</A
></DT
></DL
></TD
></TR
></TABLE
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="section1"
>Description</TD
></TR
><TR
><TD CLASS="doc"
>An implementation of extensible hash tables, as described in
 Per-Ake Larson, <EM
>Dynamic Hash Tables</EM
>, CACM 31(4), April 1988,
 pp. 446--457.  The implementation is also derived from the one
 in GHC's runtime system (<TT
>ghc/rts/Hash.{c,h}</TT
>).
</TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="section1"
>Synopsis</TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="body"
><TABLE CLASS="vanilla" CELLSPACING="0" CELLPADDING="0"
><TR
><TD CLASS="decl"
><SPAN CLASS="keyword"
>data</SPAN
>  <A HREF="#t%3AHashTable"
>HashTable</A
> key val</TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3Anew"
>new</A
> ::  (key -&gt; key -&gt; <A HREF="../ghc-prim/GHC-Bool.html#t%3ABool"
>Bool</A
>) -&gt; (key -&gt; <A HREF="Data-Int.html#t%3AInt32"
>Int32</A
>) -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> (<A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val)</TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3Ainsert"
>insert</A
> ::  <A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val -&gt; key -&gt; val -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> <A HREF="../ghc-prim/GHC-Unit.html#t%3A%28%29"
>()</A
></TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3Adelete"
>delete</A
> ::  <A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val -&gt; key -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> <A HREF="../ghc-prim/GHC-Unit.html#t%3A%28%29"
>()</A
></TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3Alookup"
>lookup</A
> ::  <A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val -&gt; key -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> (<A HREF="Data-Maybe.html#t%3AMaybe"
>Maybe</A
> val)</TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3Aupdate"
>update</A
> ::  <A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val -&gt; key -&gt; val -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> <A HREF="../ghc-prim/GHC-Bool.html#t%3ABool"
>Bool</A
></TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3AfromList"
>fromList</A
> :: <A HREF="Data-Eq.html#t%3AEq"
>Eq</A
> key =&gt; (key -&gt; <A HREF="Data-Int.html#t%3AInt32"
>Int32</A
>) -&gt; [(key, val)] -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> (<A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val)</TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3AtoList"
>toList</A
> ::  <A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> [(key, val)]</TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3AhashInt"
>hashInt</A
> :: <A HREF="../ghc-prim/GHC-Types.html#t%3AInt"
>Int</A
> -&gt; <A HREF="Data-Int.html#t%3AInt32"
>Int32</A
></TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3AhashString"
>hashString</A
> :: <A HREF="Data-Char.html#t%3AString"
>String</A
> -&gt; <A HREF="Data-Int.html#t%3AInt32"
>Int32</A
></TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3Aprime"
>prime</A
> :: <A HREF="Data-Int.html#t%3AInt32"
>Int32</A
></TD
></TR
><TR
><TD CLASS="s8"
></TD
></TR
><TR
><TD CLASS="decl"
><A HREF="#v%3AlongestChain"
>longestChain</A
> ::  <A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> [(key, val)]</TD
></TR
></TABLE
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="section1"
><A NAME="1"
><A NAME="1"
>Basic hash table operations
</A
></A
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><SPAN CLASS="keyword"
>data</SPAN
>  <A NAME="t:HashTable"
><A NAME="t%3AHashTable"
></A
></A
><B
>HashTable</B
> key val </TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v:new"
><A NAME="v%3Anew"
></A
></A
><B
>new</B
></TD
></TR
><TR
><TD CLASS="body"
><TABLE CLASS="vanilla" CELLSPACING="0" CELLPADDING="0"
><TR
><TD CLASS="arg"
>:: </TD
><TD CLASS="rdoc"
></TD
></TR
><TR
><TD CLASS="arg"
>=&gt; key -&gt; key -&gt; <A HREF="../ghc-prim/GHC-Bool.html#t%3ABool"
>Bool</A
></TD
><TD CLASS="rdoc"
><TT
>eq</TT
>: An equality comparison on keys
</TD
></TR
><TR
><TD CLASS="arg"
>-&gt; key -&gt; <A HREF="Data-Int.html#t%3AInt32"
>Int32</A
></TD
><TD CLASS="rdoc"
><TT
>hash</TT
>: A hash function on keys
</TD
></TR
><TR
><TD CLASS="arg"
>-&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> (<A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val)</TD
><TD CLASS="rdoc"
>Returns: an empty hash table
</TD
></TR
><TR
><TD CLASS="ndoc" COLSPAN="2"
><P
>Creates a new hash table.  The following property should hold for the <TT
>eq</TT
>
 and <TT
>hash</TT
> functions passed to <TT
><A HREF="Data-HashTable.html#v%3Anew"
>new</A
></TT
>:
</P
><PRE
>   eq A B  =&gt;  hash A == hash B
</PRE
></TD
></TR
></TABLE
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v:insert"
><A NAME="v%3Ainsert"
></A
></A
><B
>insert</B
> ::  <A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val -&gt; key -&gt; val -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> <A HREF="../ghc-prim/GHC-Unit.html#t%3A%28%29"
>()</A
></TD
></TR
><TR
><TD CLASS="doc"
><P
>Inserts a key/value mapping into the hash table.
</P
><P
>Note that <TT
><A HREF="Data-HashTable.html#v%3Ainsert"
>insert</A
></TT
> doesn't remove the old entry from the table -
 the behaviour is like an association list, where <TT
><A HREF="Data-HashTable.html#v%3Alookup"
>lookup</A
></TT
> returns
 the most-recently-inserted mapping for a key in the table.  The
 reason for this is to keep <TT
><A HREF="Data-HashTable.html#v%3Ainsert"
>insert</A
></TT
> as efficient as possible.  If
 you need to update a mapping, then we provide <TT
><A HREF="Data-HashTable.html#v%3Aupdate"
>update</A
></TT
>.
</P
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v:delete"
><A NAME="v%3Adelete"
></A
></A
><B
>delete</B
> ::  <A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val -&gt; key -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> <A HREF="../ghc-prim/GHC-Unit.html#t%3A%28%29"
>()</A
></TD
></TR
><TR
><TD CLASS="doc"
>Remove an entry from the hash table.
</TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v:lookup"
><A NAME="v%3Alookup"
></A
></A
><B
>lookup</B
> ::  <A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val -&gt; key -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> (<A HREF="Data-Maybe.html#t%3AMaybe"
>Maybe</A
> val)</TD
></TR
><TR
><TD CLASS="doc"
>Looks up the value of a key in the hash table.
</TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v:update"
><A NAME="v%3Aupdate"
></A
></A
><B
>update</B
> ::  <A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val -&gt; key -&gt; val -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> <A HREF="../ghc-prim/GHC-Bool.html#t%3ABool"
>Bool</A
></TD
></TR
><TR
><TD CLASS="doc"
><P
>Updates an entry in the hash table, returning <TT
><A HREF="../ghc-prim/GHC-Bool.html#v%3ATrue"
>True</A
></TT
> if there was
 already an entry for this key, or <TT
><A HREF="../ghc-prim/GHC-Bool.html#v%3AFalse"
>False</A
></TT
> otherwise.  After <TT
><A HREF="Data-HashTable.html#v%3Aupdate"
>update</A
></TT
>
 there will always be exactly one entry for the given key in the table.
</P
><P
><TT
><A HREF="Data-HashTable.html#v%3Ainsert"
>insert</A
></TT
> is more efficient than <TT
><A HREF="Data-HashTable.html#v%3Aupdate"
>update</A
></TT
> if you don't care about
 multiple entries, or you know for sure that multiple entries can't
 occur.  However, <TT
><A HREF="Data-HashTable.html#v%3Aupdate"
>update</A
></TT
> is more efficient than <TT
><A HREF="Data-HashTable.html#v%3Adelete"
>delete</A
></TT
> followed
 by <TT
><A HREF="Data-HashTable.html#v%3Ainsert"
>insert</A
></TT
>.
</P
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="section1"
><A NAME="2"
><A NAME="2"
>Converting to and from lists
</A
></A
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v:fromList"
><A NAME="v%3AfromList"
></A
></A
><B
>fromList</B
> :: <A HREF="Data-Eq.html#t%3AEq"
>Eq</A
> key =&gt; (key -&gt; <A HREF="Data-Int.html#t%3AInt32"
>Int32</A
>) -&gt; [(key, val)] -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> (<A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val)</TD
></TR
><TR
><TD CLASS="doc"
>Convert a list of key/value pairs into a hash table.  Equality on keys
 is taken from the Eq instance for the key type.
</TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v:toList"
><A NAME="v%3AtoList"
></A
></A
><B
>toList</B
> ::  <A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> [(key, val)]</TD
></TR
><TR
><TD CLASS="doc"
>Converts a hash table to a list of key/value pairs.
</TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="section1"
><A NAME="3"
><A NAME="3"
>Hash functions
</A
></A
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="doc"
><P
>This implementation of hash tables uses the low-order <EM
>n</EM
> bits of the hash
 value for a key, where <EM
>n</EM
> varies as the hash table grows.  A good hash
 function therefore will give an even distribution regardless of <EM
>n</EM
>.
</P
><P
>If your keyspace is integrals such that the low-order bits between
 keys are highly variable, then you could get away with using <TT
><A HREF="Prelude.html#v%3AfromIntegral"
>fromIntegral</A
></TT
>
 as the hash function.
</P
><P
>We provide some sample hash functions for <TT
><A HREF="../ghc-prim/GHC-Types.html#t%3AInt"
>Int</A
></TT
> and <TT
><A HREF="Data-Char.html#t%3AString"
>String</A
></TT
> below.
</P
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v:hashInt"
><A NAME="v%3AhashInt"
></A
></A
><B
>hashInt</B
> :: <A HREF="../ghc-prim/GHC-Types.html#t%3AInt"
>Int</A
> -&gt; <A HREF="Data-Int.html#t%3AInt32"
>Int32</A
></TD
></TR
><TR
><TD CLASS="doc"
><P
>A sample (and useful) hash function for Int and Int32,
 implemented by extracting the uppermost 32 bits of the 64-bit
 result of multiplying by a 33-bit constant.  The constant is from
 Knuth, derived from the golden ratio:
</P
><PRE
> golden = round ((sqrt 5 - 1) * 2^32)
</PRE
><P
>We get good key uniqueness on small inputs
 (a problem with previous versions):
  (length $ group $ sort $ map hashInt [-32767..65536]) == 65536 + 32768
</P
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v:hashString"
><A NAME="v%3AhashString"
></A
></A
><B
>hashString</B
> :: <A HREF="Data-Char.html#t%3AString"
>String</A
> -&gt; <A HREF="Data-Int.html#t%3AInt32"
>Int32</A
></TD
></TR
><TR
><TD CLASS="doc"
><P
>A sample hash function for Strings.  We keep multiplying by the
 golden ratio and adding.  The implementation is:
</P
><PRE
> hashString = foldl' f golden
   where f m c = fromIntegral (ord c) * magic + hashInt32 m
         magic = 0xdeadbeef
</PRE
><P
>Where hashInt32 works just as hashInt shown above.
</P
><P
>Knuth argues that repeated multiplication by the golden ratio
 will minimize gaps in the hash space, and thus it's a good choice
 for combining together multiple keys to form one.
</P
><P
>Here we know that individual characters c are often small, and this
 produces frequent collisions if we use ord c alone.  A
 particular problem are the shorter low ASCII and ISO-8859-1
 character strings.  We pre-multiply by a magic twiddle factor to
 obtain a good distribution.  In fact, given the following test:
</P
><PRE
> testp :: Int32 -&gt; Int
 testp k = (n - ) . length . group . sort . map hs . take n $ ls
   where ls = [] : [c : l | l &lt;- ls, c &lt;- ['\0'..'\xff']]
         hs = foldl' f golden
         f m c = fromIntegral (ord c) * k + hashInt32 m
         n = 100000
</PRE
><P
>We discover that testp magic = 0.
</P
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v:prime"
><A NAME="v%3Aprime"
></A
></A
><B
>prime</B
> :: <A HREF="Data-Int.html#t%3AInt32"
>Int32</A
></TD
></TR
><TR
><TD CLASS="doc"
>A prime larger than the maximum hash table size
</TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="section1"
><A NAME="4"
><A NAME="4"
>Diagnostics
</A
></A
></TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="decl"
><A NAME="v:longestChain"
><A NAME="v%3AlongestChain"
></A
></A
><B
>longestChain</B
> ::  <A HREF="Data-HashTable.html#t%3AHashTable"
>HashTable</A
> key val -&gt; <A HREF="System-IO.html#t%3AIO"
>IO</A
> [(key, val)]</TD
></TR
><TR
><TD CLASS="doc"
>This function is useful for determining whether your hash
 function is working well for your data set.  It returns the longest
 chain of key/value pairs in the hash table for which all the keys
 hash to the same bucket.  If this chain is particularly long (say,
 longer than 14 elements or so), then it might be a good idea to try
 a different hash function.
</TD
></TR
><TR
><TD CLASS="s15"
></TD
></TR
><TR
><TD CLASS="botbar"
>Produced by <A HREF="http://www.haskell.org/haddock/"
>Haddock</A
> version 2.4.2</TD
></TR
></TABLE
></BODY
></HTML
>