<?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.0//EN" "http://www.oasis-open.org/docbook/xml/4.0/docbookx.dtd"> <article> <title>The Hoard Memory Allocator</title> <articleinfo> <author> <firstname>Emery</firstname> <surname>Berger</surname> <affiliation>University of Massachusetts Amherst</affiliation> <street>Department of Computer Science</street> <city>Amherst</city> <state>Massachusetts</state> <country>USA</country> <email>emery@cs.umass.edu</email> </author> <pubdate>2004-12-08</pubdate> <revhistory> <revision> <revnumber>1.1</revnumber> <date>2004-12-08</date> <authorinitials>EDB</authorinitials> <revremark>Improved formatting</revremark> </revision> <revision> <revnumber>1.0</revnumber> <date>2004-12-06</date> <authorinitials>EDB</authorinitials> <revremark>First draft</revremark> </revision> </revhistory> <copyright> <year>2004</year> <holder role="mailto:emery@cs.umass.edu">Emery Berger</holder> </copyright> <abstract> Documentation for the Hoard scalable memory allocator, including build and usage directions for several platforms. </abstract> </articleinfo> <!-- add mailing list and website info. proper web links big line between sections --> <blockquote> <para> <emphasis role="strong">hoard:</emphasis> To amass and put away (anything valuable) for preservation, security, or future use; to treasure up: esp. money or wealth. </para> <para><emphasis>Oxford English Dictionary</emphasis> </para> </blockquote> <sect1 id="intro"> <title>Introduction</title> <para> The Hoard memory allocator is a fast, scalable, and memory-efficient memory allocator for shared-memory multiprocessors. It runs on a variety of platforms, including Linux, Solaris, and Windows. </para> <sect2> <title>Why Hoard?</title> <sect3> <title>Contention</title> <para> Multithreaded programs often do not scale because the heap is a bottleneck. 
When multiple threads simultaneously allocate or deallocate memory, the allocator serializes them. Programs that make intensive use of the allocator can actually slow down as the number of processors increases. Your program may be allocation-intensive without your realizing it, for instance if it makes many calls to the C++ Standard Template Library (STL). </para> </sect3> <sect3> <title>False Sharing</title> <para> The allocator can cause other problems for multithreaded code. It can lead to <emphasis>false sharing</emphasis> in your application: threads on different CPUs can end up with memory in the same cache line (the small chunk of memory that a CPU caches as a unit). Accessing these falsely shared cache lines is hundreds of times slower than accessing unshared cache lines. </para> </sect3> <sect3> <title>Blowup</title> <para> Multithreaded programs can also cause the allocator to blow up memory consumption. This effect can multiply the amount of memory needed to run your application by the number of CPUs on your machine: four CPUs could mean that you need four times as much memory. Hoard is a fast allocator that solves all of these problems. </para> </sect3> </sect2> <sect2> <title>How Do I Use Hoard?</title> <para> Hoard is a drop-in replacement for malloc() and related functions. In general, you either link it in or set a single environment variable. You do not have to change your source code in any way. See the sections "Building Hoard" and "Using Hoard" below for platform-specific directions. 
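As a rough illustration of what "no source changes" means, the following sketch shows the kind of allocation-intensive multithreaded code described above. It is a hypothetical file, <filename>alloc_demo.cpp</filename>, that is not part of the Hoard distribution and assumes a C++11 compiler; the names <function>worker</function> and <function>run_workers</function> are invented for this example. The code uses only the standard library and contains nothing Hoard-specific: <programlisting><![CDATA[
// alloc_demo.cpp: hypothetical example, not part of the Hoard distribution.
#include <cstddef>
#include <list>
#include <thread>
#include <vector>

// Each worker repeatedly builds and destroys a std::list, so every
// push_back and every list destruction goes through the heap.
static std::size_t worker(int iterations) {
    std::size_t allocations = 0;
    for (int i = 0; i < iterations; ++i) {
        std::list<unsigned int> items;
        for (unsigned int j = 0; j < 100; ++j) {
            items.push_back(j);   // one heap allocation per list node
            ++allocations;
        }
    }                             // the list's nodes are freed each iteration
    return allocations;
}

// Runs `threads` workers concurrently and returns the total number of
// list nodes allocated across all of them.
std::size_t run_workers(int threads, int iterations) {
    std::vector<std::size_t> counts(static_cast<std::size_t>(threads));
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counts, t, iterations] {
            counts[t] = worker(iterations);
        });
    }
    for (std::thread& th : pool) {
        th.join();
    }
    std::size_t total = 0;
    for (std::size_t c : counts) {
        total += c;
    }
    return total;
}
]]></programlisting> Because every list node comes from the default heap, a program that calls <function>run_workers</function> exercises whichever allocator is linked in or preloaded at run time, with no change to this source. 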
</para> </sect2> <sect2> <title>Who's Using Hoard?</title> <para> Companies using Hoard in their products and servers include <ulink url="http://www.aol.com">AOL</ulink>, <ulink url="http://www.bt.com">British Telecom</ulink>, <ulink url="http://www.businessobjects.com">Business Objects</ulink> (formerly Crystal Decisions), <ulink url="http://www.entrust.com">Entrust</ulink>, <ulink url="http://www.novell.com">Novell</ulink>, <ulink url="http://www.openwave.com">OpenWave Systems</ulink> (for their Typhoon and Twister servers), and <ulink url="http://www.reuters.com">Reuters</ulink>. </para> <para> Open source projects using Hoard include the Bayonne GNU telephony server, the <ulink url="http://supertech.lcs.mit.edu/cilk/">Cilk</ulink> parallel programming language, the <ulink url="http://www.cs.dartmouth.edu/research/DaSSF/index.html">Dartmouth Scalable Simulation Framework</ulink>, and the <ulink url="http://www.gnu.org/software/commoncpp/">GNU Common C++</ulink> system. </para> </sect2> </sect1> <sect1 id="Building Hoard"> <title>Building Hoard</title> <para> You can use the available pre-built binaries or build Hoard yourself. Hoard is written to work on Windows and any variant of UNIX that supports threads, and should compile out of the box. Rather than using Makefiles or configure scripts, Hoard includes custom scripts that all start with the prefix compile. </para> <sect2> <title>Platform-specific directions</title> <sect2> <title>Linux and Solaris Builds</title> <para> You can compile Hoard out of the box for Linux and Solaris using the GNU compilers (g++) just by running the <filename>compile</filename> script: <programlisting> ./compile </programlisting> </para> </sect2> <sect2> <title>Windows Builds</title> <para> There are now three alternative ways of using Hoard with Windows. <itemizedlist> <listitem> <para> The first approach builds a DLL, <filename>libhoard.dll</filename> and its associated library <filename>libhoard.lib</filename>. 
<programlisting> .\compile-dll </programlisting> </para> </listitem> <listitem> <para> The second approach relies on Microsoft Research's <ulink url="http://research.microsoft.com/sn/detours">Detours</ulink>. With Detours, you can take advantage of Hoard without having to relink your applications. Install Detours into <filename class="directory">C:\detours</filename>, and then build the Hoard detours library: <programlisting> .\compile-detours </programlisting> </para> </listitem> <listitem> <para> The third approach generates winhoard, which replaces malloc/new calls in your program <emphasis>and</emphasis> in any DLLs it might use. <programlisting> .\compile-winhoard </programlisting> </para> </listitem> </itemizedlist> </para> </sect2> </sect2> </sect1> <sect1 id="Using Hoard"> <title>Using Hoard</title> <sect2> <title>UNIX</title> <para> In UNIX, you can use the <envar>LD_PRELOAD</envar> variable to use Hoard instead of the system allocator for any program not linked with the "static option" (that's most programs). Below are settings for Linux and Solaris. </para> <sect3> <title>Linux</title> <para> <programlisting> LD_PRELOAD="/path/libhoard.so:/usr/lib/libdl.so" </programlisting> </para> </sect3> <sect3> <title>Solaris</title> <para> Depending on whether you are using the GNU-compiled version (as produced by <filename>compile</filename>) or the Sun Workshop-compiled versions (produced by <filename>compile-sunw</filename>), your settings will be slightly different. 
</para> <informaltable frame="none"> <tgroup cols="2"> <colspec colwidth="1in"/> <thead> <row> <entry>Version</entry> <entry>Setting</entry> </row> </thead> <tbody> <row valign="center"> <entry>GNU-compiled</entry> <entry> <programlisting> LD_PRELOAD="/path/libhoard.so:/usr/lib/libdl.so" </programlisting> </entry> </row> <row valign="center"> <entry>Sun-compiled (<emphasis>32-bit</emphasis>)</entry> <entry> <programlisting> LD_PRELOAD="/path/libhoard_32.so" </programlisting> </entry> </row> <row valign="center"> <entry>Sun-compiled (<emphasis>64-bit</emphasis>)</entry> <entry> <programlisting> LD_PRELOAD="/path/libhoard_64.so:/usr/lib/64/libCrun.so.1:/usr/lib/64/libdl.so" </programlisting> </entry> </row> </tbody> </tgroup> </informaltable> <note> <para> For some security-sensitive applications, Solaris requires that you place libraries used in <envar>LD_PRELOAD</envar> into the <filename class="directory">/usr/lib/secure</filename> directory. In that event, after copying these libraries into <filename class="directory">/usr/lib/secure</filename>, set <envar>LD_PRELOAD</envar> without the absolute locations of the libraries, as follows: <programlisting> LD_PRELOAD="libhoard.so:libCrun.so.1:libdl.so" </programlisting> </para> </note> </sect3> </sect2> <sect2> <title>Windows</title> <para> There are three ways to use Hoard on Windows. </para> <orderedlist> <listitem> Using Detours <para> By using Detours, you can take advantage of Hoard's benefits without relinking your Windows application (as long as it is dynamically linked to the C runtime libraries). You will need to use one of the two included Detours tools (<filename>setdll.exe</filename> or <filename>withdll.exe</filename> in the <filename class="directory">detours/</filename> directory) in conjunction with this version of Hoard. 
To <emphasis>temporarily</emphasis> use Hoard as the allocator for a given application, use <filename>withdll</filename>: <programlisting> withdll -d:hoarddetours.dll myprogram.exe </programlisting> If you want your program to use Hoard without having to invoke <filename>withdll</filename> every time, you can use <filename>setdll</filename> to add it to your executable: <programlisting> setdll -d:hoarddetours.dll myprogram.exe myprogram.exe </programlisting> You can later remove Hoard from your executable as follows: <programlisting> setdll -r:hoarddetours.dll myprogram.exe </programlisting> </para> </listitem> <listitem> Using <filename>winhoard</filename> <para> Another method is to use <filename>winhoard</filename>. Winhoard, like Detours, replaces malloc/new calls from your program and any DLLs it might use (leaving <filename>HeapAlloc</filename> calls intact). One advantage is that it does not require Detours to do this. To use the Winhoard version, link your executable with <filename>usewinhoard.obj</filename> and <filename>winhoard.lib</filename>, and then use <filename>winhoard.dll</filename>: <programlisting> cl /Ox /MD /c usewinhoard.cpp cl /Ox /MD myprogram.cpp usewinhoard.obj winhoard.lib </programlisting> </para> </listitem> <listitem> Using <filename>libhoard</filename> <para> The last method is to link directly with the <filename>libhoard</filename> DLL. This approach is simple, but only suitable for small applications, since it will not affect malloc calls in any other DLL you might load. To use this option, you should put the following into your source code as the very first lines: <programlisting> #if defined(USE_HOARD) #pragma comment(lib, "libhoard.lib") #endif </programlisting> This stanza should be in the first part of a header file included by all of your code. It ensures that Hoard loads before any other library (you will need <filename>libhoard.lib</filename> in your path). 
When you execute your program, as long as <filename>libhoard.dll</filename> is in your path, your program will run with Hoard instead of the system allocator. Note that you must compile your program with the <filename>/MD</filename> flag, as in: <programlisting> cl /MD /G6 /Ox /DUSE_HOARD=1 myprogram.cpp </programlisting> Hoard will not work if you use another switch (like <filename>/MT</filename>) to compile your program. </para> </listitem> </orderedlist> </sect2> </sect1> <sect1 id="FAQs"> <title>Frequently Asked Questions</title> <qandaset> <qandaentry> <question> <para>What kind of applications will Hoard speed up?</para> </question> <answer> <para> Hoard will always improve the performance of multithreaded programs running on multiprocessors that make frequent use of the heap (calls to malloc/free or new/delete, as well as many STL functions). Because Hoard avoids false sharing, Hoard also speeds up programs that only occasionally call heap functions but access heap-allocated objects frequently. </para> </answer> </qandaentry> <qandaentry> <question> <para>I'm using the STL but not seeing any performance improvement. Why not?</para> </question> <answer> <para> In order to benefit from Hoard, you have to tell the STL to use malloc instead of its internal custom memory allocator, as in: <programlisting> typedef list&lt;unsigned int, malloc_alloc&gt; mylist; </programlisting> </para> </answer> </qandaentry> <qandaentry> <question> <para> Have you compared Hoard against mtmalloc or libumem?</para> </question> <answer> <para> Yes. Hoard is much faster than either. 
For example, here's an execution of threadtest on Solaris: <informaltable frame="none"> <tgroup cols="2"> <tbody> <row> <entry>Default:</entry> <entry>4.60 seconds</entry> </row> <row> <entry>Libmtmalloc:</entry> <entry>6.23 seconds</entry> </row> <row> <entry>Libumem:</entry> <entry>5.47 seconds</entry> </row> <row> <entry>Hoard 3.2:</entry> <entry>1.99 seconds</entry> </row> </tbody> </tgroup> </informaltable> </para> </answer> </qandaentry> <qandaentry> <question> <para> What systems does Hoard work on? </para> </question> <answer> <para> Hoard has been successfully tested on numerous Windows, Linux and Solaris systems, including a 4-processor x86 box running Windows NT/2000, a 4-processor x86 box running RedHat Linux 6.0 and 6.1, and a 16-processor Sun Enterprise server running Solaris. </para> </answer> </qandaentry> <qandaentry> <question> <para> Have you compared Hoard with SmartHeap SMP? </para> </question> <answer> <para> We tried SmartHeap SMP but it did not work on our Suns (due to an apparent race condition in the code). </para> </answer> </qandaentry> </qandaset> </sect1> <sect1 id="More Info"> <title>More Information</title> <para> The first place to look for Hoard-related information is at the Hoard web page, <ulink url="http://www.hoard.org">www.hoard.org</ulink>. </para> <para> There are two mailing lists you should consider being on if you are a user of Hoard. If you are just interested in being informed of new releases, join the <ulink url="http://groups.yahoo.com/group/hoard-announce/">Hoard-Announce</ulink> list. For general Hoard discussion, join the <ulink url="http://groups.yahoo.com/group/hoard/">Hoard</ulink> mailing list. You can also search the archives of these lists. 
</para> </sect1> <sect1 id="License Info"> <title>License Information</title> <para> The use and distribution of Hoard is governed by the GNU General Public License as published by the <ulink url="http://www.fsf.org">Free Software Foundation</ulink>: see the included file <filename>COPYING</filename> for more details. </para> <para> Because of the restrictions imposed by this license, most commercial users of Hoard have purchased commercial licenses, which are arranged through the University of Texas at Austin. <ulink url="mailto:software@otc.utexas.edu">Contact Richard Friedman</ulink> at the Office of Technology Commercialization at The University of Texas at Austin for more information (phone: (512) 471-4738). </para> </sect1> </article>