Sophie

Sophie

distrib > Mandriva > 2010.0 > i586 > media > contrib-release > by-pkgid > 5e1854624d3bc613bdd0dd13d1ef9ac7 > files > 2600

gap-system-4.4.12-5mdv2010.0.i586.rpm

<html><head><title>[ParGAP] 6 Advanced Concepts for TOP-C model (MasterSlave)</title></head>
<body text="#000000" bgcolor="#ffffff">
[<a href = "chapters.htm">Up</a>] [<a href ="CHAP005.htm">Previous</a>] [<a href ="CHAP007.htm">Next</a>] [<a href = "theindex.htm">Index</a>]
<h1>6 Advanced Concepts for TOP-C model (MasterSlave)</h1><p>
<P>
<H3>Sections</H3>
<oL>
<li> <A HREF="CHAP006.htm#SECT001">Tracing and Debugging</a>
<li> <A HREF="CHAP006.htm#SECT002">Efficiency Considerations</a>
<li> <A HREF="CHAP006.htm#SECT003">Checkpointing in TOP-C</a>
<li> <A HREF="CHAP006.htm#SECT004">When Should a Slave Process be Considered Dead?</a>
</ol><p>
<p>
This chapter may be safely skipped on a first reading. If you still  want
to read this chapter, it should mean  that  you  are  familiar  with  the
basics of the TOP-C model, and are looking for advice on how to  use  the
model more effectively. The first piece of advice is that the  choice  of
<strong>task</strong> and <strong>shared data</strong> interact strongly with the  choice  of  parallel
algorithm. We review those concepts more precisely here, in light of  the
overall context of the TOP-C model.
<p>
<p>
<dl compact>
<p>
<a name = "I0"></a>

<dt><strong>task</strong>:<dd>
    A task is a function that that takes a single argument,  <var>taskInput</var>,
    reads certain globally shared data, the <strong>shared data</strong>, and computes a
    result, the <var>taskOutput</var>. Hence, given the same task  input  and  the
    same shared data, a task should always compute the same task  output.
    The TOP-C model implements  this  concept  through  the  <code></code><var>DoTask</var><code>()</code>
    application routine. In  the  TOP-C  model,  this  rule  is  bent  to
    accommodate  caching  of  private  data  to  efficiently   handle   a
    <code>REDO_ACTION</code>    (see    Section&nbsp;<a href="CHAP005.htm#SECT006">Caching    slave    task    outputs     (ParSemiEchelonMat    revisited)</a>),    or    to     accommodate     a
    <code>CONTINUATION_ACTION()</code> (see Section&nbsp;<a href="CHAP003.htm#SECT006">The GOTO statement of the TOP-C     model</a>).
<p>
<a name = "I1"></a>

<dt><strong>shared data</strong>:<dd>
    The <strong>shared data</strong> is globally shared data. It should  be  initialized
    before entering <code>MasterSlave()</code>. The shared data is never  explicitly
    declared. However, it is  best  for  the  application  programmer  to
    include  a  comment  specifying  the  shared  data  for  his  or  her
    application. The TOP-C  model  poses  certain  restrictions  on  what
    legally constitutes shared data. The shared data must include  enough
    of the global data (variables that occur <strong>free</strong> in  the  <code></code><var>DoTask</var><code>()</code>
    procedure) so that  the  task  output  of  <code></code><var>DoTask</var><code>()</code>  is  uniquely
    determined by the task input and the shared data. However, the shared
    data must <strong>not</strong> include  any  variables  whose  values  are  modified
    outside of the application routine <code></code><var>UpdateSharedData</var><code>()</code>. Also,  the
    shared data is updated <strong>non-preemptively</strong>, in the sense that a  slave
    process will always complete its current task before reading a  newly
    arrived message  that  invokes  <code></code><var>UpdateSharedData</var><code>()</code>.  If  a  slave
    privately  caches  data  for   purposes   of   a   <code>REDO_ACTION</code>   or
    <code>CONTINUATION_ACTION()</code>, such data is  explicitly  not  part  of  the
    shared data.
<p>
</dl>
<p>
<p>
<a name="SECT001"><h2>6.1 Tracing and Debugging</h2></a>
<p><p>
In testing a program using <code>MasterSlave()</code>, a  hierarchy  of  testing  is
suggested. The principle is to test the simplest example first, and  then
iterate to more complex examples. When a stable portion of the program is
ready for testing, the following sequence of tests is suggested:
<p>
<p>
<dl compact>
<p>
<dt>sequential<dd> 
    Replace <code>MasterSlave()</code> by <code>SeqMasterSlave()</code> (see definition  below)
    and see if the program performs  correctly.  <code>SeqMasterSlave()</code>  will
    run only on the master, without sending any messages, and so the full
    range of sequential debugging tools is available.
<p>
<dt>one slave<dd>
    Restore <code>MasterSlave()</code> and set up the <code>procgroup</code> file to have  only
    one slave process (one line,  <code>local  0</code>,  and  one  line  <code>localhost
    ...</code>).  Initially   test   with   no   <code>taskAgglom</code>   parameter   for
    <code>MasterSlave()</code>, and then test with the full set of parameters.
<p>
<dt>two slaves<dd>
    Same advice as for one slave, but two lines: <code>localhost ...</code>
<p>
<dt>many slaves<dd>
    Full scale test, both without and with <code>taskAgglom</code>.
<p>
</dl>
<p>
<a name = "SSEC001.1"></a>
<li><code>ParTrace V</code>
<p>
A second easy testing strategy is to set <code>ParTrace</code> to <code>true</code>.  (This  is
the default value.) This  causes  all  <var>taskInput</var>s,  <var>taskOutput</var>s,  and
non-trivial actions (actions other than <code>NO_ACTION</code>) to be  displayed  at
the terminal. The information is printed in the same sequence as seen  by
the master process.
<p>
Another ``cheap'' debugging trick is to  inspect  the  values  of  global
variables  on  the  slave  after  it  has  been   thrown   out   of   the
<code>MasterSlave()</code>   procedure.   The   following   code   demonstrates   by
interrogating the sum of the variables <code>x</code> and <code>y</code> on slave number&nbsp;2.
<p>
<pre>
gap&gt; SendRecvMsg("x+y;\n", 2);
</pre>
<p>
This is useful to inspect cached data on a slave used for a <code>REDO_ACTION</code>
or <code>CONTINUATION_ACTION()</code>. It may  also  be  useful  to  verify  if  the
shared data on the slave is the same as  on  the  master.  If  the  slave
process is still inside the procedure <code>MasterSlave()</code>, then from within a
break loop on the  master,  you  may  also  want  to  interactively  call
<code></code><var>DoTask</var><code>( </code><var>testInput</var><code> )</code> to determine if the  expected  <var>taskOutput</var>  is
produced.
<p>
If the master process is still within <code>MasterSlave()</code>, then it is  useful
to execute <code></code><var>DoTask</var><code>()</code> locally on the master  process,  and  debug  this
sequentially.
<p>
There is also the time-honored practice of  inserting  print  statements.
Print statements ``work'' both on  the  master  and  on  the  slaves.  If
ParTrace produces too much output, or not the right kind of  information,
one can add print statements exactly where one needs them.  As  with  any
UNIX  debugging,  it  is  sometimes  useful  to   include   a   call   to
<code>fflush(stdout)</code> to force any pending output. <font face="Gill Sans,Helvetica,Arial">ParGAP</font> binds this to:
<p>
<a name = "SSEC001.2"></a>
<li><code>UNIX_FflushStdout() F</code>
<p>
This has the same effect as  the  UNIX  <code>fflush(stdout)</code>.  There  may  be
pending output in a buffer, that UNIX  delays  printing  for  efficiency.
Printing any remaining output in the buffer is forced by this command.  A
common sequence is:  <code>Print("information");  UNIX_FflushStdout();</code>.  Note
also that when the slave  prints,  there  are  ``two''  standard  outputs
involved. You may also want to include a call to <code>UNIX_FflushStdout()</code> on
the master to force any  pending  output  that  originated  on  a  slave.
Finally, you should be conscious  of  network  delays,  and  so  a  print
statement in a slave process will typically take longer to appear than  a
print statement in the master process.
<p>
<a name = "SSEC001.3"></a>
<li><code>SeqMasterSlave( </code><var>SubmitTaskInput</var><code>, </code><var>DoTask</var><code>[, </code><var>CheckTaskResult</var><code>[,</code><var>UpdateSharedData</var><code>[, </code><var>taskAgglom</var><code> ]]] ) F</code>
<p>
If a bug is exhibited even in the context of a  single  slave,  then  the
code is ``almost'' sequential. In this case,  one  can  test  further  by
replacing the call to <code>MasterSlave()</code> by a  call  to  <code>SeqMasterSlave()</code>,
and debug in a context that involves zero  messages  and  no  interaction
with any slave. It can also be helpful to carry out initial debugging  in
this context. Note that in the case of a  single  slave,  which  is  what
<code>SeqMasterSlave</code> emulates, <code>IsUpToDate()</code> will always return true, and so
most applications will not call for a <code>REDO_ACTION</code>.
<p>
<p>
<a name="SECT002"><h2>6.2 Efficiency Considerations</h2></a>
<p><p>
<a name = "I2"></a>

There  are  two  common  reasons  for  loss  of  efficiency  in  parallel
applications. One is a lack of enough tasks,  so  that  some  slaves  are
starverd for work while waiting for the next task input. A  second  reson
is that the ratio of communication time to compilation time is too large.
The second case, poor communication efficiency, is the more common one.
<p>
The communication efficiency can be formally defined as the ratio of  the
time to execute a task by the time  taken  for  the  master  to  send  an
initial task message to a slave plus the time for the slave to send  back
a result message. A good way to diagnose your efficiency  is  to  execute
<code>MasterSlaveStats()</code> after executing <code>MasterSlave()</code>.
<p>
<a name = "SSEC002.1"></a>
<li><code>MasterSlaveStats() F</code>
<p>
This function currently returns statistics in the form of a list  of  two
records. The first record provides the global information:
<p>
<p>
<dl compact>
<p>
<dt><code>MStime</code><dd>
    total runtime (as measured by <code>Runtime()</code>),
<p>
<dt><code>MSnrTasks</code><dd>
    total number of tasks (not including <code>REDO</code> or <code>UPDATE</code>,
<p>
<dt><code>MSnrUpdates</code><dd>
    total number of times action <code>UPDATE</code> was returned, and
<p>
<dt><code>MSnrRedos</code><dd>
    total number of times action <code>REDO</code> was returned.
<p>
</dl>
<p>
The second record provides per-slave information:
<p>
<p>
<dl compact>
<p>
<dt><code>total</code><dd>
    total time spent on tasks, not including <code>UpdateSharedData()</code>,
<p>
<dt><code>num</code><dd>
    number of initial tasks, <code>REDO</code> and <code>CONTINUATION()</code> actions,
<p>
<dt><code>ave_ms</code><dd> 
    the value of <code>QuoInt(1000*total,num)</code> in <font face="Gill Sans,Helvetica,Arial">GAP</font>, and
<p>
<dt><code>max</code><dd>
    maximum time spent on a task, in seconds.
<p>
</dl>
<p>
Note that  for  purposes  of  the  per-slave  statistics,  separate  time
intervals  are  recorded  for  each  initial  task,  <code>REDO</code>  action,  and
<code>CONTINUATION()</code>  action.  The  time  for  <code>UpdateSharedData()</code>  is  not
included in these statistics. This is because after an  <code>UPDATE</code>  action,
the slave does not reply to the master to acknowledge when the update was
completed.
<p>
<strong>Notes:</strong>
<p>
Poor communication efficiency is typically caused either by too  small  a
task execution time (which would be the case in the example of section or
too large a message (in which case the communication time is  too  long).
We first consider execution times that are too small.
<p>
On many Ethernet installations, the  communication  time  is  about  0.01
seconds to send and receive small messages (less than 1  Kb).  Hence  the
task should be adjusted to consume at least this much CPU  time.  If  the
naturally defined task requires less than  0.01  seconds,  the  user  can
often group together several consecutive tasks, and send them as a single
larger task. For example, in the factorization problem  of  section,  one
might modify <code></code><var>DoTask</var><code>()</code> to test the next 1000 numbers  as  factors  and
modify <code></code><var>SubmitTaskInput</var><code>()</code> to increment <code>counter</code> by&nbsp;1000.
<p>
There is another easy trick that often improves communication efficiency.
This is to set up more than one slave process  on  each  processor.  This
improves the communication efficiency because during much of the  typical
0.01&nbsp;seconds of communication time the CPU has off-loaded the job onto  a
coprocessor. Hence, having a second slave process running its own task on
the CPU while a first process is concerned with communication allows  one
to <strong>overlap communication with computation</strong>.
<p>
We next consider the case of messages that are too large. In  this  case,
it  is  important  to  structure  the  problem  appropriately.  The  task
architecture is intended to be especially adaptable  to  this  case.  The
philosophy is to minimize communication time by duplicating much  of  the
execution time on each processor.
<p>
After the initial data structure has  been  built,  it  will  usually  be
modified as a result of the  computation.  In  order  to  again  minimize
communication, the result  of  a  task,  which  is  typically  passed  to
<code></code><var>UpdateSharedData</var><code>()</code>, should consist of the minimum information needed
to update the global data structure. Each process can then  perform  this
update in parallel.
<p>
<p>
<a name="SECT003"><h2>6.3 Checkpointing in TOP-C</h2></a>
<p><p>
Any long-running computation <strong>must</strong> be concerned with checkpointing. The
TOP-C model also provides a  simple  model  for  checkpointing.  The  key
observation is that the master process always has the latest state of the
computation, and the information in the master process is  sufficient  to
reconstruct any ongoing computation. Any application may  take  advantage
of  this  by  checkpointing  the  necessary  information  either  in  the
application routine, <code></code><var>SubmitTaskInput</var><code>()</code> or in <code></code><var>CheckTaskResult</var><code>()</code>.
<p>
A simple way to checkpoint is to record:
<p>
<ul>
<p>
<li>
    the current data in the TOP-C shared data;
<p>
<li>
    any private global data residing only in the master process; and
<p>
<li>
    the inputs to any tasks that are still pending.
<p>
</ul>
<p>
This  model  for  checkpointing  assumes  that  your   program   has   no
<code>CONTINUATION()</code> actions. If you use <code>CONTINUATION()</code> actions,  then  you
may require a more complex model for checkpointing.
<p>
<a name = "SSEC003.1"></a>
<li><code>MasterSlavePendingTaskInputs() F</code>
<p>
This function returns a <font face="Gill Sans,Helvetica,Arial">GAP</font> list (with  holes)  of  all  pending  task
inputs. If slave <var>i</var> is currently working on a task,  index  <var>i</var>  of  the
list will record that task. If slave <var>i</var> is currently idle  or  executing
<code></code><var>UpdateSharedData</var><code>()</code>, then there will be a hole  at  index  <var>i</var>.  This
function is available for  use  within  either  the  application  routine
<code></code><var>SubmitTaskInput</var><code>()</code>,  or  <code></code><var>CheckTaskResult</var><code>()</code>,   as   specified   in   the
parameters of your call to <code>MasterSlave()</code>. (Of course, your  application
may be using a name other than <code></code><var>SubmitTaskInput</var><code>()</code> or  <code></code><var>CheckTaskResult</var><code>()</code>
in the parameters of <code>MasterSlave()</code>.)
<p>
<p>
<a name="SECT004"><h2>6.4 When Should a Slave Process be Considered Dead?</h2></a>
<p><p>
An important question for long-running computations, is  when  to  decide
that a slave  process  is  dead.  For  our  purposes,  <strong>dead</strong>  is  not  a
well-defined concept. If a user on the remote machine decides to re-boot,
it is clear that any slave processes residing on that machine  should  be
declared dead. However, suppose there  is  temporary  congestion  on  the
network making the slave unavailable. Suppose that another  user  on  the
remote machine has started up many processes  consuming  many  resources,
and the TOP-C slave process is being starved for CPU  time  or  for  RAM.
Perhaps the most difficult case of all to decide  is  if  one  particular
TOP-C task requires ten times as much time as all other tasks. This  last
example is conceivable if, for example, each task consists of factoring a
different large integer.
<p>
Hence, our implementation of the TOP-C model will  employ  the  following
<strong>heuristic</strong> in a future version, to decide if a task  is  dead.  You  may
wish to employ this heuristic now, if you have a  demanding  application.
We use the <font face="Gill Sans,Helvetica,Arial">ParGAP</font> function, <code>UNIX_Realtime()</code>, to keep  track  of  how
much time has been spent on a task (based on ``wall clock time'', and not
on CPU time). If a task has taken  <code>slaveTaskTimeFactor</code>  times  as  much
time as the longest task so far, then it becomes a  candidate  for  being
declared dead. The <font face="Gill Sans,Helvetica,Arial">GAP</font> variable <code>slaveTaskTimeFactor</code> is initially set
to the default value of 2.
<p>
Once a slave  process  becomes  a  candidate  for  being  declared  dead,
<code>MasterSlave()</code> will create a second version of the same task,  with  the
same task input as the original task. <code>MasterSlave()</code>  will  then  record
which task finishes first. If the original version finishes  first,  then
the second version  of  the  task  is  ignored,  and  the  slave  process
executing the original task is  no  longer  considered  a  candidate  for
death.
<p>
If, however, the second version of the task finishes before the  original
version, then the time for the second  task  is  recorded.  Further,  the
output from the second task will be used, and any output  resulting  from
the original task will  be  ignored.  <code>MasterSlave()</code>  then  periodically
checks until the ration of the time spent so far on the original  version
of the task is at least <code>slaveTaskTimeFactor</code> times greater than the time
spent on the second version of the task, then the process  executing  the
original version of the task is then declared dead. No  further  messages
from the process executing  original  task  will  be  recognized  and  no
further messages will be sent to that slave process.
<p>
A future version of this distribution will  include  direct  support  for
this heuristic. A customized version of it may be  used  now,  by  taking
advantage of the <font face="Gill Sans,Helvetica,Arial">ParGAP</font> routine,  <code>UNIX_Realtime()</code>.  In  addition,  a
future version of this distribution may include the ability to start  new
slave processes in  an  ongoing  computation.  The  reference&nbsp;<a href="biblio.htm#CG98"><cite>CG98</cite></a>
describes how this was done in a C implementation, and why  this  concept
fits naturally with the TOP-C model.
<p>
<p>
[<a href = "chapters.htm">Up</a>] [<a href ="CHAP005.htm">Previous</a>] [<a href ="CHAP007.htm">Next</a>] [<a href = "theindex.htm">Index</a>]
<P>
<address>ParGAP manual<br>November 2001
</address></body></html>