<html><head><title>[ParGAP] 6 Advanced Concepts for TOP-C model (MasterSlave)</title></head> <body text="#000000" bgcolor="#ffffff"> [<a href = "chapters.htm">Up</a>] [<a href ="CHAP005.htm">Previous</a>] [<a href ="CHAP007.htm">Next</a>] [<a href = "theindex.htm">Index</a>] <h1>6 Advanced Concepts for TOP-C model (MasterSlave)</h1><p> <P> <H3>Sections</H3> <oL> <li> <A HREF="CHAP006.htm#SECT001">Tracing and Debugging</a> <li> <A HREF="CHAP006.htm#SECT002">Efficiency Considerations</a> <li> <A HREF="CHAP006.htm#SECT003">Checkpointing in TOP-C</a> <li> <A HREF="CHAP006.htm#SECT004">When Should a Slave Process be Considered Dead?</a> </ol><p> <p> This chapter may be safely skipped on a first reading. If you still want to read this chapter, it should mean that you are familiar with the basics of the TOP-C model, and are looking for advice on how to use the model more effectively. The first piece of advice is that the choice of <strong>task</strong> and <strong>shared data</strong> interact strongly with the choice of parallel algorithm. We review those concepts more precisely here, in light of the overall context of the TOP-C model. <p> <p> <dl compact> <p> <a name = "I0"></a> <dt><strong>task</strong>:<dd> A task is a function that that takes a single argument, <var>taskInput</var>, reads certain globally shared data, the <strong>shared data</strong>, and computes a result, the <var>taskOutput</var>. Hence, given the same task input and the same shared data, a task should always compute the same task output. The TOP-C model implements this concept through the <code></code><var>DoTask</var><code>()</code> application routine. In the TOP-C model, this rule is bent to accommodate caching of private data to efficiently handle a <code>REDO_ACTION</code> (see Section <a href="CHAP005.htm#SECT006">Caching slave task outputs (ParSemiEchelonMat revisited)</a>), or to accommodate a <code>CONTINUATION_ACTION()</code> (see Section <a href="CHAP003.htm#SECT006">The GOTO statement of the TOP-C model</a>). <p> <a name = "I1"></a> <dt><strong>shared data</strong>:<dd> The <strong>shared data</strong> is globally shared data. It should be initialized before entering <code>MasterSlave()</code>. The shared data is never explicitly declared. However, it is best for the application programmer to include a comment specifying the shared data for his or her application. The TOP-C model poses certain restrictions on what legally constitutes shared data. The shared data must include enough of the global data (variables that occur <strong>free</strong> in the <code></code><var>DoTask</var><code>()</code> procedure) so that the task output of <code></code><var>DoTask</var><code>()</code> is uniquely determined by the task input and the shared data. However, the shared data must <strong>not</strong> include any variables whose values are modified outside of the application routine <code></code><var>UpdateSharedData</var><code>()</code>. Also, the shared data is updated <strong>non-preemptively</strong>, in the sense that a slave process will always complete its current task before reading a newly arrived message that invokes <code></code><var>UpdateSharedData</var><code>()</code>. If a slave privately caches data for purposes of a <code>REDO_ACTION</code> or <code>CONTINUATION_ACTION()</code>, such data is explicitly not part of the shared data. <p> </dl> <p> <p> <a name="SECT001"><h2>6.1 Tracing and Debugging</h2></a> <p><p> In testing a program using <code>MasterSlave()</code>, a hierarchy of testing is suggested. The principle is to test the simplest example first, and then iterate to more complex examples. When a stable portion of the program is ready for testing, the following sequence of tests is suggested: <p> <p> <dl compact> <p> <dt>sequential<dd> Replace <code>MasterSlave()</code> by <code>SeqMasterSlave()</code> (see definition below) and see if the program performs correctly. <code>SeqMasterSlave()</code> will run only on the master, without sending any messages, and so the full range of sequential debugging tools is available. <p> <dt>one slave<dd> Restore <code>MasterSlave()</code> and set up the <code>procgroup</code> file to have only one slave process (one line, <code>local 0</code>, and one line <code>localhost ...</code>). Initially test with no <code>taskAgglom</code> parameter for <code>MasterSlave()</code>, and then test with the full set of parameters. <p> <dt>two slaves<dd> Same advice as for one slave, but two lines: <code>localhost ...</code> <p> <dt>many slaves<dd> Full scale test, both without and with <code>taskAgglom</code>. <p> </dl> <p> <a name = "SSEC001.1"></a> <li><code>ParTrace V</code> <p> A second easy testing strategy is to set <code>ParTrace</code> to <code>true</code>. (This is the default value.) This causes all <var>taskInput</var>s, <var>taskOutput</var>s, and non-trivial actions (actions other than <code>NO_ACTION</code>) to be displayed at the terminal. The information is printed in the same sequence as seen by the master process. <p> Another ``cheap'' debugging trick is to inspect the values of global variables on the slave after it has been thrown out of the <code>MasterSlave()</code> procedure. The following code demonstrates by interrogating the sum of the variables <code>x</code> and <code>y</code> on slave number 2. <p> <pre> gap> SendRecvMsg("x+y;\n", 2); </pre> <p> This is useful to inspect cached data on a slave used for a <code>REDO_ACTION</code> or <code>CONTINUATION_ACTION()</code>. It may also be useful to verify if the shared data on the slave is the same as on the master. If the slave process is still inside the procedure <code>MasterSlave()</code>, then from within a break loop on the master, you may also want to interactively call <code></code><var>DoTask</var><code>( </code><var>testInput</var><code> )</code> to determine if the expected <var>taskOutput</var> is produced. <p> If the master process is still within <code>MasterSlave()</code>, then it is useful to execute <code></code><var>DoTask</var><code>()</code> locally on the master process, and debug this sequentially. <p> There is also the time-honored practice of inserting print statements. Print statements ``work'' both on the master and on the slaves. If ParTrace produces too much output, or not the right kind of information, one can add print statements exactly where one needs them. As with any UNIX debugging, it is sometimes useful to include a call to <code>fflush(stdout)</code> to force any pending output. <font face="Gill Sans,Helvetica,Arial">ParGAP</font> binds this to: <p> <a name = "SSEC001.2"></a> <li><code>UNIX_FflushStdout() F</code> <p> This has the same effect as the UNIX <code>fflush(stdout)</code>. There may be pending output in a buffer, that UNIX delays printing for efficiency. Printing any remaining output in the buffer is forced by this command. A common sequence is: <code>Print("information"); UNIX_FflushStdout();</code>. Note also that when the slave prints, there are ``two'' standard outputs involved. You may also want to include a call to <code>UNIX_FflushStdout()</code> on the master to force any pending output that originated on a slave. Finally, you should be conscious of network delays, and so a print statement in a slave process will typically take longer to appear than a print statement in the master process. <p> <a name = "SSEC001.3"></a> <li><code>SeqMasterSlave( </code><var>SubmitTaskInput</var><code>, </code><var>DoTask</var><code>[, </code><var>CheckTaskResult</var><code>[,</code><var>UpdateSharedData</var><code>[, </code><var>taskAgglom</var><code> ]]] ) F</code> <p> If a bug is exhibited even in the context of a single slave, then the code is ``almost'' sequential. In this case, one can test further by replacing the call to <code>MasterSlave()</code> by a call to <code>SeqMasterSlave()</code>, and debug in a context that involves zero messages and no interaction with any slave. It can also be helpful to carry out initial debugging in this context. Note that in the case of a single slave, which is what <code>SeqMasterSlave</code> emulates, <code>IsUpToDate()</code> will always return true, and so most applications will not call for a <code>REDO_ACTION</code>. <p> <p> <a name="SECT002"><h2>6.2 Efficiency Considerations</h2></a> <p><p> <a name = "I2"></a> There are two common reasons for loss of efficiency in parallel applications. One is a lack of enough tasks, so that some slaves are starverd for work while waiting for the next task input. A second reson is that the ratio of communication time to compilation time is too large. The second case, poor communication efficiency, is the more common one. <p> The communication efficiency can be formally defined as the ratio of the time to execute a task by the time taken for the master to send an initial task message to a slave plus the time for the slave to send back a result message. A good way to diagnose your efficiency is to execute <code>MasterSlaveStats()</code> after executing <code>MasterSlave()</code>. <p> <a name = "SSEC002.1"></a> <li><code>MasterSlaveStats() F</code> <p> This function currently returns statistics in the form of a list of two records. The first record provides the global information: <p> <p> <dl compact> <p> <dt><code>MStime</code><dd> total runtime (as measured by <code>Runtime()</code>), <p> <dt><code>MSnrTasks</code><dd> total number of tasks (not including <code>REDO</code> or <code>UPDATE</code>, <p> <dt><code>MSnrUpdates</code><dd> total number of times action <code>UPDATE</code> was returned, and <p> <dt><code>MSnrRedos</code><dd> total number of times action <code>REDO</code> was returned. <p> </dl> <p> The second record provides per-slave information: <p> <p> <dl compact> <p> <dt><code>total</code><dd> total time spent on tasks, not including <code>UpdateSharedData()</code>, <p> <dt><code>num</code><dd> number of initial tasks, <code>REDO</code> and <code>CONTINUATION()</code> actions, <p> <dt><code>ave_ms</code><dd> the value of <code>QuoInt(1000*total,num)</code> in <font face="Gill Sans,Helvetica,Arial">GAP</font>, and <p> <dt><code>max</code><dd> maximum time spent on a task, in seconds. <p> </dl> <p> Note that for purposes of the per-slave statistics, separate time intervals are recorded for each initial task, <code>REDO</code> action, and <code>CONTINUATION()</code> action. The time for <code>UpdateSharedData()</code> is not included in these statistics. This is because after an <code>UPDATE</code> action, the slave does not reply to the master to acknowledge when the update was completed. <p> <strong>Notes:</strong> <p> Poor communication efficiency is typically caused either by too small a task execution time (which would be the case in the example of section or too large a message (in which case the communication time is too long). We first consider execution times that are too small. <p> On many Ethernet installations, the communication time is about 0.01 seconds to send and receive small messages (less than 1 Kb). Hence the task should be adjusted to consume at least this much CPU time. If the naturally defined task requires less than 0.01 seconds, the user can often group together several consecutive tasks, and send them as a single larger task. For example, in the factorization problem of section, one might modify <code></code><var>DoTask</var><code>()</code> to test the next 1000 numbers as factors and modify <code></code><var>SubmitTaskInput</var><code>()</code> to increment <code>counter</code> by 1000. <p> There is another easy trick that often improves communication efficiency. This is to set up more than one slave process on each processor. This improves the communication efficiency because during much of the typical 0.01 seconds of communication time the CPU has off-loaded the job onto a coprocessor. Hence, having a second slave process running its own task on the CPU while a first process is concerned with communication allows one to <strong>overlap communication with computation</strong>. <p> We next consider the case of messages that are too large. In this case, it is important to structure the problem appropriately. The task architecture is intended to be especially adaptable to this case. The philosophy is to minimize communication time by duplicating much of the execution time on each processor. <p> After the initial data structure has been built, it will usually be modified as a result of the computation. In order to again minimize communication, the result of a task, which is typically passed to <code></code><var>UpdateSharedData</var><code>()</code>, should consist of the minimum information needed to update the global data structure. Each process can then perform this update in parallel. <p> <p> <a name="SECT003"><h2>6.3 Checkpointing in TOP-C</h2></a> <p><p> Any long-running computation <strong>must</strong> be concerned with checkpointing. The TOP-C model also provides a simple model for checkpointing. The key observation is that the master process always has the latest state of the computation, and the information in the master process is sufficient to reconstruct any ongoing computation. Any application may take advantage of this by checkpointing the necessary information either in the application routine, <code></code><var>SubmitTaskInput</var><code>()</code> or in <code></code><var>CheckTaskResult</var><code>()</code>. <p> A simple way to checkpoint is to record: <p> <ul> <p> <li> the current data in the TOP-C shared data; <p> <li> any private global data residing only in the master process; and <p> <li> the inputs to any tasks that are still pending. <p> </ul> <p> This model for checkpointing assumes that your program has no <code>CONTINUATION()</code> actions. If you use <code>CONTINUATION()</code> actions, then you may require a more complex model for checkpointing. <p> <a name = "SSEC003.1"></a> <li><code>MasterSlavePendingTaskInputs() F</code> <p> This function returns a <font face="Gill Sans,Helvetica,Arial">GAP</font> list (with holes) of all pending task inputs. If slave <var>i</var> is currently working on a task, index <var>i</var> of the list will record that task. If slave <var>i</var> is currently idle or executing <code></code><var>UpdateSharedData</var><code>()</code>, then there will be a hole at index <var>i</var>. This function is available for use within either the application routine <code></code><var>SubmitTaskInput</var><code>()</code>, or <code></code><var>CheckTaskResult</var><code>()</code>, as specified in the parameters of your call to <code>MasterSlave()</code>. (Of course, your application may be using a name other than <code></code><var>SubmitTaskInput</var><code>()</code> or <code></code><var>CheckTaskResult</var><code>()</code> in the parameters of <code>MasterSlave()</code>.) <p> <p> <a name="SECT004"><h2>6.4 When Should a Slave Process be Considered Dead?</h2></a> <p><p> An important question for long-running computations, is when to decide that a slave process is dead. For our purposes, <strong>dead</strong> is not a well-defined concept. If a user on the remote machine decides to re-boot, it is clear that any slave processes residing on that machine should be declared dead. However, suppose there is temporary congestion on the network making the slave unavailable. Suppose that another user on the remote machine has started up many processes consuming many resources, and the TOP-C slave process is being starved for CPU time or for RAM. Perhaps the most difficult case of all to decide is if one particular TOP-C task requires ten times as much time as all other tasks. This last example is conceivable if, for example, each task consists of factoring a different large integer. <p> Hence, our implementation of the TOP-C model will employ the following <strong>heuristic</strong> in a future version, to decide if a task is dead. You may wish to employ this heuristic now, if you have a demanding application. We use the <font face="Gill Sans,Helvetica,Arial">ParGAP</font> function, <code>UNIX_Realtime()</code>, to keep track of how much time has been spent on a task (based on ``wall clock time'', and not on CPU time). If a task has taken <code>slaveTaskTimeFactor</code> times as much time as the longest task so far, then it becomes a candidate for being declared dead. The <font face="Gill Sans,Helvetica,Arial">GAP</font> variable <code>slaveTaskTimeFactor</code> is initially set to the default value of 2. <p> Once a slave process becomes a candidate for being declared dead, <code>MasterSlave()</code> will create a second version of the same task, with the same task input as the original task. <code>MasterSlave()</code> will then record which task finishes first. If the original version finishes first, then the second version of the task is ignored, and the slave process executing the original task is no longer considered a candidate for death. <p> If, however, the second version of the task finishes before the original version, then the time for the second task is recorded. Further, the output from the second task will be used, and any output resulting from the original task will be ignored. <code>MasterSlave()</code> then periodically checks until the ration of the time spent so far on the original version of the task is at least <code>slaveTaskTimeFactor</code> times greater than the time spent on the second version of the task, then the process executing the original version of the task is then declared dead. No further messages from the process executing original task will be recognized and no further messages will be sent to that slave process. <p> A future version of this distribution will include direct support for this heuristic. A customized version of it may be used now, by taking advantage of the <font face="Gill Sans,Helvetica,Arial">ParGAP</font> routine, <code>UNIX_Realtime()</code>. In addition, a future version of this distribution may include the ability to start new slave processes in an ongoing computation. The reference <a href="biblio.htm#CG98"><cite>CG98</cite></a> describes how this was done in a C implementation, and why this concept fits naturally with the TOP-C model. <p> <p> [<a href = "chapters.htm">Up</a>] [<a href ="CHAP005.htm">Previous</a>] [<a href ="CHAP007.htm">Next</a>] [<a href = "theindex.htm">Index</a>] <P> <address>ParGAP manual<br>November 2001 </address></body></html>