Description

Description

A Multipol program starts by executing the function MultipolMain on every processor. MultipolMain then calls Rts_start(firstThread), which transfers control to the Multipol top-level sheduler after some initialization steps. firstThread is the first thread to execute, and it normally starts other threads in the programs. The runtime system keeps scheduling threads until some thread calls Rts_exit to exit the runtime system. The call transfers control back to the remainder of MultipolMain.

Multipol threads appear to be atomic. All threads execute until completion. Long-live threads can be built on top of atomic threads via a sequence of continuation thread. Threads do not migrate - they are used latency hiding, not for load balance.

Multipol threads can be scheduled by schedulers supplied by the programmer (called customized schedulers). A scheduler is a data structure with two operations: deposit and select. When a thread is created, the user specifies the scheduler to use. When a thread is enabled for execution, the runtime system calls its designated deposit operation to store the thread handle. The top level scheduler calls the select operation periodically to choose the next thread to execute. Multipol provides a variety of default schedulers such as FIFO queues. It also provides a preemptive scheduler which allows threads to preempt running threads to reduce overhead. The preemptive scheduler must be used with caution so that atomicity is maintained and no starvation is incurred.

Different threads on the same processor share the same local memory. Threads running on different processors communicate via asynchronous messages. A thread can transfer local data to a preallocated buffer on a remote processor and invoke a handler thread to process the message. It can also invoke a remote thread with an arbitrary number of arguments. The Multipol runtime system attempts to optimize communication performance, using knowledge of the machine architecture and the semantics of the schedulers. For example, a preemptive thread can usually execute as an active message if its arguments fit in a machine packet. For machines with high communication start-up overheads, the runtime system automatically aggregates small messages into large physical messages to improve bandwidth.

For data structures and applications that are not multi-threaded, the runtime system provides a SPMD execution environment between the Rts_enable and Rts_disable calls. The user must call Rts_pollRts periodically to ensure progress of all data structures, including those that are multithreaded.

Rts_Store, Rts_Store1, Rts_Get, Rts_Put are split-phase operations. Split-phase operations return OK if it has completed upon return (i.e., no need to create a thread to wait for the the result), and WAIT otherwise. Non-split-phase primitives whose return type is Flag return OK if they succeed, and FAIL if exceptions occur.

Except for Rts_start, Rts_enable, Rts_disable, Rts_registerInit, Rts_registerCleanUp, and Rts_registerSelect, all primitives must be called only when the top-level scheduler is running.

Rts_numProc, Rts_myProc

Rts_numProc and Rts_myProc are constants that are defined as the number of processors in the system and the identity of the local processor (ranging from 0 to Rts_numProc - 1), respectively.

Rts_outputStats

Setting Rts_outputStats to 1 on a processor enables its statistics output. Setting it to zero disable the statistics output. The default is 0.

Rts_start

Effect

Start the top-level scheduler of the Multipol runtime system. Rts_start first calls the initialization routines registered by the user using Rts_registerInit. It then executes the function f as the first thread. After f completes, the top-level scheduler periodically invokes all customized schedulers registered by the user using Rts_registerSelect. The top-level scheduler terminates when some thread calls Rts_exit, after which it calls the clean-up routines registered by the user using Rts_registerCleanUp. Finally, Rts_start returns control to its caller. Rts_start returns FAIL if there is an error in the initialization step, and OK otherwise.

All threads scheduled by a fair customized scheduler are guaranteed to execute eventually.

Requires

Only one top-level scheduler can be active at any given time. Only one of Rts_start and Rts_enable can occur in the program.

Post-requires

Rts_exit must be called eventually to exit the top-level scheduler.

Known bugs

There may be residual messages left in the network after the top-level scheduler exits. Such messages may cause problems if they access data structures that have been deallocated by the clean-up routines. The current version attempts to avoid this problem by entering a time-out loop to drain all outstanding messages before the top-scheduler calls the clean-up routines. However, the user is advised to call Rts_start at most once in the program.

Rts_exit

Effect

Exit the top-level scheduler of the Multipol runtime system. The runtime system calls the clean-up routines registered by the user using Rts_registerCleanUp before it terminates. When the top-level scheduler terminates, there may still be outstanding messages that have not been received, or threads that have not started execution.

Requires

Rts_exit should be called at most once on one processor only.

Rts_enable

Enable a SPMD program to use the Multipol runtime system primitives. It has the same effect has Rts_start, but it returns immediately after the initialization steps without executing any thread.

Requires

Rts_enable should be called at most once per processor in the program. Only one of Rts_start and Rts_enable can occur in the program.

Rts_pollRts

Poll all customized schedulers times times to execute their threads.

Requires

In a SPMD program, Rts_pollRts must be called periodically to ensure progress.

Known bugs

times is currently ignored, and each poll executes at most one thread.

Rts_disable

Disable the SPMD program from using the Multipol runtime system. It waits until the runtime system terminates (i.e., some thread calls Rts_exit). It then performs the cleanup steps as in the last part of Rts_start.

Requires

Rts_disable cannot be called before Rts_enable. It should be called at most once per processor in the program.

Rts_registerInit

Effect

Register the initialization routine f with the runtime system. f is called by the runtime system on the local processor in the initialization step before the first thread is executed. The initilization routines are called in the order they are registered. f must return OK if initialization succeeds. It must return FAIL if initialization fails, in which case the top-level scheduler returns immediately with an error.

Rts_registerCleanUp

Effect

Register the clean-up routine f with the runtime system. f is called by the runtime system in the clean-up step before the top-level scheduler returns. The clean-up routines are called in the order they are registered. f must return OK if initialization succeeds. It must return FAIL if initialization fails.

Rts_registerSelect

Effect

Register the select operation f of a customized scheduler. A customized scheduler is a data structure with two operations: deposit and select. The deposit operation takes one argument Thread *tid, which is the handle of the thread to be stored. When a thread is enabled for execution via Rts_avtivateThread or Rts_invokeRemoteThread, the runtime system calls its designated deposit operation (specified at thread creation) to store the thread handle in the scheduler's internal data structure such as a FIFO queue. The select operation takes one argument Thread **tidPtr. It removes one of the thread handles in its internal data structure, stores it in *tidPtr, and returns OK. It returns FAIL if there is no thread to schedule. The runtime system invokes f periodically to choose the next thread to execute. Quantum is a hint for the top-scheduler as to how often f should be called. Its value ranges from 1 to 10. There is no restriction on the number of deposit or select operations for the same customized scheduler.

Known bugs

The argument Quantum is not used in the current version. There can be at most 16 select operations.

Rts_depositFIFO, Rts_depositPreempt, Rts_depositUrgent

The deposit operations of three thread schedulers provided by the runtime system. See Rts_registerSelect for the definition of a scheduler.

Rts_depositFIFO is a first-in-first-out scheduler implemented as a FIFO queue.

Rts_depositPreempt is a preemptive scheduler which schedules threads that may preempt the running thread. Preemption occurs when calling Rts_enableThread, Rts_invokeRemoteThread, or other primitives that poll the network. Rts_depositPreempt can be used only if such preemption preserve the atomicity of thread execution. When preemption does not occur, threads deposited by Rts_depositPreempt are scheduled in the first-in-first-out order. Thread deposited by Rts_depositPreempt cannot call Rts_continuteThread, touch the network (using the communication primitives or explicit polling), or directly modify the thread context.

Rts_depositUrgent is a first-in-first-out scheduler for ``urgent threads''. Such threads include the runtime system's internal message handlers. The runtime system always executes threads in the FIFO queue of Rts_depositUrgent before other threads. Incorrect use of Rts_depositUrgent may lead to starvation.

The runtime system performs extensive optimizations to reduce the overheads caused by multi-threading, using knowledge of the machine architectures and the semantics of the schedulers. For example, small preemptive threads invoked by Rts_invokeRemoteThread can execute as a local function call if it is destined to the local processor, and they can usually execute as active messages if the target processor is remote.

Requires

The data structures must substitute the suitable select operation if the user supplies a non-safe scheduler, such as Rts_depositPreempt or Rts_depositUrgent, which may break the implementation of the data structure.

Rts_allocThread

Effect

Create a thread and return its handle in *tidPtr. The thread is deposited by scheduler when enabled. It executes the function f with one argument of type void*, which points to its context. The thread context is copied from the buffer context, which has contextSize bytes. Threads created by Rts_allocateThread are not eligible for execution until they are enabled using Rts_enableThread.

Rts_continueThread

Effect

Create a continuation thread of the running thread and return its handle in *tidPtr. The new thread shares everything with the running thread (including the thread handle and context) except for the function it executes, which is f. Its effect is the same as Rts_allocThread, but is more efficient.

Requires

The running thread that calls Rts_continueThread cannot be a preemptive thread.

Rts_enableThread

Effect

Enable the thread with handle tid for execution. The thread's designated deposit operation is called to store the thread handle. If its designated scheduler is a fair scheduler, the thread is guarantee to execute eventually (unless the top-level scheduler terminates before it does).

Rts_allocCounter

Effect

Create a new counter with the initial value value, and return its handle in *ctr.

Rts_deAllocCounter

Effect

Deallocate the counter with handle ctr.

Rts_initCounter

Effect

Initialize the counter ctr and set its value to value. Threads that wait on the value of ctr before the call will not be enabled for execution, that is, they are lost forever. Therefore, Rts_initCounter should only be used to initialize statically allocated counters for the first time.

Rts_resetCounter

Effect

Reset the value of the counter ctr to value. Enable all threads that waits on a value no greater than value.

Rts_waitCounterGE

Effect

Register the thread tid with the counter ctr. Enable the thread for execution when the counter's value reaches value. Enable the thread immediately if the counter's current value is no smaller than value.

Requires

The thread tid can be registered with at most one counter.

Post-requires

Once registered, the thread tid can be enabled by the counter only.

Rts_waitCounterNext

Effect

Register the thread tid with the counter ctr. Enable the thread for execution when the counter's value is greater than its current value. Once registered, the thread tid can be enabled by the counter only.

Requires

The thread tid can be registered with at most one counter.

Rts_readCounter

Effect

Return the current value of the counter ctr in *valuePtr.

Rts_incrCounter

Effect

Increment the counter ctr by one. Enable all threads that waits on a value no greater than its new value.

Rts_invokeRemoteThread

Effect

Create and enable a thread on the remote processor proc. The new thread is deposited by scheduler, and it executes the function f. The thread's context is copied from the buffer context which is contextSize bytes in size. The call returns immediately without waiting for the thread to be created. The runtime system guarantees that the thread eventually executes on proc (unless the top-level scheduler terminates before it does). Rts_invokeRemoteThread requires communication when proc is remote, and its messages are aggregated automatically by the runtime system.

Known bugs

In the current version, the thread context cannot contain fields that must be double-word aligned.

Rts_invokeRemoteThread_4

Effect

Create and enable a thread on the remote processor proc. The new thread is deposited by Rts_depositPreempt, and it executes f with four integer arguments. On machines that support fast, small messages (such as CM5), Rts_invokeRemoteThread_4 can usually be converted to an active message. Like Rts_invokeRemoteThread, its messages are also aggregated automatically by the runtime system when necessary.

Requires

The new thread must be a preemptive thread with no continuation.

Rts_Store, Rts_Store1

Effect

Transfer size bytes from the local address buf to the location pointed to by the global pointer putAddr, which has the following type:

typedef struct GlobalPtr {   /* the global pointer type */
    int proc;                /* processor id */
    void *localAddr;         /* address on proc */
} GlobalPtr;

When the transfer completes, a thread is created and enabled (deposited using Rts_depositUrgent). The new thread executes f with one pointer argument to the following type:

typedef struct StoreThreadArg { /* argument of Store handler */
    char *buf;                  /* Store address = putAddr.localAddr */
    int size;                   /* size of transfer */
    void *arg;                  /* arg of Store1 */
} StoreThreadArg;

The counter ctr is incremented when the local buffer buf can be safely reused. The completion of Store, however, does not imply the transfer is completed.

Rts_Store1 is the same as Rts_Store1 except that it takes one more argument (arg).

Requires

The addresses buf and putAddr must be word-aligned. The memory block at buf cannot be modified until Store completes. The memory block at putAddr cannot be modified until the transfer completes. The transfer will complete eventually after Store returns.

Rts_Get

Effect

Transfer size bytes from the location pointed to by the global pointer getAddr to the local address buf (see Rts_Store for the definition of a global pointer). Increment the counter ctr when the tranfer completes.

Requires

The addresses buf and getAddr must be word-aligned.

Post-requires

The buffers pointed to by buf and getAddr cannot be modified until Get completes.

Rts_Put

Effect

Transfer size bytes from the local address buf to the location pointed to by the global pointer putAddr (see Rts_Store for the definition of a global pointer). Increment the counter ctr when the tranfer completes. The buffers pointed to by buf and putAddr cannot be modified until Put completes.

Requires

The addresses buf and putAddr must be word-aligned.

Rts_setAggregation

Effect

Set the message aggregation size to size. The runtime system aggregate small messages into large physical messages to improve communication bandwidth. A physical message is sent when its size exceeds size, or when the processor is idle, or when ``much'' computation has elapsed since the last communication.

Rts_getTimeCoarse

Effect

Return the current time in seconds. The time returns is usually coarse in granularity, and the call may have non-negligible cost.

Rts_pollNetwork

Effect

Poll the network interface to receive incoming messages. The runtime system calls Rts_pollNetwork once after every thread execution. The user can insert Rts_pollNetwork in long running threads to avoid network congestion.

Rts_panic

Effect

Print the error message s and perform error handling.

Known bugs

In the current version, all errors are simply printed without further processing.



Chih-Po Wen
Wed Sep 13 23:59:20 PDT 1995