A Snapshot object is created, and its id is passed to the data structures whose combined state is of interest. The data structures then register with the Snapshot object the status of all operations that are involved in the snapshot. More than one Snapshot object can be used for the same operation to detect different properties.
The lifetime of a mutator operation consists of three stages: suspended, meaning the operation has not started execution and thus has not caused any side effects; executing, meaning the operation has started execution and side effects may have been committed; and completed, meaning the operation has finished execution. Operations that lend themselves to snapshots must guarantee that they complete eventually once they enter the executing stage. They must also register their status with the Snapshot objects using Snapshot_suspend, Snapshot_execute, and Snapshot_complete.
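The three stages can be pictured as a small state machine. The sketch below is only an illustration: the names OpStage, op_stage_transition_ok and op_advance are hypothetical and are not part of the Snapshot interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: the library's own types are not shown in this section. */
typedef enum { OP_SUSPENDED, OP_EXECUTING, OP_COMPLETED } OpStage;

/* Returns true if moving from `from` to `to` respects the three-stage lifetime. */
bool op_stage_transition_ok(OpStage from, OpStage to) {
    switch (from) {
    case OP_SUSPENDED: return to == OP_EXECUTING;  /* operation starts running */
    case OP_EXECUTING: return to == OP_COMPLETED;  /* must eventually finish */
    case OP_COMPLETED: return false;               /* terminal stage */
    }
    return false;
}

/* Advances a stage, trapping illegal transitions in debug builds. */
OpStage op_advance(OpStage from, OpStage to) {
    assert(op_stage_transition_ok(from, to));
    return to;
}
```

In this model an operation that is awakened and suspended again simply stays in OP_SUSPENDED; only a change of stage is treated as a transition.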
To take a snapshot, the programmer first freezes the data structures using the Snapshot object. Upon completion of the freeze operation, all mutators that began execution before it have completed, and all operations concurrent with it are either suspended or completed (i.e., they cannot be in the executing stage). Once frozen, the data structures cannot be mutated by the chosen operations until they are explicitly unfrozen, and the programmer can take the snapshot by reading the distributed states on all processors in any order. When the snapshot is done, the programmer unfreezes the data structures and allows suspended operations to proceed as usual.
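The freeze, read, unfreeze sequence can be illustrated with a single-process model. Everything in the sketch below (the Sk* types, the sk_* functions, and the two-shard "transfer" operation) is a hypothetical stand-in written for this illustration; it does not call the library, and the real operations are split-phase and run on all processors.

```c
#include <assert.h>
#include <stdbool.h>

/* Return codes named after the OK/WAIT convention; the encoding is an assumption. */
typedef enum { SK_OK, SK_WAIT } SkStatus;

/* Single-process stand-in for a Snapshot object (hypothetical layout). */
typedef struct {
    bool freeze_requested; /* a freeze has been issued and not yet undone */
    int  started;          /* operations that entered the executing stage */
    int  completed;        /* operations that finished */
    int  suspended;        /* operations parked until unfreeze */
} SkSnapshot;

/* Two shards of a "distributed" value; a transfer keeps their sum constant. */
typedef struct { int shard[2]; } SkAccounts;

SkStatus sk_query(const SkSnapshot *s) {
    return s->freeze_requested ? SK_WAIT : SK_OK;
}

/* The freeze is complete only once no operation is mid-execution. */
SkStatus sk_freeze(SkSnapshot *s) {
    s->freeze_requested = true;
    return s->started == s->completed ? SK_OK : SK_WAIT;
}

/* Returns how many suspended operations were re-enabled. */
int sk_unfreeze(SkSnapshot *s) {
    int woken = s->suspended;
    s->freeze_requested = false;
    s->suspended = 0;
    return woken;
}

/* First half of a transfer: debit shard 0 and enter the executing stage. */
SkStatus sk_transfer_begin(SkSnapshot *s, SkAccounts *a, int amount) {
    if (sk_query(s) == SK_WAIT) { s->suspended++; return SK_WAIT; }
    s->started++;
    a->shard[0] -= amount;
    return SK_OK;
}

/* Second half: credit shard 1 and mark the operation completed. */
void sk_transfer_end(SkSnapshot *s, SkAccounts *a, int amount) {
    assert(s->started > s->completed); /* must pair with a begin */
    a->shard[1] += amount;
    s->completed++;
}

/* In-flight transfer, freeze, read, unfreeze. Returns the sum read, or -1
 * if any step behaves differently from what the model predicts. */
int sk_demo(void) {
    SkSnapshot s = { false, 0, 0, 0 };
    SkAccounts a = { { 100, 0 } };

    if (sk_transfer_begin(&s, &a, 30) != SK_OK) return -1;
    if (sk_freeze(&s) != SK_WAIT) return -1;   /* one operation still executing */
    sk_transfer_end(&s, &a, 30);
    if (s.started != s.completed) return -1;   /* freeze condition now holds */

    if (sk_transfer_begin(&s, &a, 10) != SK_WAIT) return -1; /* parked while frozen */
    int sum = a.shard[0] + a.shard[1];         /* safe to read in any order */

    if (sk_unfreeze(&s) != 1) return -1;       /* the parked transfer is re-enabled */
    return sum;
}
```

Reading the shards between sk_transfer_begin and sk_transfer_end would give a sum of 70, which is why the model refuses to report the freeze as complete while an operation is in the executing stage.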
Snapshot_all_create, Snapshot_all_destroy, Snapshot_all_freeze, Snapshot_all_unFreeze and Snapshot_all_sync are split-phase operations, and they must be called by one thread per processor on all processors. A split-phase operation returns OK if it has completed upon return (i.e., there is no need to create a thread to wait for the result), and WAIT otherwise.
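The OK/WAIT convention, together with the completion counter that several of the operations increment, can be sketched as follows. The SpStatus, SpCounter and SpRequest types and the sp_issue and sp_finish functions are assumptions made for this illustration; the real counter type and the signatures of the Snapshot_all_* calls are not shown in this section.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SP_OK, SP_WAIT } SpStatus;  /* assumed encoding of OK and WAIT */
typedef struct { int value; } SpCounter;   /* stand-in for the library counter */

/* A pending split-phase request: remembers which counter to bump on completion. */
typedef struct { SpCounter *ctr; int pending; } SpRequest;

/* Issue the operation. If `done_now` is nonzero it finishes immediately. */
SpStatus sp_issue(SpRequest *req, SpCounter *ctr, int done_now) {
    assert(req != NULL && ctr != NULL);
    req->ctr = ctr;
    if (done_now) {
        req->pending = 0;
        ctr->value++;          /* completion is signalled through the counter */
        return SP_OK;          /* caller need not wait */
    }
    req->pending = 1;
    return SP_WAIT;            /* caller waits on the counter instead */
}

/* Called later by the (simulated) runtime when the operation finishes. */
void sp_finish(SpRequest *req) {
    if (req->pending) {
        req->pending = 0;
        req->ctr->value++;
    }
}
```

A caller that receives SP_WAIT would typically create a thread that waits for the counter to reach the expected value, which is the situation the OK return lets it skip.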
Create a Snapshot object with id id. Use scheduler to deposit all threads created by the object. Increment the counter ctr after the object is created.
Destroy the Snapshot object id by awaking all operations suspended by the object and freeing up its storage. Increment the counter ctr upon completion.
Snapshot_all_destroy currently does nothing.
Freeze all data structures (with respect to the chosen operations) managed by the Snapshot object id. Snapshot_all_freeze completes on a processor when the number of operations that have started executing on the processor matches the number of operations completed on the processor. Increment the counter ctr upon completion.
When Snapshot_all_freeze completes, all operations that started executing before Snapshot_all_freeze was called have completed, and all operations called after Snapshot_all_freeze completes are suspended until Snapshot_all_unFreeze is called. Operations concurrent with Snapshot_all_freeze are either suspended or completed.
The data structure cannot be in a frozen state. There cannot be other outstanding Snapshot_all_freeze operations. All executing operations must guarantee to complete eventually.
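The per-processor completion condition for a freeze can be written as a comparison of two counters. The sketch below models it in one process; FzProcCounts, fz_local_done, fz_all_done and the processor count are hypothetical, and treating the collective freeze as "every processor is locally done" is an assumption of this model rather than a statement about the implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FZ_NPROCS 4   /* illustrative processor count */

/* Per-processor bookkeeping (hypothetical layout). */
typedef struct { int started; int completed; } FzProcCounts;

/* The freeze is locally complete when nothing is mid-execution on that processor. */
bool fz_local_done(const FzProcCounts *p) {
    return p->started == p->completed;
}

/* Model assumption: the collective freeze is done once every processor is. */
bool fz_all_done(const FzProcCounts procs[], int n) {
    assert(procs != NULL && n >= 0);
    for (int i = 0; i < n; i++)
        if (!fz_local_done(&procs[i]))
            return false;
    return true;
}
```

The requirement that executing operations eventually complete is what guarantees that the two counters on each processor meet, so that the freeze cannot wait forever.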
Unfreeze all data structures (with respect to the chosen operations) managed by the Snapshot object id. All operations suspended by the Snapshot object are re-enabled (i.e., their continuation threads are activated).
Snapshot_all_unFreeze must be called after the corresponding Snapshot_all_freeze completes. The objects must be in a frozen state.
Perform a Snapshot_all_freeze operation followed by a barrier and then a Snapshot_all_unFreeze operation. Increment the counter ctr upon completion. Snapshot_all_sync completes when all operations that started executing before it is called have completed. It is usually used in bulk synchronous programs to make sure all pending operations have completed before entering the next phase. Using Snapshot_all_sync eliminates the communication overhead for acknowledging each operation.
Snapshot_all_sync cannot be called when the object is frozen or is being frozen. For Snapshot_all_sync to have the expected behavior, the operations involved cannot wait for conditions other than those imposed by Snapshot_query.
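The freeze, barrier, unfreeze composition behind Snapshot_all_sync can be sketched in a single-process model, where the barrier is trivial and outstanding operations are drained by a progress callback. SySnapshot, sy_finish_one and sy_sync are hypothetical names introduced only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Single-process stand-in for a Snapshot object (hypothetical layout). */
typedef struct { bool frozen; int started; int completed; } SySnapshot;

/* Simulated progress: lets one outstanding operation run to completion. */
void sy_finish_one(SySnapshot *s) {
    if (s->completed < s->started)
        s->completed++;
}

/* Freeze, wait for pending operations, (barrier), unfreeze.
 * Returns how many progress steps were needed. The loop terminates only
 * because every executing operation is assumed to complete eventually. */
int sy_sync(SySnapshot *s, void (*progress)(SySnapshot *)) {
    assert(s != NULL && progress != NULL);
    assert(!s->frozen);                 /* model precondition: not already frozen */
    int steps = 0;
    s->frozen = true;                   /* freeze: no new operation may start */
    while (s->started != s->completed) {
        progress(s);
        steps++;
    }
    /* A real implementation would place the barrier across processors here. */
    s->frozen = false;                  /* unfreeze: next phase may begin */
    return steps;
}
```

In a bulk synchronous program a call of this kind would sit between two phases, replacing a per-operation acknowledgement with one collective wait.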
Determine if the caller can proceed to mutate the data structures managed by the Snapshot object id without causing inconsistency in the snapshot. If not, Snapshot_query returns WAIT and the caller must suspend itself (by creating a continuation thread and registering its status with Snapshot_suspend) until awakened by the Snapshot object. Otherwise, Snapshot_query returns OK, and the caller may register its new status with Snapshot_execute and enter the "executing" stage.
Register with the Snapshot object id that an operation has been suspended with the continuation thread tid. The thread tid is awakened when the operation can execute. The thread tid is attached to a counter maintained by the Snapshot object, and must be activated by the object only. The thread is awakened when the program invokes Snapshot_all_unFreeze on the object.
Once awakened, the thread tid must query the Snapshot object again (using Snapshot_query) before changing status.
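From the point of view of a mutator, the protocol is: query, then either suspend and query again after being awakened, or execute and complete. The sketch below models this in one process. All mu_* names and the MuSnapshot layout are hypothetical; in particular, the retry by the caller stands in for the continuation thread that real code would register with Snapshot_suspend.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { MU_OK, MU_WAIT } MuStatus;   /* assumed encoding of OK and WAIT */

/* Single-process stand-in for a Snapshot object (hypothetical layout). */
typedef struct {
    bool frozen;
    int  started, completed, suspended;
} MuSnapshot;

MuStatus mu_query(const MuSnapshot *s) { return s->frozen ? MU_WAIT : MU_OK; }
void     mu_suspend(MuSnapshot *s)     { s->suspended++; }
void     mu_execute(MuSnapshot *s)     { s->started++; }

void mu_complete(MuSnapshot *s) {
    assert(s->completed < s->started);  /* complete must follow execute */
    s->completed++;
}

/* One attempt to run a mutator that adds `delta` to *target.
 * Returns MU_WAIT if the operation had to be suspended; the caller is
 * expected to try again (and so query again) after the object is unfrozen. */
MuStatus mu_try_add(MuSnapshot *s, int *target, int delta) {
    if (mu_query(s) == MU_WAIT) {
        mu_suspend(s);      /* real code would register a continuation thread */
        return MU_WAIT;
    }
    mu_execute(s);          /* enter the executing stage */
    *target += delta;       /* the mutation itself */
    mu_complete(s);         /* leave the executing stage */
    return MU_OK;
}

/* Stand-in for unfreeze: clears the flag and releases the parked operations. */
void mu_unfreeze(MuSnapshot *s) {
    s->frozen = false;
    s->suspended = 0;
}
```

The second call to mu_try_add after mu_unfreeze goes through mu_query again, which mirrors the rule that an awakened thread must not change status without a fresh query.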
Register with the Snapshot object id that an operation has started executing (and possibly mutating the object), and is expected to complete and call Snapshot_complete on processor proc.
Snapshot_query must be called before Snapshot_execute to ensure it is safe to issue the operation. Once started, the operation must eventually complete to ensure Snapshot_all_freeze completes.
Register with the Snapshot object id that an operation has completed on the calling processor.
Snapshot_complete must be called after Snapshot_execute for the same operation to ensure the consistency of the snapshot. The processor that calls Snapshot_complete must be the processor specified in the corresponding Snapshot_execute call.
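The two pairing rules (complete follows execute, and on the processor named in the execute call) can be expressed as a small checker. PrOp, pr_execute and pr_complete are hypothetical names for this sketch; the library itself is not shown to perform such a check.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-operation record kept by the checker (hypothetical layout). */
typedef struct {
    bool executing;      /* true between execute and complete */
    int  complete_proc;  /* processor named in the execute call */
} PrOp;

/* Records that the operation has started and where it promises to complete. */
void pr_execute(PrOp *op, int proc) {
    assert(!op->executing);   /* an operation starts executing only once */
    op->executing = true;
    op->complete_proc = proc;
}

/* Returns true if the completion obeys both rules; otherwise leaves the
 * record untouched and returns false. */
bool pr_complete(PrOp *op, int calling_proc) {
    if (!op->executing || calling_proc != op->complete_proc)
        return false;
    op->executing = false;
    return true;
}
```

A check of this kind is mostly useful in debug builds, since a completion registered on the wrong processor would leave the counters on two processors permanently out of step.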