Killing Threads Considered Dangerous

Claudio Fleiner, Jerry Feldman, David Stoutamire
fleiner@, jfeldman@, davids@icsi.berkeley.edu
International Computer Science Institute
Berkeley, USA

Introduction

The parallel Sather (pSather) group continues its efforts to make parallel OO programming safe, efficient and relatively easy. Since the last POOMA meeting we have learned many lessons, some of which seem to be of general interest. All Sather releases since 1.0.6 in May 1995 have included pSather. The latest version runs on Solaris SMPs, networks of these linked by Myrinet or TCP/IP, and the Meiko CS-2. It is currently being used in a parallel computing course at Berkeley and for ICSI projects.

Sather is an object oriented language that has parameterized classes, object-oriented dispatch, statically-checked strong (contravariant) typing, separate implementation and type inheritance, multiple inheritance, garbage collection, iteration abstraction, higher-order routines and iters, exception handling, assertions, preconditions, postconditions, and class invariants. pSather is a parallel extension that addresses non-uniform-memory-access multiprocessor architectures but presents a shared memory model to the programmer. It extends serial Sather with threads, synchronization and data distribution. Unlike actor languages, multiple threads can execute in one object. It offers several synchronization mechanisms like futures, gates, mutex, reader/writer locks, barrier synchronization, rendezvous and a disjunctive lock statement.

The talk will briefly introduce the key features of pSather and the new modifications. These include disjunctive locks, visitor/ mutator protection and iterator yields from within a lock statement.

One of the most interesting lessons of the last year has been a major revision of how thread termination is handled. There was a "gate.clear" command that terminated all threads attached to a gate, regardless of their state. We eventually convinced ourselves that there is no way to do this that guarantees correct semantics. Meanwhile, we had been discussing whether it was worth adding disjunctive locks to the language. It turns out that disjunctive locks cleanly handle all the cases in which we were using asynchronous thread termination. We believe that both the negative and positive sides of this finding are general, but will be very interested in the experience of other groups.

Discussion of some new pSather features

A. Disjunctive Lock

pSather has a lock statement that allows a thread to safely acquire one or more locks at a time, guaranteeing correct unlocking and reasonable properties of fairness and deadlock avoidance even in the face of exceptions. The disjunctive lock is an extension that allows a thread to wait for one of several locks and, depending on the locks it acquired, execute some statements. Similar constructs are used in Ada (select statement) and in Occam (ALT statement), although both statements are used to select data from one or more input channels, like the select(2) system call in UNIX. The disjunctive lock can be used together with gates (a pSather object that implements a queue and can be used to synchronize threads) to get the same effects, but there are many other ways to use it. The general syntax of this lock statement is as follows:

	lock
	[guard condition] when lock_expr [, lock_expr] then
	   statements
	[guard condition] when lock_expr [, lock_expr] then
	   statements
	 ...
	[else statements]
	end

The lock_expr is supposed to return an object of type $LOCK. The standard libraries have classes for mutual exclusion locks (MUTEX), several flavours of reader/writer locks, barrier locks and rendezvous locks. As these synchronization objectst are standard pSather classes, one can add new lock types if necessary. When a thread enters a lock statement, it evaluates all guard conditions. All when branches with guard conditions that evaluate to false are ignored. The thread will then select a branch for which it can acquire all locks. If no branch is available the thread executes the optional else part, if present, or blocks. The system avoids thread starvation: if a thread can run, it will eventually do so. It also guarantees that eventually each of the different when clauses will be selected if its locks are free and their respective guard condition is true.

A thread that waits for input on one of several different gates (gates are FIFO queues with a blocking dequeue) could be written as follows:

	loop
	   lock
	   when gate1.not_empty then work_on(gate1.dequeue);
	   when gate2.not_empty then work_on(gate2.dequeue);
	   end;
	end;

Note that gate1.not_empty does not return a boolean, but an object of the abstract type $LOCK. In this case it returns a lock object which provides mutual exclusion , but which cannot be locked until a certain condition (here: gate is not empty) holds. This incorporates an older insight; signal conditions (not_empty) must be lockable to ensure that the condition still holds when the signalled thread resumes. As long as the thread holds this lock, no other thread can enqueue or dequeue a value from the gate.

B. Thread Termination

During the last year we convinced ourselves that there is no clean way to asynchronously terminate a thread in an object oriented language. Simply stopping a thread is clearly not an option, as such a thread could leave some broken objects behind, like a linked list where only some of the pointers have been updated. If pSather had such a construct, it would also need a way to protect against it. However, a thread that shields itself against the kill signal and calls a function that blocks (like gate.dequeue) cannot be killed while blocked and might never be unblocked.

A programmer that relies on the fact that a thread can be killed would have to know exactly which functions can block and make sure that the thread does not shield itself against kill signals while calling those functions. Shielding makes code reuse difficult too, as one has to know for each function if it is supposed to be called with or without shielding.

Of course an interuptable thread can poll a flag while it is running; the problem is terminating a blocked thread. One idea is to raise a special exception in this thread. This removes some of the problems above, but it is still not easy to write correct programs. Besides, such a program can only use libraries that are aware of the fact that a clear exception can happen and that shield themselves against it to make sure that no broken objects are left behind. Take for example the following code fragment:

	1.  protect
	2.     e:=gate.dequeue;
	3.  when CLEARED_EXCEPTION then
	4.     clean_up;
	5.  end;

If the gates are not designed to handle clear exceptions correctly, the gate may be in a unsafe state after the exception occurred and hence may not be used anymore (neither by this thread nor by another thread). If it is written correctly, we do not know in line 4 if a value has been dequeued, and even if we knew that it had been dequeued (by checking the size of the gate, for example), there is no guarantee that e holds this value, as the exception could have occurred after gate.dequeue returned, but before the return value has been assigned to e. To do it correctly one has to write

	 1.  got_value:=false;
	 2.  protect
	 3.     lock gate.not_empty then
	 4.        shield
	 5.           e:=gate.dequeue; got_value:=true;
	 6.        end;
	 7.        work_on(e);
	 8.     end;
	 9.  when CLEARED_EXCEPTIONthen
	10.     if got_value(e)then safe_e end;
	11.  end;

Note that although we know in line 10 whether e has been dequeued or not, we have no information on how far work_on(e) has proceeded. To get this information we would have to use additional flags inside work_on(). This example shows that although it is possible to write code that works with CLEARED_EXCEPTION it is far from easy.

Some additional problems occur with bound routines (a Sather construct similar to closures and function pointers in C). Suppose we have a clever search routine to find database records: As an input parameter it gets a bound routine that decides for each record if it is the one being sought. This search routine could start several threads that run through the database in parallel and stop as soon as any one has found a matching record. Unfortunately this works only if either the bound routine is shielded against the clear exception (which works only if it will never block) or has been designed to correctly catch the exception. As the compiler has no way to check for either of those conditions, it cannot help in detecting errors and hence the programmer may easily introduce hard to find bugs.

Not only would it be troublesome to write all libraries in a way that they are safe to use with a CLEARED_EXCEPTION, it would also slow them down whenever this functionality is not needed (the protect as well as the shield command need some instructions and the former has to store the current frame information to make it possible to jump back if an exception happened). Even though the killing of a thread should occur at most once it would slowdown the complete program.

The above problems convinced us that it is not possible to have an easy, safe and efficient parallel, object oriented language language that offers a way to asynchronously stop a thread. Instead, we introduced the disjunctive lock discussed above. A thread that needs to be stopped asynchronously can test a flag while executing and, if it waits on locks, it can simultaneously wait on a special lock which will signal termination. The appendix shows how a producer/consumer can be implemented in pSather such that all consumer threads stop as soon as all producers have finished and all items generated by them have been consumed - without any spinning.

C. Thread-safe Libraries

Over the last several years a large number of classes have been developed for Sather, and we would like to use most of them in pSather too. Any Sather class can be used in pSather as long as the programmer assures that all accesses to it are serialzed. However, we would like to provide classes that guarantee serializations. For most classes this means to put a lock statement in most routines.

We identified three different types of functions:

The designation of methods as visitor or mutator is part of library design. For example, methods that sort an array must be mutators and block access by visitors.

The code for initializing, updating, and testing iteration variables is often complex and error prone. The code to step through complex containers such as hash tables typically must utilize the detailed internal structure of the container, sometimes causing duplication of virtually the same code in many places. Ensuring correct synchronization while iterating introduces additional pitfalls.

To remedy this, Sather encourages encapsulation of iteration, abstracting iteration operations as part of the interface of the container class rather than a part of the code in the client of the container. Sather iterators are like routines, but instead of returning a value they can yield and later resume execution while still holding a lock. For example, most container classes define elt!:T as shown above, which yields another element on every call.

We implemented this protocol by adding a reader/writer lock to each such object and wrapping mutator and visitor methods with lock statements as shown in the following code:

	class LIST{T} < $LIST{T} is
	   include VISITOR_MUTATOR;  -- includes the code for the RW_LOCK
	   enqueue(e:T) is
	      lock mutator then
	         -- enqueue element
	   end; end;

	   head:T is
	      lock visitor then
	         return head_element;
	   end; end;

	   elt!:T is
	      lock visitor then
	         current_element:=head_element;
		 loop while!(~void(current_element));
	            yield current_element;
		    current_element:=current_element.next;
	   end; end; end; 

This shows a very important feature of the current design. The elt! iterator yields each element of the list one at a time with the guarantee that the list structure does not change during the loop. Unlike routine, iterator calls within lock statements do not give up the lock until the loop completes. The following loop uses this iterator to print out all elements of the list.

	loop #OUT + list.elt! + ", "; end; 

When elt! executes for the first time it locks the list as a visitor. This lock will only be released when the loop finishes and it guarantees that the list does not change during the loop. Until recently we did not allow yields inside locks, but we convinced ourselves that we needed this feature to implement thread-safe iterators.

Conclusion

The lessons we learned during the implementation of pSather during the last year on different architectures resulted in several new essential features, including the disjunctive lock and the possiblity to yield inside a lock. On the other hand we removed the "clear" function with the result that a pSather thread cannot be killed asynchronously.

We believe that those changes have brought us one step closer to a safe and efficient parallel object oriented language for SMPs and SMPs connected through high speed networks.

pSather Compiler

The currently available Sather/pSather compiler does not yet offer the possibilities discussed here, but a new version of the compiler should be available soon. Please check out the following WWW pages for more information (source code, language specification, examples and tutorials) and a list of papers and references about Sather and pSather (most of the papers are available online):

Sather home page: http://www.icsi.berkeley.edu/~sather
pSather home page: http://www.icsi.berkeley.edu/~sather/psather.html
Source code is available from ftp://ftp.icsi.berkeley.edu/pub/sather

Appendix

This pSather program shows how to implement producer/consumer with correct termination.

	class PRODUCER_CONSUMER is
	   producer(comm:GATE{INT}) is -- enqueue 1..5 in the gate
	      loop comm.enqueue(1.upto!(5)); end;
	   end;
	   consumer(comm:GATE{INT},prod:ATTACH) is
	      loop 
	            lock
		    -- as long as the gate is not empty, we
		    -- dequeue an element and print it out
		    when comm.not_empty then
		       #OUT+comm.dequeue+'\n';
		    -- if the gate is empty and all producers have
		    -- finished their work, the consumer ends too.
		    when comm.empty, prod.no_threads then
		       return;
	   end; end; end; 

	   main is
	      prod::=#ATTACH;  -- all producer threads are attached to this object
	      comm::=#GATE{INT}; -- gate used to send values to the consumers

	      loop 5.times!; -- create 5 producers that run in parallel
		 prod:-producer(comm); end;

	      parloop 3.times!; -- create three consumers
	      do consumer(comm,prod); end;
	      -- main thread waits here until all consumers have ended
	   end;
	end;