CMT and MT

(Giri Mandalika has a new blog entry on how to use dbx's thread-related commands. Check it out!)

This paper at ACM Queue is an excellent analysis of the state of the art in threading issues and problems, and what the development tools can do about it.

Software and the Concurrency Revolution

I've been thinking along these lines myself recently, so it was nice to see someone else had done the work of putting it all into writing. There are a few things I would add. Hmmmm. It turned out to be more than a few things by the time I finished writing what follows. The contents of the paper are much more worth reading than my humble thoughts. Go read that first, and then come back and read my comments. The paper itself is fairly easy to read, and not that long.

Back already? Anyway, here are my thoughts on the ideas presented in the paper.

Who needs MT tools?

There are lots of apps that are well-designed from the ground up to be multi-threaded. These apps all have a consistent strategy for dealing with shared synchronization. I think the developers on these projects are not actually clamoring for new thread-related tools and support. (Maybe they are. If so, clamor louder. Let me hear from you. ;-] ) The Solaris kernel used to use a home-grown tool called Warlock to do static checking of mutexes (this later turned into Lock Lint), but they don't seem to use it much anymore.

The apps I have heard from that need the most help with finding deadlocks and tuning synchronization are very large serial apps that were turned into multi-threaded apps without being redesigned.

This doesn't mean we don't need tools to help with this; it just changes my perspective a little. If you are in the midst of redesigning a large app, you need a whole bunch of other tools too. You need static analyzers to sort out your dependencies, you need source browsers, you need interface checkers, etc. None of these other features is specific to MT programs. But I think MT programs have the general effect of ratcheting up the level of tool support you need across the board.

The best way to synchronize

The best way to manage the dependencies (any dependencies) between two modules is not to have any dependencies. The best way to share direct access to a data structure between two modules is not to do it. The best way to safely share synchronization between two modules is not to share it. If at all possible, modules should use synchronization internally only, on the data that is controlled by that module itself. This doesn't mean that a module can be converted from thread-unsafe to thread-safe without disturbing the user of that module. Very often the API will need to be modified to export methods and functions that are structured in such a way that they can be implemented efficiently in terms of synchronization.

For example, if you have a module that contains a group of items that are managed internally to the module, the last thing you want to do is offer an iterator. If your module needs to support selecting a subset of these items, you want to offer a "search()" function that takes a predicate function as an argument, and returns a list of item-references. That way the module itself can be responsible for iterating in a safe and efficient way over its own collection.

As part of exporting this search() interface, you would document that the module prohibits reentry back into itself from the predicate (at least for the same module instance). Ideally, you would enforce that restriction within the module itself.
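To make this concrete, here is a minimal C++ sketch of the idea. The ItemRegistry class and all of its names are invented for illustration; the point is only the shape of the interface. The module iterates over its own collection under its own lock, hands back copies, and enforces the documented no-reentry contract by remembering which thread is currently inside search():

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical module that owns a collection and its synchronization.
// Callers never iterate the collection directly; they hand in a
// predicate and get back a list of matching items.
class ItemRegistry {
public:
    void add(int item) {
        std::lock_guard<std::mutex> guard(lock_);
        items_.push_back(item);
    }

    // The registry iterates over its own collection, under its own
    // lock, and returns copies, so no internal state escapes.
    std::vector<int> search(const std::function<bool(int)>& pred) {
        // Enforce the no-reentry contract: if this thread is already
        // inside search(), the predicate has called back into us.
        if (owner_.load() == std::this_thread::get_id())
            throw std::logic_error("predicate re-entered ItemRegistry");
        std::lock_guard<std::mutex> guard(lock_);
        owner_.store(std::this_thread::get_id());
        std::vector<int> result;
        try {
            for (int item : items_)
                if (pred(item))
                    result.push_back(item);
        } catch (...) {
            owner_.store(std::thread::id{});
            throw;  // re-throw whatever the predicate threw
        }
        owner_.store(std::thread::id{});
        return result;
    }

private:
    std::mutex lock_;
    std::atomic<std::thread::id> owner_{};  // thread inside search(), if any
    std::vector<int> items_;
};
```

A predicate that calls back into the same registry gets a std::logic_error instead of silently deadlocking on the non-recursive mutex, which is one way a module can enforce its own contract.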

The paper listed above touches on these issues when it talks about the benefits that functional programming languages offer (or don't offer).

locking around calls to other modules

The paper mentions several times that using locks around calls to other modules can provoke deadlock. This is a little bit oversimplified. It is really only true when you call a module that can call back into the current module, or if you share synchronization objects with the module that you're calling. In these circumstances, other problems occur, like data structures not being made robust to reentrancy. Calling a well-defined module while holding a lock doesn't have to be problematic.
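As a sketch of the safe discipline, here is a hypothetical two-module example in C++ (Cache and Logger are invented names). Cache finishes its own critical section and releases its own lock before calling out, so Logger's callback into Cache cannot deadlock on the same mutex:

```cpp
#include <cassert>
#include <mutex>

struct Cache;

// Hypothetical second module that calls back into the first.
struct Logger {
    Cache* cache = nullptr;
    int last_seen = -1;
    void log(int n);  // re-enters Cache::size(); defined below
};

struct Cache {
    std::mutex lock;
    Logger* logger = nullptr;
    int entries = 0;

    int size() {
        std::lock_guard<std::mutex> g(lock);
        return entries;
    }

    void insert() {
        int n;
        {   // critical section: touch only our own data under our lock
            std::lock_guard<std::mutex> g(lock);
            n = ++entries;
        }   // our lock is released here, before we call out
        // If this call were made while still holding `lock`, the
        // callback below into size() would deadlock on the same mutex.
        logger->log(n);
    }
};

void Logger::log(int) {
    // Safe only because Cache dropped its lock before calling us.
    last_seen = cache->size();
}
```

The call out to Logger is made with no locks held; the value the callee needs was captured inside the critical section. That is the whole trick, and it is why calling a well-defined module while holding a lock is only dangerous when the call can come back around.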

This "reentrancy" issue is also a big deal for distributed applications. A distributed application that falls into a strict "client/server" model is fairly straightforward, but many distributed apps have multiple independent participants, or have callbacks from the server to the client. In those cases reentrancy can bite you pretty easily.

distributed programs

Many tools for dealing with MT programs are just as applicable to distributed programs. To say this in different words, many tools for dealing with shared-memory parallelism are also applicable to distributed-memory parallelism. MT programs are more likely to need such tools because their threads are by nature more tightly coupled.

sedimentation as simplification

Sedimentation is what happens in a software system over long periods of time: functionality present in the app is gradually implemented in the platform or the software framework the application is built on. (I know I've seen the term used before for software, but I don't see that usage on Google.)

An application today may consist of multiple interacting modules. As some of these modules sediment into the platform (whether that is Solaris, Linux, or the Java VM), the remaining software (that is considered "the app") becomes simpler and more maintainable. Of course, at the same time new functionality is being added to the app, so the net change could be toward simpler or more complex.

A parallel program today might become a serial program in a few years, when it is redesigned to use a new module available from the platform. An OpenMP program today might be rewritten to call the performance library tomorrow. From the application's point of view, it has gone from being a parallel program to being a serial program. The Solaris OS is multi-threaded, but that doesn't make every program that runs on Solaris multi-threaded. What's important is not whether multiple threads are running inside your program somewhere. What's important is whether thread creation and thread synchronization need to be managed by your program.
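A tiny illustration of that point, with invented function names: today the application parallelizes a reduction itself with OpenMP, and after that functionality sediments into a tuned platform library, the app-side code collapses to a single library call that is serial from the app's point of view:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: today the app manages the parallelism itself.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    double sum = 0.0;
    // The pragma is simply ignored when compiled without OpenMP support.
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < (long)a.size(); ++i)
        sum += a[i] * b[i];
    return sum;
}

// After sedimentation, the body might shrink to one call into a vendor
// performance library, something like:
//   return cblas_ddot(n, a.data(), 1, b.data(), 1);
// and the app no longer manages any threads itself.
```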

The action that I have described here as "sedimentation" implies a huge amount of work: testing, documentation, consensus building (to create an open, accepted standard for a platform extension), and so on. One side-effect of all that work is that the new module (as part of the platform) is easier to use than it was before. This is a natural part of the software evolutionary cycle.

All of the problems described in the paper are made worse by having a large complex app with poorly defined interfaces and poor modularity. One way of looking at the future of threads is that the need to add threads will force an application to be "cleaned up": it will increase pressure to make the parts of the app more modular, which will increase testability, decrease shared synchronization, and so on.

Benefiting from CPU performance growth

quote:

The concurrency revolution is primarily a software revolution. The difficult problem is not building multicore hardware, but programming it in a way that lets mainstream applications benefit from the continued exponential growth in CPU performance.

Exactly which apps need to take advantage of the "continued exponential growth in CPU performance"?

There are large applications and systems today that are not CPU bound, but are instead performance constrained by the network, or disk I/O, or constrained by a poor design, or constrained by the performance of the platform they run on top of.

For applications that are CPU bound today, some will need to become more heavily threaded to take advantage of CMT technology. Many of these apps are already threaded, and will need incremental work to break their work into smaller chunks and so increase the effective number of threads that can contribute.
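One sketch of what "smaller chunks of work" means in practice (chunked_sum and everything in it are invented for illustration): instead of splitting the input into exactly one piece per thread, split it into more chunks than there are hardware threads, so an idle thread always has another chunk to pick up:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical sketch: decompose one big task into many small,
// independent chunks.  A real application would feed the chunks to a
// thread pool; here std::async stands in for that machinery.
long chunked_sum(const std::vector<long>& v, std::size_t chunks) {
    if (chunks == 0)
        chunks = 1;
    std::vector<std::future<long>> parts;
    std::size_t n = v.size();
    std::size_t per = (n + chunks - 1) / chunks;  // ceiling division
    for (std::size_t begin = 0; begin < n; begin += per) {
        std::size_t end = std::min(begin + per, n);
        // Each chunk is an independent task with no shared mutable state.
        parts.push_back(std::async(std::launch::async, [&v, begin, end] {
            return std::accumulate(v.begin() + begin, v.begin() + end, 0L);
        }));
    }
    long total = 0;
    for (auto& f : parts)
        total += f.get();  // combining the partial results is serial
    return total;
}
```

The more (reasonably sized) chunks there are, the more threads can contribute, and the less a single slow chunk dominates the total running time.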

For applications that are not CPU bound, there are also very good reasons to thread your application to increase performance, but those reasons are not related to CMT technology. The advent of CMT technology doesn't really represent a big change in the way apps need to be developed. It just amplifies a need for better MT tools that we've had for a long time.



Comments:

A quick thought about "The best way to synchronize" and "locking around calls to other modules":

I don't think one size fits all here.

One case is the STL, which doesn't provide any thread safety internally but is fairly straightforward to use in an MT environment (by protecting all access externally).

On the other end of the spectrum, CORBA components are generally already MT-safe and don't require additional synchronization.

I don't think either approach is "wrong". The key, as Scott Meyers puts it in "Effective STL", is to "Have realistic expectations about the thread safety of STL containers". I think we can expand what Scott is saying about the STL to all components.

As a component author, feel free to choose whichever threading strategy best suits your component's purpose; just make sure the contract is obvious!

As a component user, make sure you are familiar with the threading contract before you get under way! This is where open-source makes a huge difference; sometimes you have to "use the force" to determine what the contract is ...

Posted by Ken Sedgwick on February 01, 2007 at 07:08 AM PST #

Wow, that's a lot to read when I look at yours, the article you link, and the article it mentions as well!

Antonio V. (http://blogs.sun.com/swinger) has occasionally talked about the multi-threaded and dual/multi-core topic as it relates to development, with some focus on Java. I have put my thoughts on the subject down to discuss with him, since it's something I've always wondered and thought about.

When I get home tonight I'll read those and try to give my two cents on it then. But we are closing in on a project deadline here at work (team lead & proj lead say, 'More Code, less fluff!!'), so I better get back to that. Oye!

Posted by Jeffrey Olson on February 01, 2007 at 07:08 AM PST #

I read the articles, and my current thought is that until the masses are better educated in threading/parallelization, the tools they use need to offer some sort of support to find issues; yet as pointed out, that is very, very unlikely due to the complexity of it.

Which then leads back to the whole question of educating the masses on the proper usage of such things... and I'm not sure that's a reasonable expectation either. I know from the people I work with that they talk it up like it's second nature, but they have yet to do anything that truly uses these paradigms, and the attempts they have made haven't truly succeeded. But the one thing I do know about them: they will try, learn, retry, relearn, until they know it. It might take a good while, but they will master the concepts and hopefully figure out the best ways to use them.

Yet this wasn't even really on their radar, because of the way CPUs progressed and the lack of a need, in their mind's eye, to truly know such things.

But with these ideas being talked about more and more, I think the biggest question is whether these early days of mass developer adoption will work out to the good of all, or whether poor execution of the concepts will lead the general consumer to think there is no benefit in this.

I'm cheering for adoption with good execution and absorption of the proper use of such things. But time will be the indicator.

Posted by Jeffrey Olson on February 01, 2007 at 07:08 AM PST #

About

Chris Quenelle
