Ulrich Drepper @ 2007-05-12 10:49:00
The Growing Importance of Parallel Programming
At the 2007 Red Hat Summit in San Diego, which wrapped up just yesterday, I gave a talk about parallel programming, which the marketing folks retitled
Programming for tomorrow's high speed processors, today.
The crux of the talk is that programmers in the future cannot always rely on improving hardware to make their programs run faster. This is summarized nicely in the following graph which I generated from performance data for x86 processors.

The crucial part is the divergence of the two lines going forward and the flattening of the blue line. This means programs which cannot take advantage of the ever-increasing number of processing cores simply won't run (much) faster.
Parallel programming is hard. Algorithms have to be changed to allow more than one thread to run in parallel. Well, not necessarily threads: especially on Linux, one should use processes if the sharing requirements between them make this feasible.
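To make the process-based variant concrete, here is a minimal sketch, assuming a POSIX system. The helper names (sum_range, sum_two_processes) are made up for illustration. The work is split into two halves that share nothing but a single result value, which is exactly the situation where a process plus a pipe is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sum v[lo] .. v[hi-1]. */
static long sum_range(const int *v, size_t lo, size_t hi)
{
    long s = 0;
    for (size_t i = lo; i < hi; i++)
        s += v[i];
    return s;
}

/* Sum an array using two processes. The child handles the upper half and
   sends its partial sum back through a pipe; the parent handles the lower
   half. Returns 0 on success, -1 on error. */
int sum_two_processes(const int *v, size_t n, long *out)
{
    assert(out != NULL);

    int fd[2];
    if (pipe(fd) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: the array is visible through the copy-on-write address space. */
        close(fd[0]);
        long part = sum_range(v, n / 2, n);
        ssize_t w = write(fd[1], &part, sizeof part);
        _exit(w == (ssize_t) sizeof part ? 0 : 1);
    }

    /* Parent: compute the lower half while the child works on the upper one. */
    close(fd[1]);
    long lower = sum_range(v, 0, n / 2);

    long upper = 0;
    ssize_t r = read(fd[0], &upper, sizeof upper);
    close(fd[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (r != (ssize_t) sizeof upper || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    *out = lower + upper;
    return 0;
}
```

The only communication is one long written to the pipe. Once the two halves need to share much more than that, threads (or shared memory) start to look more attractive.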
Data structures have to be laid out correctly to allow a) vectorization and b) data parallelization. Vectorization is important if one wants to come even close to the peak performance listed for the processor. But to do this you also have to know a lot about CPU design (pipelines etc.), caches, and memory.
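As an illustration of what "laid out correctly" can mean, here is a small sketch comparing an array-of-structures layout with a structure-of-arrays layout. The type and function names are hypothetical, and whether a given compiler actually vectorizes either loop depends on the compiler version and flags.

```c
#include <assert.h>
#include <stddef.h>

/* Array of structures: the x, y, z of one point are adjacent, so a loop
   that only touches x walks memory with a stride of three floats. */
struct point_aos { float x, y, z; };

/* Structure of arrays: all x values are contiguous, so the same loop
   becomes a unit-stride access pattern that maps well to SIMD loads. */
struct points_soa { float *x, *y, *z; };

/* Scale the x coordinate of every point, AoS layout (strided access). */
void scale_x_aos(struct point_aos *p, size_t n, float f)
{
    assert(p != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        p[i].x *= f;
}

/* Scale the x coordinate of every point, SoA layout (unit stride). */
void scale_x_soa(struct points_soa *p, size_t n, float f)
{
    assert(p != NULL);
    float *x = p->x;
    for (size_t i = 0; i < n; i++)
        x[i] *= f;
}
```

Both functions compute the same thing. The difference is purely in how the data sits in memory, and therefore in how much of each cache line the loop actually uses.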
And then there is something people might have heard about but that didn't really register: co-processors are back. Intel's Geneseo and AMD's Torrenza are technologies to couple third-party processors tightly to the existing processor-memory mesh.
In general I think the industry is entirely ill-prepared for these upcoming changes. Many, if not most, programmers are not able to write code that meets these requirements. Companies and other organizations will have to invest in education. System providers (like Red Hat) have to find ways to make parallel programming easier.
One big step in the right direction is OpenMP. It is officially supported in gcc 4.2, and Red Hat has backported the changes to our gcc 4.1 used in RHEL5 and Fedora Core 6 and later. Not only does OpenMP allow relatively easy conversion of existing code, it also frees the programmer from dealing with all the details of thread lifetime handling, thread stacks, etc. Even mutual exclusion happens at a higher level. All this is good. It will make programmers more productive, if only it is used more often.
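To show how little has to change in existing code, here is a minimal sketch of two ordinary loops annotated with OpenMP directives. The function names are hypothetical. Built with gcc -fopenmp the loops run in parallel. Built without that flag the pragmas are ignored and the same code runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Dot product. The reduction clause gives every thread a private copy of
   sum and combines the copies at the end; no thread creation, stack setup,
   or locking appears in the source. */
double dot(const double *a, const double *b, long n)
{
    assert(n >= 0);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Count even values. Mutual exclusion is expressed with a critical
   directive instead of an explicit mutex. (A reduction would be faster
   here; critical is used only to show the construct.) */
long count_even(const int *v, long n)
{
    assert(n >= 0);
    long count = 0;
    #pragma omp parallel for
    for (long i = 0; i < n; i++) {
        if (v[i] % 2 == 0) {
            #pragma omp critical
            count++;
        }
    }
    return count;
}
```

Compare this with what the same code would look like with pthread_create, a per-thread argument structure, pthread_join, and a pthread_mutex_t, and the productivity argument becomes obvious.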
But there is one more thing: the OpenMP runtime is basically in complete control. It can decide on using just one thread or many threads. It can decide where to run threads and many more things. All these details are hidden from the programmer. This is a good thing since it allows the runtime to perform optimizations. I'll have more about this at a later date.
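A small sketch of what "in control" looks like from the program's side: the code below never says how many threads to use, it only asks the runtime afterwards what it decided. The function name team_size is made up for illustration. The omp.h include is guarded so the code also builds without -fopenmp.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Return the number of threads the OpenMP runtime chose for a parallel
   region. The program does not request a specific count; the runtime
   picks it (influenced, for instance, by the OMP_NUM_THREADS environment
   variable). Without OpenMP support this simply returns 1. */
int team_size(void)
{
    int n = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        /* Only the master thread records the team size, so there is
           exactly one writer to the shared variable n. */
        #pragma omp master
        n = omp_get_num_threads();
    }
#endif
    return n;
}
```

Running the same binary with different OMP_NUM_THREADS settings, or on machines with different core counts, can produce different answers without recompiling. This is the freedom the runtime has.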
In summary, programmers have to learn about parallelism, some again and some for the first time. I think the topic of this talk is very important. If you are a Red Hat customer you could potentially ask for somebody from Red Hat to come in and talk about these issues. I'll give the slides and the details to our consulting organization and possibly also to the sales engineers. I cannot make any promises, but I'll encourage those gals and guys to be willing to talk about this. If you're a big enough customer and you demand it, I might (have to) come out myself. Or somebody can organize gatherings in places I have to go to anyway and have me speak there.