Ulrich Drepper

Dictatorship of the Minorities [May. 25th, 2005|08:51 pm]
Minorities can certainly always wreak havoc on the freedom of others. There have been plenty of examples throughout history where small groups dictate to the masses. This almost always happens through violence (dictatorships based on 1984-style mind control haven't become known as of today).

Now you'd think that a dictatorship of minorities couldn't happen in a realm where there is no violence and generally democratic or republican control. I'm of course talking about groups connected through the Internet. More specifically: collaborative development.

But this is where one would be wrong. There often is such a dictatorship. The worst part is that it is self-inflicted. The more structured and democratic of these collaborating groups give themselves guidelines, and this is where things can go wrong. The guidelines are written, formally or just as a common understanding, in an effort to maximize collaboration. This means getting as many people involved as possible and useful, and the guidelines reflect the situation at a given point in time.

This has a number of problems:

  • it is not always a good idea to maximize the participation

  • the circumstances in which the project lives change

The second point is certainly acceptable to everybody; the first needs some explanation. The fundamental problem is that configuration options are bad, be it at runtime or at compile time. Ideally there is one configuration which works everywhere. Every new configuration increases complexity, not linearly but exponentially, since each option might influence every other option. This is a disaster not only for users but also for developers: it means exponential growth of the testing effort, which of course won't happen, and therefore the code is basically untested. In practice this means that often only one or two configurations are really tested. Any use of another configuration is probably doomed to failure in any non-trivial project.

A second aspect is that multiple configurations divert energies. Despite claims in the press, most collaborative software development projects are chronically undermanned. Many configurations mean that at least part of the energy one person devotes to the project is of no interest to a second person who does not care about the affected configuration option. Integrate this over time and you get a lot of wasted energy.

Examples of this are plentiful. Many of the GNU projects are ported to a wide variety of platforms, even to undeserving ones like Cygwin and MinGW. The use of autoconf makes this to some extent quite easy (I'll write more about whiners who complain about the auto* tools at some later point in time). In the early days of the GNU project, a time most of those using the code today know nothing about, this was somewhat necessary since there was no GNU system (or simply a free+libre system) which could be used for the development. But that time is over. We have such a free+libre system today: Linux. There is no need to develop on those proprietary or irrelevant systems anymore. It does not further the cause of free software.

But even for Linux the situation is clouded. The OS (kernel + everything else) has been ported to many architectures. While I don't care a bit about it, I concede that having it available for some embedded architectures like ARM is useful. And there certainly should be more than one mainstream architecture to keep all the players honest, so I welcome PPC in addition to x86/x86-64. But there is no benefit at all in supporting something like PA-RISC, m68k, CRIS, etc. as configurations of the generic code.

So the question is: why are there all these configurations? One answer is: because of violent minorities supporting such configurations. And "supporting" here mainly in the sense of demanding their existence, not in the sense of helping out with the project which supports these configurations. Another answer is: because people are too nice, induced by the spirit of the democratic/republican model of control.

Are these all unverified rants on my side? Well, judge for yourselves:

  • IMO the most notorious case is how gcc development is held hostage by Edelsohn, and maybe IBM as a whole, by the demand that everything always work perfectly well on AIX. How often has one seen "this patch breaks AIX, back it out"? It cannot reasonably be expected that everybody tests on AIX. It is a proprietary OS running on proprietary and expensive hardware which not many people have access to. The overall development speed could be significantly improved by dropping the AIX requirement which, in some form or another, has been agreed upon by the steering committee. AIX is irrelevant in general today, i.e., the situation has changed. And the people on the steering committee are too nice to just tell the very small minority causing the problem to take a hike.

  • Deployment on Windows systems requires many programs to be crippled. There are still problems with something as fundamental as upper- and lowercase characters, the huge differences in APIs, and many more. Crippling here means not only that the programs' functionality isn't what it could be; it also means that through the additional indirection used to resolve the API differences the code gets bigger and slower. And of course there is the problem that people used to Unix programming generally don't know enough about Windows programming to avoid causing breakage, and vice versa.

  • Looking at the complexity needed to cover up the differences between architectures should convince everybody that for every additional configuration people have to spend more time on a change than would otherwise be needed (a concrete example: the Debian distribution of the older days, where all architectures were required to work at the same level). Take the Linux kernel. The main code is mostly free of #ifdefs for architectures, but at the price of having very complex macro and inline function combinations everywhere. Even if one learns how to use the abstraction layer for one architecture, that doesn't mean one knows all the details necessary to avoid breaking another architecture. There is always ample proof of this in the patches for non-mainstream architectures which fix up changes originally tested only on a few architectures. This additional work is thankfully not required of most patch authors (Linus recognizes this problem), but resolving the breakage is still time consuming for the general public (because Linus is too nice and does not dismiss the minority configurations).

The fight to save the software world from the evil of proprietary, IP-enforced, non-transparent software has only just started. One could argue that only a small piece of the puzzle is in place and that the upcoming tasks are much bigger. Look at required work like free MQ subsystems or free controlled runtimes: they are huge and haven't been tackled successfully in a free software project so far. This is why I encourage project leaders to take a good look at their projects and reconsider previous decisions:

Don't let Minorities dictate the direction!

Is it really necessary to support all these configuration options? Which ones are hardly used? It almost certainly helps the majority, in terms of development speed and testing quality, to drop the options used only by minorities. This does not mean minorities are deprived of their rights. This is free software; nobody can take away the code. But make the special interest groups pay the price for their options. For source code this means requiring the extra code to be maintained outside the project's sources. If necessary, create an infrastructure to enable out-of-tree sources. For binaries it might mean somebody will have to recompile the binaries with additional patches and configuration parameters to get access to the extra functionality.

Which are the OS targets that should be supported? Support for proprietary OSes should be dropped. Companies like Sun and HP have for far too long gotten away without providing a decent compiler in the default configuration, since there was always gcc. The Unix tools of most Unix OSes were/are appalling in quality and people didn't care since there were always the GNU tools (just read the studies made in the early '90s where garbage was fed to the various tools). Free software should only support free OSes, and even among those the group needs to be trimmed significantly (ideally to one). Having just one is no problem (unlike in the case of processor architectures) since the entire code is free.

Which architectures are worth supporting? There the situation is even more complex and interesting. Not only do we have to look for irrelevance (what percentage cares about VAX or PA-RISC support?), we also have to look at the level of added complexity the support requires. Some ABIs are just deliberately defined to be different from others (see IA-64), which requires huge amounts of effort to be spent. There are also significantly diverging capabilities (e.g., the lack of atomic operations on too many architectures). This far too often leads to unnecessarily crippled code, since writing code in a way which allows optimal use in all situations is very difficult. The solution must be to restrict support to only a handful of architectures which are supported in the project. All other support must happen outside the tree, and therefore all that work has to be done by the special interest groups.

I don't want to say we follow all these points perfectly, but for a big project glibc certainly comes closest. The only OSes supported in the tree are Linux and, to some extent, Hurd. The latter is certainly not a preferred target and does not hold back development by demanding that no patch break the Hurd configuration. The set of architectures supported in the tree has been cut and probably should be cut even more. Only the architectures which are regularly tested (since they are supported in the various enterprise Linux distributions) are maintained, plus a few historically supported architectures. The latter group should be cut down to the minimum. No global edits across the tree are required; they are voluntary. I.e., fixing configurations which are negatively impacted by a patch is the duty of the people with an interest in the configuration. All other configurations are maintained (or not) outside the tree. People can decide to use this additional repository at compile time to get access to the additional configurations. Nobody is robbed of their freedom, but the costs are distributed fairly.

For my new projects the razor is even sharper. Only Linux is supported, and only the few interesting mainstream architectures with reasonable ABIs are maintained. Support for architectures with deliberately different ABIs (i.e., IA-64) can be contributed. No other configuration is supported, actively or not; people would have to exercise their right to add patches or fork the entire code to add other support.

IMO this is the only viable development model for free software to succeed, since the remaining projects are huge in size and need concentrated efforts, not efforts diluted by the diverging interests caused by a myriad of configurations.

There are undoubtedly people who will want to flame me to death. But these people are almost certainly all members of said violent minorities who want to force their opinions on the majority. Save your breath; I don't care and I certainly won't change my mind on this. As in every society since the dawn of time, the benefits of the majority must outweigh the rights/benefits of the minority. If the cost of preserving the rights/benefits of the minority becomes too high for the majority, the losses must be cut. This is how the world works, as can be seen every day in the business news. Nice people almost never win, especially not when their opponents are pure evil.