Rule #3: Divide and Conquer

Rule #3: Divide and Conquer


A constant pattern in the corporate environment is the gathering of resources, but with the free exchange of information inherent in commons-based projects, the pattern of choice is the dispersal of resources. This presents certain design challenges, which manifest themselves in the Unix-style "small sharp tools" approach to specialization; encourage "bottom-up design"; and most importantly require easy-to-obtain, shared, free standards for data interchange between programs. When every train car is to be made by a separate builder, it is essential that the rail gauge is constant and known.

RULE #3: Divide and Conquer

For large projects, establish a platform and interface standard, making it easy to contribute small, independent, pluggable elements

Concentrate on just one of these elements yourself

Community based project participants only tolerate very limited "buy-in" to platforms and standards and thus seek systems that will let them work more or less independently, with only the minimal interface requirements being placed on them by the platform.

Whatever interface requirements do exist must be explained succinctly and clearly so as to make the barrier to the platform as low as possible.

As much as possible, all new platforms should take advantage of existing interface standards, both to allow direct use of existing design resources targeted for those standards, and to minimize the learning curve of new standards as they arise.

Single projects should try to do just one thing very well, rather than trying to provide lots of features. This promotes use of the package as well as the platform it is built on, and encourages other features to be contributed by others.

Deconstructing "GNU/Linux"

When I tried to compare the size in "source lines of code" (SLOC) between "Debian GNU/Linux 'Sarge'" and "Windows Vista", the first problem to arise was that there really is no direct free software analog to the "Windows operating system". Instead of one single monolithic development project, the free community produces a swarm of smaller projects. However, by choosing a popular selection of projects, it's possible to build an "equivalent function" alternative to Windows, as presented in figure 10.1.

Figure 10.1: A comparison of the source lines of code and COCOMO-estimated equivalent cost of an equivalent stack of free software to what ships with Windows Vista (Repeated from Figure 10.1: A comparison of the source lines of code and COCOMO-estimated equivalent cost of an equivalent stack of free software to what ships with Windows Vista (Repeated from "Impossible thing #1: Debian GNU/Linux", Figure 1.3)

This figure certainly dispels the notion that Windows code might be smaller than Debian because of inefficiency in the free software code (indeed it suggests the opposite might be true), but what it also shows is a pattern of specialization wherein functionality is separated as much as technically possible and each functional unit is provided by separately-managed projects. Instead of one "operating system" project (for an ever-expanding definition of "operating system") like the proprietary "Windows Vista", the equivalent free software functionality is a "stack" of several completely-independent projects, each with a more narrowly-defined range of functionality.

Instead of one single monolithic development project, the free community produces a swarm of smaller projects

One thing figure 1 does not show is the range of choice that is also possible. This particular stack (glibc + Linux + GNU utilities + X.org + KDE + Mozilla) is only one popular choice out of many. This choice is made possible by the fact that each layer in the stack adheres closely to published interface standards. With relatively few problems, any of the stack layers can be swapped out with alternative programs providing similar functionality. Figure 10.2 shows an assortment of the options available. Even considering that this not a complete list of objects, the number of possible combinations (over 2000) is staggering!

Figure 10.2: Alternate choices are available for each layer in the free software stack that represents the same functionality as Microsoft's single Figure 10.2: Alternate choices are available for each layer in the free software stack that represents the same functionality as Microsoft's single "Windows operating system". Although some combinations are unlikely, these options represent 2×4×3×4×7×3 or 2016 possible "complete operating system stacks". This is a staggering combinatorial explosion of user options—made possible by adherence to standard interfaces

Some of the combinations shown (e.g. BSD libc + HURD + Xvfb + Kwin + XFCE + Nautilus) are pretty unlikely to be in actual use and therefore may be very buggy, but they should be possible. The greatest bottleneck is actually the X Windows system, since all of the examples given here are really forks or parts of the same project. Of course, given that these are free-licensed projects, one might wonder why anyone bothers creating "duplicate" functionality.

The truth is, though, that as with programming languages or word processors, there are a lot of fine differences which affect personal preferences among users, and there are fundamental differences in design philosophy and tools among developers. Additionally, there are disagreements about which licenses are "free enough" for certain parties' interests (e.g. there probably wouldn't have been a separate BSD libc, except for the desire of BSD developers to avoid dependencies on GPL software).

The use of smaller, more highly specialized programs, rather than big flagship applications, is a long-standing Unix environment tradition

The use of smaller, more highly specialized programs ("small sharp tools"), rather than big flagship applications, is of course a long-standing Unix environment tradition (even for proprietary Unix). It represents good engineering design in any context. But the pressures of expediency near the end of release cycles in the commercial proprietary software world tend to cause a breakdown in this engineering discipline. At the same time, the distributed and relatively disorganized community development environment, lacking any formal command structure, cannot sustain much in the way of large-scale coordination efforts. Thus, the community's very nature tends to reinforce the important engineering design principle of "separation of concerns".

"Too Many Cooks Spoil the Broth"

The wisdom of capitalist/corporate industry is to gather resources and put them under the control of one strong leader. Regimented control of the entire industrial process is thought to be essential to avoid conflict and waste when many people are trying to work together. Our capitalist economic system is designed to support this idea by providing means to concentrate wealth in the hands of those who are perceived as having the best chance of using the resources successfully and beneficially for society. Indeed, the system is named after this practice of "gathering capital".

Figure 10.3: The commercial/proprietary way of creating information products relies heavily on centralized control (Image Credit: Scott Maxwell / http://www.lumaxart.com / CC By-SA 2.0)Figure 10.3: The commercial/proprietary way of creating information products relies heavily on centralized control (Image Credit: Scott Maxwell / http://www.lumaxart.com / CC By-SA 2.0)

In addition of course, the system is competitive, creating a situation where corporate entities with sufficient capital often reproduce the same basic effort and the redundant products compete for market share. The theory is that the one that wins the competition will be a better product, thus encouraging corporate entities to put in their best effort and the public is provided with superior products.

Unfortunately, this depends on the public being able to tell when one product is genuinely better than another, and with things as complex as software, the comparison process can be extremely difficult. As a result, various proxies and superficial comparisons become more important than the core functionality of the software, and maintaining the extra burden of "feature wars" detracts from the engineering effort to produce good code and concentrate on the features which are most needed. Perceived value also relies heavily on "psychic value" established through advertising and on "network effects" after a winner begins to emerge. Thus a product may emerge quickly on the basis of a completely irrational value proposition, then hold that position due to the need for compatibility with others who are already sold on it.

The wisdom of capitalist/corporate industry is to gather resources and put them under the control of one strong leader

The most familiar example is probably the Microsoft Windows operating system, but there are many others. This is not a peculiarity of bad business practices from just one company, but rather a broad systemic fault caused by the incentive system and poor metrics available to the consumer—simply because the product is so complex and its inner workings are so secret that it is really impossible for a buyer to make an intelligent decision about which system to buy into.

In software, this approach leads to enormous flagship software applications with highly complex internal structure, great difficulty of maintenance, and poor interoperability with competitors (probably by design).

"Many Hands Make Light Work"

The wisdom of commons-based peer production is to require few common resources and make the ones that are needed as publicly-documented as possible and as available as possible to as many people as possible. This eliminates barriers to entry and participation on projects—which, without a strong profit motive (lost because copy sales are not profitable for free-licensed information products), is essential to the success of community-based projects.

Money can be made by users of a free licensed product, but they have limited ability to profiteer from it because they cannot exclude others from competition in order to maintain the limited monopoly position required to raise prices and thus offset their capital investment in the product. Instead, the contribution of individual participants in a free design project has to be regarded as part of their simple "overhead" costs, and thus, each individual can only afford a much smaller investment.

Figure 10.4: In the community development model, individual, finely-chosen pieces of the overall problem are picked and worked on in isolation by very small development teams (the most common size is one lone developer). The clear, complete, and simple design and specification of interfaces becomes very important in this model, because there is little opportunity to challenge, revise, or query them for unwritten details once the actual development process begins(Image Credit: Modified from works by Scott Maxwell / http://www.lumaxart.com / CC By-SA 2.0)Figure 10.4: In the community development model, individual, finely-chosen pieces of the overall problem are picked and worked on in isolation by very small development teams (the most common size is one lone developer). The clear, complete, and simple design and specification of interfaces becomes very important in this model, because there is little opportunity to challenge, revise, or query them for unwritten details once the actual development process begins(Image Credit: Modified from works by Scott Maxwell / http://www.lumaxart.com / CC By-SA 2.0)

This makes it essential both to contain the effort required so that it does not undermine the participant and also to leverage that effort as highly as possible by exploiting the contributions of others. By contrast, competitive advantage in the design serves little purpose since it has already been given away. As a result, the participants are motivated to minimize their collective redundant effort and maximize interoperability and extensibility of their contributions.

The wisdom of commons-based peer production is to require few common resources and make the ones that are needed as publicly-documented as possible and as available as possible to as many people as possible

This makes it critical to divide projects up into small manageable chunks; to make those chunks independent enough to allow development by unaffiliated parties; and to make all remaining interdependencies readily-available public knowledge. These requirements impose strict engineering disciplines of "separation of concerns" and "design by contract" in project designs.

Even in the case of "competing" packages like KDE and Gnome, which do essentially the same thing, there is communication of ideas. If something works well enough in Gnome, it will tend to be borrowed into KDE. Thus while competition does serve a positive role in allowing different ideas to be tried, and in serving different users with different needs and preferences, it does not lead to the negativity of a "zero sum game". Instead, a new "competitive advantage" may simply be ported from one package to another (as a quick example, look how quickly "window tabs" spread among major free software applications).

Common gauges and social codes

Standards-based interface design and high levels of component-ization mean that projects which might otherwise be seen as single very large projects deconstruct into dozens or even thousands of tiny projects. Ideally each such project falls within the capability of a single independent developer (in reality some engineering problems are hard to break down, and small formal or informal teams are needed).

Figure 10.5: Any system that must interact with other systems has to follow an interface, just as trains must run on tracks of a particular shape and size (called a Figure 10.5: Any system that must interact with other systems has to follow an interface, just as trains must run on tracks of a particular shape and size (called a "gauge"). The fact that there are many different track gauges in the world, occasions some serious design problems: the tracks on the left are designed to support two different gauges, while the picture on the left shows the wheels of a train being changed with passengers aboard, as must happen when crossing some European borders. There are many more such design adaptation indignities in the world of software, even if they are less visible to the end user. Much of the work of developers is simply getting around such obstacles (Photo Credits: Les Chatfield/CC By 2.0, Mark Tristan CC By-SA 2.0)

Of course this imposes a kind of design discipline which is extremely beneficial to the long-term engineering stability of the systems so-designed, and which is nevertheless extraordinarily rare in the products of corporate industry. It is this design discipline which is largely responsible for the perception of engineering superiority and practical robustness of open source software designs: systems of many interacting, interchangeable parts are much more fault-tolerant than "brittle" systems designed as monolithic pieces with high levels of interdependence between separately developed components.

Ideally each project falls within the capability of a single independent developer

As a result, the free software community is largely organized around data standards (like XML, HTML, SVG, ATOM, JSON, or ODT etc) rather than individual software packages (like MS Word or Adobe Illustrator). To hide such a standard or obfuscate it any way is seen as obstructionist and unconscionable (an attitude which is unfortunately not so prevalent in other engineering disciplines where standards are often only available for purchase—and are sometimes very expensive, effectively shutting-out small-scale developers).

Troubleshooting Tips

Factor, Factor, Factor

Especially if you come from a systems-engineering background, you may have thought of your project as one giant machine to be built. Dividing the problem up into functional blocks is something you usually think of as an internal engineering problem.

Unfortunately, such projects require a lot of "buy-in" from potential developers. They have to overcome the learning curve before they can begin to contribute to your project (or even use it). This puts off a lot of potential contributors who might really support your project if they could try it a little piece at a time.

Spend some time brainstorming about the various pieces of your project. What else can these things do? Try to break the project up into a lot of smaller, multi-purpose tools; make them as independent as possible; and present each of these to the community as a separate project (this kind of reorganization is called "re-factoring").

It's okay to describe your "master plan" somewhere, but keep the emphasis on the individual components. Other people will contribute to them for their own reasons.

By far, my own most successful projects have been "spin-off" projects like this: ideas I developed because I needed them as components, but which could be re-used by others.

Putting It All Together

It has been said that the structure of programs mirrors the structure of the organizations that created them, and this is true of free software as well as proprietary. A loose aggregation of developers with different interests and needs and very limited individual resources naturally tends to produce a large collection of narrowly-focused programs. A practical necessity of this structure is the use of well-defined, freely-available, and easy-to-implement standards.

These two patterns of design, imposed by the natural structure of commons-based development, happen to coincide with important principles of good engineering design (particularly the separation of concerns and well-defined interfaces) which can be hard to enforce in a commercial environment where time-pressure and expediency of communication can put short-term gains ahead of long-term stability.

With such a chaotic sea of unaffiliated project development, though, how is it that large scale structures (such as entire operating systems and distributions) arise in free software?

It has been said that the structure of programs mirrors the structure of the organizations that created them, and this is true of free software as well as proprietary

In the commercial environment, organization comes first: A "manager" sets goals, makes guesses about the difficulty of implementation, and assigns various "teams" to overcome them. By contrast, in the community environment, organization comes last: vague goals are suggested (often by high-profile community opinion leaders), self-selected "developers" choose to solve some of the intervening problems, and then finally "packagers" search for already existing packages; test and ensure interoperability with other packages according to selected standards and policies; and then build a "distribution" or "stack" which combines them into a working whole.

Figure 10.6: When common standards are used, separate development groups produce pieces that can be assembled into a larger whole by Figure 10.6: When common standards are used, separate development groups produce pieces that can be assembled into a larger whole by "packagers" (an important role with free software) after the fact, rather than before creating the elements (Image Credit: Scott Maxwell / http://www.lumaxart.com / By-SA 2.0)

Hard as this is in practice, it is remarkable that it is generally feasible. As long as the individual pieces have been designed against good interface specifications and tested against good implementations of them, the chance of fitting the pieces together into compiled, consistent, and functional code is quite high. Clearly some software-writing must happen in this adaptation task, and occasionally requests get back to the original developers, but for the most part the integration process is kept simple.

"Packagers" like the Debian Project are, in a way, the greatest heroes of the free software movement, because they are the ones who make the system "just work", despite its patchwork origins.

Author information

Terry Hancock's picture

Biography

Terry Hancock is co-owner and technical officer of Anansi Spaceworks. Currently he is working on a free-culture animated series project about space development, called Lunatics as well helping out with the Morevna Project.

Most forwarded

Interview with Dave Mohyla, of DTIDATA

Dave Mohyla is the president and founder of dtidata.com, a hard drive recovery facility based in Tampa, Florida.

TM: Where are you based? What does your company do?
DTI Data recovery is based in South Pasadena, Florida which is a suburb of Tampa. We have been here for over 10 years. We operate a bio-metrically secured class 100 clean room where we perform hard drive recovery on all types of hard disks, from laptop hard drives to multi drive RAID systems.

Anybody up to writing good directory software?

Since the very beginning, directories (of any kind) have had a very central role in the internet. (I have recently grown fond of Free Web Directory. Even Slashdot can be considered a directory: a collection of great news and invaluable user-generated comments. As far as software is concerned, doing a quick search on Google about software directories will return the free (as in freedom) software directories like Savannah, SourceForge, Freshmeat and so on, followed by shareware and freeware sites such as FileBuzz, PCWin Download Center and All Freeware (great if you're looking for shareware and freeware, but definitely less comprehensive than their free-as-in-freedom counterparts).

Interview with Mark Shuttleworth

Mark Shuttleworth is the founder of Thawte, the first Certification Authority to sell public SSL certificates. After selling Thawte to Verisign, Mark moved on to training as an astronaut in Russia and visiting space. Once he got back he founded Ubuntu, the leading GNU/Linux distribution. He agreed on releasing a quick interview to Free Software Magazine.

Is better education the key to finding better software?

I read David Jonathon's article Anybody Up To Writing Good Directory Software? the other day, which got me thinking about software directories in general. As David mentioned, many of the software directories one finds when doing a quick google search are free as in beer, not as in freedom. But what interests me is the software directories that already exist, providing a combination of both free as in beer software, and open source software. Sites such as Freeware Downloads and Shareware Download don't advertise themselves as providing free as in liberty software, but each of them have a good selection of open source software available... if you know where to look.

Most emailed

Free Open Document label templates

If you’ve ever spent hours at work doing mailings, cursed your printer for printing outside the lines on your labels, or moaned “There has got to be a better way to do this,” here’s the solution you’ve been looking for. Working smarter, not harder! Worldlabel.com, a manufacture of labels offers Open Office / Libre Office labels templates for downloading in ODF format which will save you time, effort, and (if you want) make really cool-looking labels

Creating a user-centric site in Drupal

A little while ago, while talking in the #drupal mailing list, I showed my latest creation to one of the core developers there. His reaction was "Wow, I am always surprised what people use Drupal for". His surprise is somehow justified: I did create a site for a bunch of entertainers in Perth, a company set to use Drupal to take over the world with Entertainers.Biz.

Update: since writing this article, I have updated the system so that the whole booking process happens online. I will update the article accordingly!

So, why, why do people and companies develop free software?

More and more people are discovering free software. Many people only do so after weeks, or even months, of using it. I wonder, for example, how many Firefox users actually know how free Firefox really is—many of them realise that you can get it for free, but find it hard to believe that anybody can modify it and even redistribute it legally.

When the discovery is made, the first instinct is to ask: why do they do it? Programming is hard work. Even though most (if not all) programmers are driven by their higher-than-normal IQs and their amazing passion for solving problems, it’s still hard to understand why so many of them would donate so much of their time to creating something that they can’t really show off to anybody but their colleagues or geek friends.

Sure, anybody can buy laptops, and just program. No need to get a full-on lab or spend thousands of dollars in equipment. But... is that the full story?

Fun articles

Santa Claus - the most successful open source project

It dawned on me the other day, as I was shopping for the dozens of gifts it seems I have to buy every December, that Santa Claus is the most successful open source project in history. (Bridget @ Illiterarty would agree with that). Santa Claus is essentially a marketing development that is embodied by everyone who stuffs a sock, gives a gift, hosts a dinner or wishes Merry Christmas over the holiday season.

Most emailed

Editorial

When I first started thinking about Free Software Magazine, I was feeling enthusiastic about the dream. I had Dave, Gianluca, and Alan willing to help me, I had established members of the free software community willing to help me out, I had writers volunteering their time and energy for free, and I had a generous offer from OpenHosting for servers, all before I'd proved myself. There was a sense of excitement in the air, and I thought maybe, just maybe, I could make this work.

Free Software Magazine uses Apollo project management software and CRM for its everyday activities!