Impossible thing #1: Debian GNU/Linux

Impossible thing #1: Debian GNU/Linux


With any paradigm shift, it is difficult to see the new world from the old one, even though it is glaringly obvious once you've crossed over. Empirical evidence is one way to bridge the gap. Let's look at some solid evidence for the success of what is probably the most obvious "impossible" achievement of commons-based peer production: free software, as exemplified by the Debian GNU/Linux distribution.

First, of course, we need to start with the myth, promulgated by a couple of decades of proprietary software dominance with an economic model founded an the manufacturing analogy:

Myth #1

"Free software development can create simple programs and utilities, but it's useless for anything large scale"

This may seem almost comic in the pages of Free Software Magazine, but I still hear this myth spouted by otherwise intelligent people in other communities. It's such a strongly held belief because it is all that conventional wisdom allows for works created by "amateurs" in their "spare time". The professionalist belief is that only paid professionals can produce quality, well-engineered designs and that professionals cannot possibly be paid to work on something you can get "for free".

How Big is "Big"?

It's difficult for people to compare the scale of free and proprietary products with each other. First of all, it's tempting to say that software is "worth what you charge"—that is to say, that you could evaluate it by its sale price. In that case, proprietary software is a multi-billion dollar industry, so how can you even compare it to the free-licensed software industry which doesn't charge per-copy costs?

We don't value software for software's sake. We value using it to do stuff

However, we don't value software for software's sake. We value using it to do stuff. Thus, use value is the relevant metric, not sale value. Unfortunately, that too is hard to count, though you can look at the figures for how many people are using free software.

But there is another way to ask the question: if you had to start from scratch and re-build the existing body of software, using a proprietary development company, how much would it cost you? At the very least, free software must be worth the effort that people are putting into it (or they'd stop working on it). So, how much is that?

Measuring the size of "free software" is not easy. First of all, you have a hard time finding it all. Then there's deciding what is worth counting and what is just cruft. Fortunately, this has largely already been done for us—in the form of Debian GNU/Linux, the world's most complete free software GNU/Linux distribution.

There is a project being conducted by the LibreSoft Research Group[1] to collect and analyze metrics data for free software projects based on "source lines of code" (SLOC), a relatively easy-to-measure statistic for any kind of software. These data are generated by David Wheeler's SLOCCount[2] program, which uses COCOMO[3], a long-standing and fairly simple model, to estimate the costs of software projects under the assumptions of centralized, managed software development. Now this is probably not an accurate measure of the actual effort cost of developing free software (although I leave it as an exercise for the reader to guess whether it's too high or too low), but it does provide an order-of-magnitude estimation and a consistent metric which allows for comparisons. Figure 1.1 shows the results of this estimation, showing the growth of the distribution with each release.

Figure 1.1: Equivalent cost of producing Debian GNU/Linux if it were to be developed in a conventional centrally-managed development setting. This can be regarded as an approximate lower bound for its Figure 1.1: Equivalent cost of producing Debian GNU/Linux if it were to be developed in a conventional centrally-managed development setting. This can be regarded as an approximate lower bound for its "use value" (because the community had to want it that much to spend that much effort on it).

Now these are very large numbers. It's easy to lose perspective. So let's throw in some comparable project costs to give an idea of the scale of project we are talking about (Figure 1.2).

Figure 1.2: For scale comparison, here are the costs of Debian (as well as GNU and Linux, considered separately), along with the actual reported costs of NASA space mission development projects (these do not include extended operations costs and the cost for the Space Shuttle is only from initial design to the 1981 maiden flight of Columbia).Figure 1.2: For scale comparison, here are the costs of Debian (as well as GNU and Linux, considered separately), along with the actual reported costs of NASA space mission development projects (these do not include extended operations costs and the cost for the Space Shuttle is only from initial design to the 1981 maiden flight of Columbia).

Yes, that's right. The estimated equivalent cost of developing Debian 4.0 "Etch" is not quite three-quarters of the actual development cost of the Space Shuttle[4]. And that's in adjusted dollars[5] (inflation has been applied to these numbers to make for a fair comparison, otherwise Etch would seem much more expensive than the Shuttle). If that doesn't convince you that free software projects can be large and complex, I don't know what would.

The estimated equivalent cost of developing Debian 4.0 "Etch" is not quite three-quarters of the actual development cost of the Space Shuttle

But is it Better?

"Measuring programming progress by lines of code is like measuring aircraft building progress by weight." —— Bill Gates

The large size of free software projects could simply be due to inefficiency. After all, both Microsoft Windows and Mac OS X have been criticized on these grounds as being "bloated". However, comparing the size of Microsoft Windows to the entire Debian GNU/Linux distribution is like comparing an apple to an entire orchard: Debian is not an "operating system", it's a complete collection of software. The equivalent in the proprietary world would be something like "the entire software inventory of CompUSA".

So, to make things comparable, let's strip down the Debian numbers so that we only include the things that provide equivalent function to what comes out of the box with, say, Microsoft Windows "Vista" (estimated at 50 million SLOC). So, we'll need what computer scientists (as opposed to marketing executives) call the "operating system"—which includes the Linux kernel, the GNU libraries, and the GNU utilities (not the entire GNU project, which today is dominated by application software packages like Gimp). Then we'll also need the graphical windowing environment, which means the X server packages, and—since Microsoft continues to bundle it—the web browser, which is from the Mozilla project. Microsoft Office is not bundled with Windows, so we'll leave out OpenOffice.org.

This allows us to compare "equivalent stacks" of software:

Figure 1.3: The free software stack providing the same functionality as Microsoft Windows does Figure 1.3: The free software stack providing the same functionality as Microsoft Windows does "out of the box". The left side compares by source lines of code, while the right compares COCOMO-estimated cost (based on the assumption that the free software projects are developed as independent projects and Windows is developed as one unit).

Not only is the free software stack less than half the size of Windows, but, because it is factored into distinct packages, it is easier to maintain. It could be that the need to divide the work among independent projects forces engineering discipline to be followed, and it is apparent that this pays off in the long run in more compact code (it's also interesting to note that this Debian source code also supports eleven binary builds for different CPU architectures, while Windows only supports one).

The need to divide the work among independent projects forces engineering discipline to be followed

So, not only does free software represent a vast amount of effort, but it is apparently very well-engineered and efficient effort leading to an even higher use value than equivalent proprietary products! Not only can free software manage large, complex projects, but it appears to do it better than proprietary methods.

Notes

[1] LibreSoft Research Group: Debian Counting. The published numbers include the overall totals and the results for each Debian source package. I have grouped packages using dependency rules and string-searches (taking advantage of Debian's package naming conventions) in order to produce the aggregates used in the Windows comparison as well as the divisions in the bar charts.

[2] SLOCCount (Software Package). Program used by the Debian Counting project to generate their results.

[3] COnstructive COst MOdel. Used by the SLOCCount program.

[4] Space Shuttle Program Cost Testimony of Mr. Robert F. Thompson, taken from the Columbia Accident Investigation Board public hearing on Wednesday, April 23, 2003. Places the cost of the Shuttle at an estimated $5.15 billion in 1971 dollars or an actual $8.5 billion in 1981 dollars.

[5] Measuring Worth. Online source and calculator, demonstrating six different deflation formulae used to compare different kinds of cost over time. The one most appropriate for large government projects is the "GDP Deflator", which is what I used here to adjust space project costs. All of the cost numbers have been converted to year 2000 dollars for comparison purposes.

Terms

free software: In this book, the expression "free software" is always used in the jargon sense of "software offered under a free license" as described in the Free Software Foundation's "Free Software Definition". This is a somewhat unfortunate choice of jargon, as it is frequently misunderstood, but it is generally the preferred term used within the free software development community. For most practical purposes, the term is synonymous with "open source software".

SLOC: Source Lines Of Code are used as a metric of the size and complexity of a software package, and can, in bulk, be used to estimate the effort required to create it. It obviously can be inaccurate in specific cases, but is a good objective metric which is easy to measure.

Author information

Terry Hancock's picture

Biography

Terry Hancock is co-owner and technical officer of Anansi Spaceworks. Currently he is working on a free-culture animated series project about space development, called Lunatics as well helping out with the Morevna Project.

Most forwarded

Interview with Dave Mohyla, of DTIDATA

Dave Mohyla is the president and founder of dtidata.com, a hard drive recovery facility based in Tampa, Florida.

TM: Where are you based? What does your company do?
DTI Data recovery is based in South Pasadena, Florida which is a suburb of Tampa. We have been here for over 10 years. We operate a bio-metrically secured class 100 clean room where we perform hard drive recovery on all types of hard disks, from laptop hard drives to multi drive RAID systems.

Anybody up to writing good directory software?

Since the very beginning, directories (of any kind) have had a very central role in the internet. (I have recently grown fond of Free Web Directory. Even Slashdot can be considered a directory: a collection of great news and invaluable user-generated comments. As far as software is concerned, doing a quick search on Google about software directories will return the free (as in freedom) software directories like Savannah, SourceForge, Freshmeat and so on, followed by shareware and freeware sites such as FileBuzz, PCWin Download Center and All Freeware (great if you're looking for shareware and freeware, but definitely less comprehensive than their free-as-in-freedom counterparts).

Interview with Mark Shuttleworth

Mark Shuttleworth is the founder of Thawte, the first Certification Authority to sell public SSL certificates. After selling Thawte to Verisign, Mark moved on to training as an astronaut in Russia and visiting space. Once he got back he founded Ubuntu, the leading GNU/Linux distribution. He agreed on releasing a quick interview to Free Software Magazine.

Is better education the key to finding better software?

I read David Jonathon's article Anybody Up To Writing Good Directory Software? the other day, which got me thinking about software directories in general. As David mentioned, many of the software directories one finds when doing a quick google search are free as in beer, not as in freedom. But what interests me is the software directories that already exist, providing a combination of both free as in beer software, and open source software. Sites such as Freeware Downloads and Shareware Download don't advertise themselves as providing free as in liberty software, but each of them have a good selection of open source software available... if you know where to look.

Most emailed

Free Open Document label templates

If you’ve ever spent hours at work doing mailings, cursed your printer for printing outside the lines on your labels, or moaned “There has got to be a better way to do this,” here’s the solution you’ve been looking for. Working smarter, not harder! Worldlabel.com, a manufacture of labels offers Open Office / Libre Office labels templates for downloading in ODF format which will save you time, effort, and (if you want) make really cool-looking labels

Creating a user-centric site in Drupal

A little while ago, while talking in the #drupal mailing list, I showed my latest creation to one of the core developers there. His reaction was "Wow, I am always surprised what people use Drupal for". His surprise is somehow justified: I did create a site for a bunch of entertainers in Perth, a company set to use Drupal to take over the world with Entertainers.Biz.

Update: since writing this article, I have updated the system so that the whole booking process happens online. I will update the article accordingly!

So, why, why do people and companies develop free software?

More and more people are discovering free software. Many people only do so after weeks, or even months, of using it. I wonder, for example, how many Firefox users actually know how free Firefox really is—many of them realise that you can get it for free, but find it hard to believe that anybody can modify it and even redistribute it legally.

When the discovery is made, the first instinct is to ask: why do they do it? Programming is hard work. Even though most (if not all) programmers are driven by their higher-than-normal IQs and their amazing passion for solving problems, it’s still hard to understand why so many of them would donate so much of their time to creating something that they can’t really show off to anybody but their colleagues or geek friends.

Sure, anybody can buy laptops, and just program. No need to get a full-on lab or spend thousands of dollars in equipment. But... is that the full story?

Fun articles

Santa Claus - the most successful open source project

It dawned on me the other day, as I was shopping for the dozens of gifts it seems I have to buy every December, that Santa Claus is the most successful open source project in history. (Bridget @ Illiterarty would agree with that). Santa Claus is essentially a marketing development that is embodied by everyone who stuffs a sock, gives a gift, hosts a dinner or wishes Merry Christmas over the holiday season.

Most emailed

Editorial

When I first started thinking about Free Software Magazine, I was feeling enthusiastic about the dream. I had Dave, Gianluca, and Alan willing to help me, I had established members of the free software community willing to help me out, I had writers volunteering their time and energy for free, and I had a generous offer from OpenHosting for servers, all before I'd proved myself. There was a sense of excitement in the air, and I thought maybe, just maybe, I could make this work.

Free Software Magazine uses Apollo project management software and CRM for its everyday activities!