Open source isn’t the community you think it is

The irony is that what makes open source work—and differ from commercial software—is that only a few developers do the major work on any project

Comments

Name your favorite open source project, and the odds are good—very good—that a small handful of contributors account for the vast majority of significant development thereof. The odds are just as good that most of those contributors work for just one or a few vendors. Such is open source today, and such has been open source for the past 20 years.

So, does that mean open source is really just commercial software by another name?

No, it does not. But it means the popular stereotype of a broad community coming together to create software is a myth. The reality of open source is different than the myth, but still a good, positive alternative to commercial software.

Why only a few vendor-paid developers do almost all the work

Thirteen years ago, I dug into academic research that showed how Mozilla’s Firefox browser and the Apache HTTP Server were both developed by a small cadre of core contributors. While the population of contributors broadened with things like bug fixes, the central development work for these and virtually all other projects was done by a talented group of core committers.

Today, an analysis from Redmonk’s Fintan Ryan on projects housed under the Cloud Native Computing Foundation shows nothing has changed. Kubernetes is the most famous CNCF tenant, with Google and Red Hat contributing the lion’s share of code, but the other, lesser-known CNCF projects follow this same pattern. Indeed, perhaps the only real surprise in this fact of concentrated contributions is that the pattern has remained constant for so long.

Look at any CNCF project, Ryan has shown, and you’ll see that virtually all of its contributions come from fewer than ten people. In fact, if you drill down deeper, you see that most work is done by just two people on any given project.

As Ryan has written:

It is fair to say that for almost all of the projects in the CNCF, specific vendors account for most of the development work being done.

This is not to say that this is a bad thing—it is not; it is just a statement of reality. While the broad community around the projects may be large, the number of significant core contributors is relatively small, and the number of truly independent contributors is smaller still. This pattern is common across many open source projects.

Not just “many” open source projects—all of them. I can’t think of a significant counterexample. For big, diverse projects like Linux, if you peel away the overall wrapping and count contributors for the subprojects, you see the same phenomenon: A few developers, nearly all of them employed by vendors, generate a huge percentage of core contributions.

But if you step back, you realize it could only be thus. After all, anysoftware project degrades in efficiency the more bodies you throw at it (as Fred Brooks’s seminal book The Mythical Man Month anticipated).

As for why most of these developers would be funded by vendors, that’s easy to explain, too: Developers have rent to pay, too, and they can only afford to heavily contribute if they are paid to do so. Companies, pursuing their corporate self-interest, employ developers to work on projects that help their business.

Smart vendors understand how to use this to their advantage. Red Hat, for example, devoted part of its most recent earnings call to tout its Kubernetes contributions (second only to Google). As CEO Jim Whitehurst argued, those contributions let Red Hat both influence Kubernetes’s roadmap as well as better support its customers. Contributions, in short, give it a competitive advantage in selling Kubernetes.

What “community” really means for open source

So, is “community,” that mythical beast that powers all open source, just a chimera?

The easy answer is “no.” That’s also the hard answer. Open source has always functioned this way.

The interesting thing is just how strongly the central “rules” of open source engagement have persisted, even as open source has become standard operating procedure for a huge swath of software development, whether done by vendors or enterprises building software to suit their internal needs.

While it may seem that such an open source contribution model that depends on just a few core contributors for so much of the code wouldn’t be sustainable, the opposite is true. Each vendor can take particular interest in just a few projects, committing code to those, while “free riding” on other projects for which it derives less strategic value. In this way, open source persists, even if it’s not nearly as “open” as proponents sometimes suggest.

Is open source then any different from a proprietary product? After all, both can be categorized by contributions by very few, or even just one, vendor.

Yes, open source is different. Indeed, the difference is profound. In a proprietary product, all the engagement is dictated by one vendor. In an open source project, especially as licensed under a permissive license like Apache 2.0, there’s always the option for a new developer or vendor to barge in and upset that balance. Kubernetes is a great example: Google started as the sole contributor but Red Hat (and others) quickly followed.

No, this doesn’t help the casual corporate contributor that wants influence without making a sacrifice of code, but it does indicate that it’s possible to have an impact on an open source project in ways that proprietary products don’t afford.

In short, there’s little to fear and much to celebrate in how open source works. Indeed, it is precisely this self-interested seeking of individual corporate (or personal) benefit that should keep open source flowering for decades to come.

As should be evident 20 years into open source’s rise, the model works at both the community level and at the vendor level. Will it work for another 20? Yes.