Wednesday, November 4, 2009

Starting Open Source Software

About a month ago, I wrote some guest posts for the Collective Imagination blog at ScienceBlogs.* Now that they've run their course over there, I thought I'd re-post them here in case you missed them the first time. Here is part 2 of 4:
In the first part of this article, I discussed the differences between free software and open source software. In a Venn diagram, these are concentric rings: the definition of "free software" lives within the definition of "open source software", which itself sits within the space of all kinds of software. That is, free software is a subset of open source software. So for the rest of this post, let me use the term "open source software" generically.

Let's look at open source software using a real-world example. To me, the FreeDOS Project will always be the first example I look to, so I'll use that. It should speak to the commitment of the open source software community that FreeDOS continues under active (if slow) development 15 years after it was conceived. How has FreeDOS held the interest of its users? Because FreeDOS embodies the important qualities that an open source project must possess in order for it to succeed.

But what are the "core" qualities for an open source software project to get off the ground?

Start by solving a problem

In 1994, I was a physics student at the University of Wisconsin-River Falls. I used DOS quite a lot to do data analysis, write papers, dial into the university network, and write small programs to make my life easier. DOS meant a lot to me, and I was very comfortable working with DOS to get my work done.

So it was a big surprise to me when Microsoft announced that they would stop supporting DOS with the next version of Windows, that everyone would soon move to Windows. If you remember the era, this was Microsoft's first acknowledgment of the sea change coming in Windows 95, which certainly was a huge step up for Windows. But at the time, the current version was Windows 3.11, which wasn't exactly user-friendly.

I didn't like Windows. It was klunky, it was slow, it made my work much more difficult. I felt I could accomplish the same tasks in DOS, mostly at the command line, and that I could do it faster than in Windows. In Windows, everything is done by pointing and clicking with a mouse. That just slowed me down, and I felt it was a sloppy user interface to get things done.

I wasn't alone. A lot of other people on various DOS news groups were shocked to hear that DOS would soon go away. They didn't like Windows any more than I did, and were just as resistant to being "forced" into Windows. And many of these people didn't have machines capable of running Windows 3.11, much less a "next generation" version of Windows with all new features. I had an 80386 with 4MB of memory (later upgraded to 8MB). Many people still ran 80286 machines, and you can't run Windows on that, but DOS runs just fine. If Microsoft were going to push us all to Windows, we'd need to upgrade our PCs, and that didn't seem right. We felt as though our freedoms were being taken away as well, when Microsoft decided to take away MS-DOS.

So I had a problem. How could I continue to use DOS, if Microsoft was abandoning it?

On news groups, people were trying to find ways to preserve their freedom. By 1994, Linux had become an underground success story in a lot of universities. I ran Linux on a separate partition on my PC, so I knew it was solid. We looked to Linux and asked, "If they can create a free version of Unix, why couldn't we create our own version of DOS?" Writing a single-tasking DOS system seemed almost trivial next to creating a multi-tasking, multi-user Unix kernel.

So I decided someone needed to write that version of DOS. I looked to the small DOS utilities I'd already written to improve on DOS, and started there. That became my first set of FreeDOS (then, "PD-DOS") utilities: CLS, ECHO, MORE, TYPE, VER, PAUSE. I released them so that other DOS users could use them. Over time, I built them up, added new programs like our first versions of DATE, TIME, CHOICE, DEL, FIND, and Unix equivalents such as TEE and MAN.

The origins of the FreeDOS Project could just as well apply to any other open source software project. In order for the project to exist at all, there must first be a need. A developer solves a problem for himself by writing a program, then shares the program with others so they can use it too. The key is that open source software projects also make the source code available, so that its users can help to add improvements.

Users should be developers

The basic definition of open source software is that the source code must be made available for others to see. A necessary side-effect of this condition is that anyone who uses the program has an opportunity to make improvements. A well-managed open source software project will accept improvements in the form of patches, which modify the program to solve someone else's slightly different (but similar) problem. Releasing new versions of the software with the new features ensures that everyone benefits from these changes.

In the beginning, progress is usually very slow, because you may only have one or two developers making updates to the program. But as new versions are released, others become interested. The program doesn't need to be complete, but it does need to demonstrate that it can do something, that it has the potential to be useful. Then the new users may help add to the code, so the program gets even better. The updated releases generate even more interest, which attracts more users and developers. Repeat as necessary, and even a complex system can become achievable.

Take, for example, the FreeDOS Project's kernel effort. In 1988, Pat Villani started an experiment in writing a bare-bones DOS kernel that could support Pat's embedded device programming. This kernel was Pat's solution to a particular problem: how to re-use code on different platforms without having to re-write very low-level code each time. The DOS kernel changed very slowly over time, because it fit a very narrow set of requirements.

By 1994, Pat realized that others might be able to use the minimal DOS kernel he had written. At the same time, I posted on DOS news groups that I had released the first versions of my basic DOS utilities, and was in search of developers interested in writing a DOS kernel. Pat and I got in touch with each other (via the nice folks at Linux DOSEmu) and Pat's "DOS-C" kernel became the basis of the FreeDOS kernel.

When the source code to DOS-C was made available, the kernel did not support LBA, CD-ROM drivers, or networking. Additionally, floppy disk access was very slow.

But we released this as the FreeDOS kernel anyway. Despite lacking features, despite being slow, we had something that worked. Other developers became interested in what we had produced, and immediately began contributing updates. A developer named "ror4" provided a floppy driver that enabled buffered I/O, dramatically improving performance. James Tabor added networking and support for CD-ROM drives, later improved by Bart Oldeman and Tom Ehlert. Brian Reifsnyder provided LBA support.

As a result of opening the source code to its users, FreeDOS encouraged its users to be co-developers. Compared to Pat originally working on his own and progressing slowly (the "cathedral" method), we had a mix of differing developers and approaches that created a coherent and stable system very rapidly (the "bazaar" method).

Release early, release often

When many developers are involved in an open source software project, many patches can be produced in a fairly short time window. It is important to maintain a constant feedback loop to the users, who are also the developers of the program.

As developers submit new patches to a project, it is important to package up the changes in a new release. This can sometimes be a frightening and daunting task, because the original creator of the program may begin to feel that he or she is losing control of the program. Rather, I encourage the viewpoint that the program is "evolving" beyond the goals originally set for it, and it is important to recognize new contributions as good.

The importance of making new releases is that the users/developers will be rewarded with frequent (possibly daily, if the rate of patches supports it) releases with new features. Yet, this can often result in an unstable release, especially at the beginning of a project when not everyone understands the code, and how changing one part can lead to unexpected behavior somewhere else. But over time, most open source software projects stabilize, so that as new versions are released, the program gets better and becomes more stable.

The frequency with which you release new versions will often depend on the size of the project. A small library such as FreeDOS Cats (an implementation of the Unix "catgets" function, which provides support for different spoken languages) might be released quite often. Sometimes, I released more than one version of Cats in a day as users sent me patches, and I released improvements on their fixes. A basic utility such as MORE or TYPE might be modified only in spurts, such as when I added support for Cats, but otherwise remain static. Software with a larger code base, such as the FreeDOS kernel, might take weeks to accumulate enough changes for a new release.

Projects need a coordinator or maintainer

Looking at the relative chaos of open source software development, with new versions released weeks or days apart, you may wonder what holds everything together. How do open source software projects not devolve into self-destruction? Someone needs to coordinate the changes that users contribute to a program. Someone needs to make the new releases. That person is the project maintainer.

An open source software project's maintainer (sometimes called a "coordinator", especially if the program has lots of people contributing to it) should have good communication skills. This person will be responsible for many things, including accepting and merging patches into the program's source code, helping to write documentation, listening to what users are saying about the program and finding ways to accommodate them.

But perhaps the skill that the project's maintainer will find most useful is the ability to listen. The maintainer must recognize that no single person will have all the correct answers all of the time. Insight may come from different directions, and it is the maintainer's responsibility to understand that many people working together on a project ("bazaar") are better than a single individual, no matter how talented ("cathedral").

When I founded the FreeDOS Project, I came into it with the naive view that most of my time would be spent writing code for FreeDOS, and that only a little of my time would be spent doing "housekeeping". My first contribution to FreeDOS was the basic utilities, followed by some kernel updates and the Install program. I thought it would always be like that.

In the early days, this was great. However, as the FreeDOS Project grew, I found my time shifted. I spent less time working on code, and more time answering questions and writing documentation. As more developers joined the project, and the FreeDOS distribution slowly worked its way to "1.0", more than 90% of my time was dedicated to coordinating various efforts across FreeDOS, less than 10% writing code.

After the release of "1.0" in 2006, I became completely hands-off. I no longer submitted patches to programs, I no longer wrote code for my own programs. FreeDOS had grown to the point where I no longer needed to be the expert. Others were pushing FreeDOS to do more things than I had ever dreamed possible in 1994, and I was glad to see it happen.

Each maintainer must similarly find his or her own motivation, and recognize that reasons for staying in a project may change. And that's okay.
In my next post, I'll discuss the organization of an open source software project, and a few of the features a project needs to support itself over the long run.
