Tuesday Apr 05, 2011

New Post

I am just doing a quick post to my blog.

Tuesday Apr 12, 2005

Open source licenses, IP, and CDDL

Although I've been trying to stay out of the open source license controversy (being generally more interested in code and developers than in legal nuances and ideology), a few recent statements have prompted me to correct some misconceptions. I'm not a lawyer, but as one of the drafters of the CDDL I can at least comment on what the license says, and on our intentions in creating it. In an opinion piece in eWeek, Steven Vaughan-Nichols states that:

The CDDL model [...] puts IP control under the company - in the CDDL's case, the company is Sun.

Now, if you read through the text of the CDDL, you'll note that Sun is mentioned exactly once, in section 4.1:

Sun Microsystems, Inc. is the initial license steward and may publish revised and/or new versions of this License from time to time.

That means that Sun is allowed to create new versions of the license, and others cannot (without renaming it). This is simply to avoid confusion - we don't want 3 different "CDDL version 1.1" licenses out there. The GPL has a similar provision, with the FSF allowed to revise the license. The stewardship provision in no way gives Sun control over intellectual property released under this license. In fact, Sun may not even be a licensor or licensee - the CDDL is reusable, and can be used by parties other than Sun, just as the GPL can be used for non-FSF projects.

It's important to note that someone licensing code under CDDL can specify a particular version of the license. For example, I can release the code for the Xyzzy tool under CDDL version 1.0 only. If I do that, then any subsequent versions of the license will have no effect on the terms under which the code was released. This is significant in contrast with licenses like the MPL and CPL, which allow the license steward to revise the license and automatically allow licensees to choose between the original license and the revised license, regardless of the wishes of the licensor or overall community. In addition, the CDDL allows the venue, jurisdiction, and choice of law to be configured by the licensor; this allows use by people and companies in different geographical areas.

Now, rather than the license itself, perhaps Mr. Vaughan-Nichols is referring to Sun having control over the release of OpenSolaris code under CDDL. I'll agree that, at least initially, Sun will have greater rights to the OpenSolaris code than other people. But that's true no matter what license is used. Under any license (open source or proprietary), the licensor has more rights than the licensee, since they have full rights to the code (assuming that the licensor is a full copyright holder). Even under very permissive licenses like the BSD license, the licensee is required to preserve the licensor's copyright notice, while the licensor can change or remove that notice. The licensor can even release code they "own" under multiple licenses that would otherwise be incompatible; MySQL (which is dual licensed under GPL and a GPL-incompatible proprietary license) is a notable example of this.

Does the choice of license mean that the creators of Solaris wanted to have some degree of control over what happens to the code? Yes, I think that's a fair statement - just as the creators of Linux or MySQL wanted some control over their code. In the case of OpenSolaris, we wanted to allow people to make independent changes to the code, but promote the sharing of those changes - hence the "copyleft" requirement that anyone distributing binaries must also make the corresponding source available. But we also wanted to let people extend and improve the code with code available under different licenses (whether open source or proprietary), so the requirement extends only to the files released under CDDL (not files they may be linked with). Although I realize that not everyone sees this as a good thing (since it allows open source code to be embedded within a proprietary system), it's hardly an example of Sun controlling the IP - in this respect, the CDDL is more permissive than the GPL.

I've also seen concerns about the distinction the CDDL makes between Initial Developer and Contributor. The former refers to the person or company that initially releases the code under the license, while the latter refers to others who develop and distribute modifications to the code. The language and definitions are a holdover from the MPL. If you read the license, you'll see that the grants in each case mirror each other; that is, the Initial Developer grants the same rights as other Contributors. The primary distinction is that the Initial Developer can determine the license version for the code (and whether any revisions to the license should automatically be available as an option to licensees), as well as the choice of venue and jurisdiction.

I also find it interesting that those deriding CDDL and Sun, and stating that Sun won't be able to attract a community to OpenSolaris because of the license choice, seem to be overlooking another prominent corporate-sponsored community with an MPL-flavored license: Eclipse.

Monday Feb 14, 2005

Communities and diversity

Great piece by James Governor at RedMonk about the myth of a monolithic "open source community":

... there are many open source communities with their own licensing and governance approaches, lexicons, characters, superstars. Its a carnival mash-up, a diverse cornucopia of views attitudes and styles.


So Sun wants Solaris to compete with Linux. Well that's what diversity is all about. That's where innovation comes from; competition.

Sun's own Simon Phipps also has a discussion about patents, licensing, and CDDL. He nicely captures the intent behind the licensing choice for OpenSolaris. Be skeptical if you want, but we really are sincere about wanting to build an open source community around OpenSolaris. Not to destroy Linux and the BSDs, but to join them. And, yes, to compete with them to create the best open source operating system.

Tuesday Jan 25, 2005

On opening up Solaris (redux)

Well, we're finally starting our first public steps in the OpenSolaris project, with the release of the DTrace code today (along with more information about the overall project). Note that while we hope people will be interested in the code, we don't expect a community to form around a source drop. Think of this as our good faith deposit to show that we're serious, that we're not holding the "good stuff" back, that we're sincere in our plan to release the Solaris source code and build a community around it. There's information about the license as well. We'll be following up with more shortly. And if you are interested in details about the DTrace code, check out Bryan's blog.

Saturday Nov 13, 2004

Busy next week

Attempting to make up for going quiet the past couple of months...

On the 15th, I'm going to be at the Solaris 10 launch in San Jose, talking about zones (aka Solaris Containers) and general Solaris 10 stuff. A number of other Solaris engineers (and bloggers) will be there as well, talking about new features in S10 and doing demos for anyone interested. On the 16th, I'll be talking at ApacheCon about Solaris, open source, and operating systems futures (trying not to sound too pompous). The rest of the week I'll be at the LISA conference. We'll have a booth running Wednesday and Thursday (though I won't make it there until Thursday), a BOF session Thursday night, and Dan will be presenting a paper we wrote on zones on Friday. If you're attending any of these, look me up.

Friday Nov 12, 2004

Open development models

I've been thinking for some time about different models for how the initial developer of a technology (such as a company that has previously developed the technology under a proprietary model) can interact with an open source community. There seem to be a number of alternatives:

  1. Internal development: the initial developer makes source available for new versions, but does not significantly encourage community development. Users can send suggested changes back as part of bug reports, but have no real involvement with the actual development of the code. Any external site is usually focused on users rather than developers. This is really closed development with an open source code base.

  2. Community sponsorship: the initial developer pushes out source updates periodically, but sponsors a community site for open development on that source base. Community originated changes may be pulled back into mainline code base on a case-by-case basis, but in general the community site acts as a separate, independent, branch of the source.

  3. Initial-developer-led development: the initial developer leads an inclusive development effort, including participating in that effort in an open and transparent fashion. The initial developer helps establish "ground rules" for the community, but encourages participation (including in decision making) by others.

  4. Community-led development: the initial developer either gives up involvement entirely or participates as simply one of many developers, without a significant leadership role.

Although code is available as open source in each of these options, they represent a wide variation in terms of who can participate in development, and how such development is managed. In the case of options 1 and 2, internal development processes by the initial developer are essentially unchanged, and external participation is limited. Option 3 involves merging internal and external development processes, balancing between the goals of the initial developer and the requirements of external development. Finally, option 4 adopts external development processes without concern for the processes or goals of the initial developer.

I think each of these can work given different goals and priorities, but option 3 seems to be the only one that really represents collaborative development between the initial developer and a wider community. Thus, although in some ways this is likely to be the most difficult path (since it represents a balance between different goals and viewpoints), it can also be the most valuable for all concerned.

I'll talk later about the issue of extending and adapting development processes to work with a larger community.

Tuesday Aug 03, 2004

Door API details

I keep answering this question (or variations) in email, so I thought it might have wider interest. Plus this way I can point to the blog entry rather than repeating myself endlessly. One of the things I've worked on in the past is Solaris Doors. Doors are an inter-process communication mechanism with an RPC-like client/server interface. They differ from "standard" RPC by being (a) fast, (b) relatively simple, and (c) restricted to a single system. In addition, there are some features (particularly the ability to pass door references, and the unreferenced notification) that lend themselves well to implementing complicated distributed system semantics (in fact, the Sun Cluster 3.x product uses a CORBA-style ORB for inter-process and inter-node communication, part of which is implemented using doors). Doors are used fairly extensively within Solaris daemons and other system-level software that is shipped as part of the OS.

A door is created when a process (known as the door server) calls door_create(3DOOR) with a server function and gets a file descriptor back. That descriptor then can be passed to other processes or attached to the file system using fattach(3C). Once another process (the door client) has the descriptor, it can "invoke" the door by calling door_call(3DOOR). The client can also pass data and descriptors (including other door descriptors). As a result of the call to door_call, the client thread blocks and a thread in the door server wakes up and starts running the server function. When the server function is complete, it calls door_return(3DOOR) to pass (optional) data and descriptors back to the client. door_return also switches control back to the client; the server thread blocks in the kernel and never returns from the door_return call.
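As a rough sketch of the sequence above (not code taken from Solaris itself), the server and client sides might look something like the following. The names shout_request, shout_server, shout_serve, and shout_call are hypothetical and exist only for this illustration, as do the 256-byte reply buffer and the 0644 file mode. The door calls are wrapped in #ifdef __sun because they exist only on Solaris; the request handler is plain C.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical request handler (not from the post): upper-cases up to
 * out_cap bytes of the request into out and returns the byte count.
 */
size_t
shout_request(const char *req, size_t req_len, char *out, size_t out_cap)
{
    size_t n, i;

    assert(out != NULL || out_cap == 0);
    n = (req == NULL) ? 0 : (req_len < out_cap ? req_len : out_cap);
    for (i = 0; i < n; i++)
        out[i] = (char)toupper((unsigned char)req[i]);
    return n;
}

#ifdef __sun    /* the door calls themselves exist only on Solaris */
#include <door.h>
#include <fcntl.h>
#include <stropts.h>
#include <unistd.h>

/* Server function: runs on a server thread for each door_call(). */
static void
shout_server(void *cookie, char *argp, size_t arg_size,
    door_desc_t *dp, uint_t n_desc)
{
    char reply[256];    /* stack buffer, so nothing to free afterwards */
    size_t n = shout_request(argp, arg_size, reply, sizeof (reply));

    (void) door_return(reply, n, NULL, 0);  /* returns only on error */
}

/* Create the door and attach it to a (possibly new) file at path. */
int
shout_serve(const char *path)
{
    int door = door_create(shout_server, NULL, 0);
    int fd;

    if (door < 0)
        return -1;
    fd = open(path, O_CREAT | O_RDWR, 0644);  /* fattach needs a file */
    if (fd < 0)
        return -1;
    (void) close(fd);
    return (fattach(door, path) == 0 ? door : -1);
}

/* Client side: open the attached path and invoke the door. */
int
shout_call(const char *path, const char *msg, char *rbuf, size_t rsize)
{
    door_arg_t arg;
    int door = open(path, O_RDONLY);
    int rc;

    if (door < 0)
        return -1;
    (void) memset(&arg, 0, sizeof (arg));
    arg.data_ptr = (char *)msg;
    arg.data_size = strlen(msg) + 1;
    arg.rbuf = rbuf;            /* the reply lands here if it fits */
    arg.rsize = rsize;
    rc = door_call(door, &arg); /* blocks until the server returns */
    (void) close(door);
    return rc;
}
#endif
```

In a real server, the process that calls shout_serve would then need to stay alive (for example by calling pause(2)) so that its threads can service incoming calls.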

This leads to a problem: if I allocate data to return to the client via door_return, how do I free it? I can't free it before calling door_return, obviously, and control never returns to me after calling door_return (unless there's an error), so I can't just free it after the call. There are a few ways to handle this (in increasing order of complexity):

  • Copy the data to the stack. On each door call, the server thread's stack is "rewound" to the base. This implicitly frees any data on the stack, so any data that needs to be returned to the client can first be copied onto the stack (using a local variable or alloca(3C)). The original allocation can then be freed before calling door_return with a pointer to the stack copy.

  • Use thread-specific data. When a server thread starts running the server function, we know that any data previously returned by a call to door_return from the same thread has already been copied into the client's address space. This means you can use thread-specific data to track previously returned data; for example, the server function could check for and free any per-thread data stored by prior door calls before continuing to execute. Note that, if the server thread is never re-used, the data will remain allocated. Other threads can't free this data, since they have no way to make sure it has been copied back to the client.

  • Use a door reference. When returning data that needs to be freed, create a door with the DOOR_UNREF flag and associate the door's unique ID (see door_info(3DOOR)) with the data in a hash table. Then, pass the door back to the client. The client should call close(2) on the door as soon as it receives it; this will send an unreferenced notification to the server (as long as the server still has one reference, since the notification happens when the reference count goes from 2 to 1). An unreferenced notification is just like a normal door call from the server's point of view, except that the data pointer is set to a special value (see the door_create(3DOOR) man page for details). When the unreferenced notification happens, the server can look up the unique ID in the hash table and free the referenced data.
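To make the thread-specific data option a little more concrete, here is a minimal sketch. It assumes a C11 compiler and uses a _Thread_local variable as the per-thread storage (pthread_getspecific(3C) would do the same job); the helper name reply_dup and the variable prev_reply are hypothetical, made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reply buffer this thread handed to door_return() on its previous call. */
static _Thread_local char *prev_reply;

/*
 * Hypothetical helper: free the reply left over from this thread's
 * previous door call (it has already been copied to that client),
 * then allocate a copy of text and remember it for next time.
 * Returns NULL if the allocation fails.
 */
char *
reply_dup(const char *text, size_t *lenp)
{
    size_t len;
    char *buf;

    assert(text != NULL && lenp != NULL);
    free(prev_reply);       /* free(NULL) is a no-op on the first call */
    prev_reply = NULL;

    len = strlen(text) + 1;
    buf = malloc(len);
    if (buf == NULL)
        return NULL;
    (void) memcpy(buf, text, len);
    prev_reply = buf;       /* freed at the start of the next call */
    *lenp = len;
    return buf;
}

/*
 * Inside a door server function the helper would be used like this:
 *
 *     size_t len;
 *     char *reply = reply_dup(result, &len);
 *     (void) door_return(reply, reply != NULL ? len : 0, NULL, 0);
 */
```

As noted in the list, the last buffer stays allocated if a given server thread never runs the server function again.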

I've also considered extending the doors API to include something like a door_reply() function, which could be used (optionally) to specify reply data without losing execution control. On return from door_reply, the reply data will have been copied back to the client (or into the kernel), and the server can free the data from its address space. The control transfer back to the client would happen with a subsequent door_return() call (the arguments of which would be ignored). This is a bit slower than the standard door_return semantics (since two trips into the kernel are required), but makes freeing reply data and other server-side cleanup much simpler. Unfortunately, I haven't had time to actually implement this, or convince someone else to do it.

For those wishing more information on doors (particularly if the above didn't make any sense), there's a good introductory chapter in the second edition of Unix Network Programming, Volume 2: Interprocess Communication by the late Richard Stevens. The original idea for doors came from Spring OS, a research operating system developed in Sun Labs. The details were changed significantly in the transition to Solaris. There is also a Linux implementation based on the Solaris API, though it isn't part of the standard kernel.

Wednesday Jul 28, 2004


Sorry about the quiet period; I was on vacation for a couple of weeks and then busy with my day job. I'm now attending the O'Reilly Open Source Conference in Portland, OR. Some other Solaris kernel engineers (Bart, Adam, perhaps others) and I are going to be hosting a BOF Thursday night to talk about what's in Solaris 10 and to discuss the plans to open source Solaris. If you're also at the conference and are interested, the details are here.

Saturday Jun 26, 2004

Comments and more

Thanks for all the comments. We're working on fixing the formatting issue - apparently we can enable "autoformat", which converts line feeds to HTML line breaks, but it needs to be set site-wide. If you have long comments, you can always send them by email to first.last@sun.com (replacing "first" and "last" with my name, obviously). Be sure to include whether I can post your comments or should keep them private, and whether I should use your name if I post them.

Alan Hargreaves suggests creating a BigAdmin forum for open source discussions, rather than using blogs. I think the general idea of a discussion forum is a good one, but BigAdmin doesn't really seem like the right place; it's more for technical discussions about specific technologies. In any case, lacking a good alternative, I'd like to use this blog as a medium for communicating (both ways) outside of "official" channels. Obviously, when the open source project goes live (perhaps earlier) we'll have discussion forums, mailing lists, etc. (IRC, anyone?)

Alan also points out an article about open source Solaris in Fortune, discussing a letter the author received from Jonathan Schwartz. The text of the letter is included, and may be of interest to those wondering why we think this is a good idea for Sun, as opposed to the community.

Friday Jun 18, 2004

On opening up Solaris

One of the projects I've been working on lately is figuring out how we're going to make the Solaris code available as open source, and create an open development model around it allowing (and encouraging) contributors from outside the company. Some of you may have heard that Jonathan Schwartz (Sun's COO) recently announced that we're going to be doing this. We've actually been working on it for quite a while, but the public announcement has certainly increased the pressure (both internal and external).

There's been a lot of speculation about why we're doing this, whether we're out to "attack" Linux or whatever. From where I sit, this isn't at all what we're trying to do. We've been working on Solaris for a number of years, and are proud of what we've accomplished. We'd like to make it easier for more people to use it, and to help us improve it. We see open source as a way to enable that. If you prefer Linux, that's fine; I'm a firm believer in diversity and choice. In the end, diversity helps drive innovation, which helps the end user (and keeps me employed).

As you might expect, working on this involves lots of time spent meeting with lawyers about licenses and such. Obviously we have to worry about the legal stuff, but I'm also interested in hearing from other people outside the company about what you think we should do. Clearly we'll need to release the code under an open source (i.e., OSI approved) license, but beyond that, what do you think are the requirements? What about governance models? Are there any examples that you think work particularly well, or not so well?

Wednesday Jun 16, 2004

Joining the fray

I'm a Distinguished Engineer in the Solaris kernel development group, and have been working on various parts of Solaris for the past 10 years. Most recently, I was part of the team that developed Solaris Zones (aka "N1 Grid Containers"). This is a new feature available in Solaris 10 (available for download via Solaris Express) that lets you divide up a system into different application environments, where each environment is isolated from the rest. For more details, see the BigAdmin page where we've been posting information, or the work-in-progress paper we presented at the recent Usenix Virtual Machine conference. We're also working on a paper to appear at the upcoming Large Installation System Administration (LISA) conference.



