Wednesday Jan 27, 2010

Cross of Lorraine

Wednesday Mar 12, 2008

OGB endorsements

A few people have asked how I'm voting in this year's elections. Here, then, are my endorsements:

OGB:

  • John Sonnenschein
  • Brian Gupta
  • Joerg Schilling
  • Ben Rockwood
  • Stephen Lau
  • Al Hopper

Questions:

  1. No
  2. No

Sunday Mar 11, 2007

C-Teams and the ARC as Community Groups

Rich Lowe asked an excellent question about OpenSolaris government in response to one of Casper Dik's answers to the DTrace Community. Here's the question, and my answer. As always, the other candidates' responses are available in the above-referenced thread.

And what would be your (the general "you", not just Casper) plans to help make the ARC and especially the C-team more practically part of OpenSolaris process, rather than a part of Sun process we're exposed to from one side, but not, so far, fully involved with?

Of the two, the ARC is much more difficult to rationalise; I'll explain why below. As for the C-teams and the more general problem of consolidation management, I'll let this text from my position paper[1] do the answering:

One of the OGB's most important tasks will be to rationalise the Community Group structure into one which will allow meaningful self-government. The centerpiece of my plan for doing this is construction of Consolidation-Sponsoring Community Groups (CSCGs). Each of these groups will be given control over an existing consolidation. This structure is not unlike that which exists today in the misnamed Nevada Community, representing ON. But that Community does not govern openly, and other consolidations are entirely missing structure under which they can be governed legitimately. Since the Constitution provides for the Community Group as the unit of independent government, each consolidation requires one to oversee its progress. The CSCGs will be responsible both for controlling the content of their codebases and for providing guidance and leadership to project teams desirous of integration. They will be required to adopt a set of rules (harmonised but not necessarily identical across all CSCGs) for integration and apply these rules fairly.

The challenge associated with the ARC (or ARCs) is that it maps poorly onto the Community Group structure. It makes little sense to me that an Architecture Community Group would sit alongside, say, an Observability Community Group. Observability incorporates a number of subsystems in the OS which in turn need to be properly integrated into each project. So would Reliability, or Virtualization. Architecture is not another such feature set but rather the way in which all those features, along with the new ones offered by the project, fit together and expose themselves to other consumers. That is, Architecture is both a superset of and yet entirely disjoint from all other CGs' areas of interest. The practical effect is substantial overlap: we would expect each CG to offer project teams advice concerning how best to integrate their work with existing features (and, for projects directly related to the CG's area of expertise, what features it should offer to others). In some ways, however, this directly conflicts with the mandate of an Architecture CG, which is to provide architecture guidance to all project teams. In the current system, an observability expert cannot override the ARC's decisions with respect to a proposed observability project. Yet under the Constitution, the Observability CG is supposed to be self-governing. The defining question is what exactly the latter CG is expected to govern, and by what mechanism - the very question the Constitution so conspicuously fails to answer.

It's easy enough in my CSCG model to simply require that all CSCGs adopt rules requiring architectural review by a particular CG just as they should require other CGs with expertise in relevant areas to review and perhaps approve each project prior to integration. Indeed, this is not unlike the system that exists today. The CSCGs do indeed have complete control over their areas of responsibility, namely, the existing consolidations. But this leaves all other CGs less equal, their endorsements subject to veto and without any code of their own to govern. A logical conclusion one could reach on this line of thinking is that CSCGs and perhaps the ARC should be the *only* CGs. The reality on the ground thus maps poorly to the Constitution we've been given, suggesting that the Framers either did not consider this matter in sufficient detail or intended much more radical changes in either the structure of consolidations, global review processes, or both. Mr. Fielding in fact hinted at just such an intent[2]:

We don't need to enshrine one committee's view of how C-Teams operate in an organization-wide constitution because C-Teams simply aren't relevant to *every* activity at OpenSolaris, and the vast majority of comments we have received so far clearly indicate that the existing consolidation boundaries are arbitrary AND dysfunctional. Personally, I am hoping that the communities feel empowered to change the things that are obviously causing them harm right now, and let the consensus process ensure that the traditions are adequately promoted and maintained over time.

Presumably Mr. Fielding and perhaps others have some grand detailed view of how all these things should be made to fit together in the rather obvious presence of existing bodies of code with no associated governing units and vice versa. Unfortunately, they've not seen fit to share that view nor to stand for election themselves. If consensus does not emerge within a few months as to an appropriate way to map the (possibly modified) existing practical devices of government onto the new constitutional structures, I'll probably favour amending the constitution rather than spinning our wheels forever trying to shoehorn OpenSolaris into a framework that may well be inappropriate to our broader goals.

At some point I'd like to hear Mr. Plocher and others more intimately involved with the operation of the ARC Community express their views on how that Community could be made to fit into the new Constitutional world of governing CGs. Their testimony will be needed before the OGB as it considers how best to restructure the Communities into meaningfully self-governing units.

  1. http://blogs.sun.com/wesolows/entry/ogb_election
  2. http://www.opensolaris.org/jive/message.jspa?messageID=99494#99494

Friday Mar 09, 2007

DTrace Community OGB Questionnaire

Leaders of the DTrace Community had a number of questions for the OGB candidates. Here's a copy of the questions and my answers. You can also see the other candidates' responses in the DTrace mailing list archives.

  • DTrace is one of only a small handful of OpenSolaris technologies that has actually been incorporated into other operating systems. Thus, your position on dual-licensing is very important to us; what is your position on dual-licensing in general?

    As I noted in my position paper: (a) the OGB does not control licensing, and (b) to the extent that the OGB would be consulted on the matter, I'm opposed to dual-licensing.

    The well-known opportunity it offers for license-based forks is a significant drawback that would have to be outweighed by compelling benefits. No one has yet articulated such benefits, and I have found no evidence myself that they exist. The advantages presented by proponents of such a licensing scheme appear to be predicated on the idea that the second license would be GPLv3 (which is not yet complete), and that its use would dramatically increase the size of our community by drawing in the FSF as a partner in our technical work. Those are two large 'ifs' for a 'maybe' we're poorly positioned to handle.

  • Do you agree with the conclusions and decrees of CAB/OGB Position Paper # 20070207?

    Generally, yes. See above.

  • The OGB is responsible for the representation of OpenSolaris to third parties. If a third party were to inquire about incorporating DTrace into a GPL'd Program, what would be your response or position?

    I would note that my lay reading of the GPL is that it would preclude that party from distributing the resulting product without violating the terms of that license. I would also advise that party to seek legal counsel, as with any licensing concern. That's as far as I'd go, however; the OGB does not hold the copyright to DTrace and is not in a position to warn or litigate against infringers.

  • DTrace is currently a Community Group, but some could argue that it would make more sense for DTrace to be a Project in (say) the Observability Community Group. In your mind, what is (or should be) the difference between a Community Group and a Project -- and where should DTrace fall?

    These two questions are not necessarily well-linked. The difference between a Project and a Community Group is straightforward. A Project owns one or more gates and does direct technical work with the intent to add or improve a specific aspect of the software they contain. A Community Group is the unit of independent government as defined by the Constitution; it is responsible for directing and guiding Project teams and others doing work that affects a broadly-defined set of interests.

    Others have suggested that a Project by definition has a limited life span (presumably terminating upon integration into a consolidation). I disagree with this definition - a project (like DTrace) which provides a large and useful set of functionality will never be fully complete unless and until it is replaced wholesale. So long as the Project's work remains in use, it is important that some collaborative unit exist to provide a home for those using and improving it.

    DTrace is unquestionably a Project. Whether it deserves a Community Group of its own[0] depends on the granularity at which we wish to distinguish among Community Groups and the amount of overlap among them. That is, if Observability is held to be a Community Group distinguished from others at the correct granularity, DTrace should not be a separate CG, as its function would be a strict subset of another valid CG's. Instead, the DTrace leadership would be expected to participate in the Observability Group's activities, offering guidance and advocacy for consumers of its work. As part of that transition, mutually acceptable agreements regarding contributorship grants and leadership structure would need to be in place regarding the merged community (much like any corporate merger). Alternately, however, I could envision a finer-grained set of Community Groups with some overlap; DTrace might fit alongside, for example, a Debuggers Community Group in such a scheme. My personal preference is for a smaller number of larger Community Groups, some of them controlling the long-term maintenance of consolidations and others providing guidance to project teams (and the consolidation owners) based on their particular areas of technical expertise. I believe this would promote a vision of our software as an integrated whole. Just as importantly, even under such a system, any large and ambitious Project would fall inside the scope of several Community Groups' areas of interest. Expecting project teams to interact with dozens of Groups' leaders would seem to introduce excessive and unnecessary complexity.

    [0] The existence of DTrace as a Project ought not preclude the existence of other Projects which seek to enhance it.

  • The Draft Constitution says next-to-nothing about where the authority lies to make or accept changes to OpenSolaris -- only that Projects operate at the behest of Community Groups, and that Community Groups can be "terminated" by the OGB. In your opinion, where does or should this authority lie? And do you believe that the Constitution should or should not make this explicit? Finally, under what grounds do you believe that a Community Group should be "terminated"?

    As I noted in my position paper, I believe the authority for code acceptance should reside with Community Groups responsible for the targeted consolidation. Those CGs would be expected to delegate some or all of that authority in turn to specific individuals forming the C-Team for a particular release. While some minor changes will be needed to this strategy to accommodate open development, the basic process has worked well for a long time, and I see little reason to alter it radically.

    As I've noted in several messages, I would prefer that the Constitution had made at least some of this more explicit. The absence of this specification leaves the OGB with a set of illegitimate Sun entities exercising effective control over matters the Charter clearly leaves to the OGB, and offers no transition plan, timetable, or framework in which to take over these functions. This will present an additional challenge to the first elected OGB.

    Community Groups formed under a coherent and comprehensive strategy such as the one I hope the OGB will provide should generally be terminated only for inactivity or another clearly self-induced act of dissolution (such as a voluntary merger with another Community Group, approved by the OGB). Unfortunately, we also have a large number of existing Communities which do not fit well within any strategy one could retroactively imagine, and the OGB will be obligated to rationalise this situation. The process of doing so will likely involve terminating a number of these Communities and/or merging them with other Communities to form strategically valuable Community Groups. In the process, it is not unrealistic to suppose that some Communities may be terminated without the consent of their leaders. The OGB should seek to offer reasonable accommodation to the leaders of such Communities and work with them to find acceptable solutions that fit the strategic plan. My hope and expectation is that events of this type would occur very rarely after the initial realignment.

  • The Draft Constitution says that Community Groups (and in particular, the Community Groups' Facilitators) are responsible for "communicating the Community Group's status to the OGB"; what does this mean to you?

    My understanding was that the Working Group introduced the position of Facilitator for the purpose of maintaining a single first-line point of contact for each Community Group. The OGB should expect each Community Group to provide its membership list as required by the Constitution on a regular basis, and to propose desired changes in structure or termination (if any). Beyond that, I believe this requirement has little meaning to the OpenSolaris community; it seems to make more sense in the context of an Apache-like organisation in which many completely disjoint software engineering efforts are undertaken simultaneously by likewise disjoint groups overseen by the Foundation. Since the OGB is not responsible for technical decisions, it makes little sense to expect Community Groups to provide detailed information about the work they oversee in the absence of a specific conflict or other matter requiring the OGB's attention. In short, it makes no sense to sample data which you cannot usefully consume.

  • According to the Draft Constitution, "nominations to the office of Facilitator shall be made by the Core Contributors of the Community Group, but the OGB shall not be limited in their appointment to those nominated." Under what conditions do you believe that the selection of a Facilitator would or could fall outside of the nominations made by a Community Group's Core Contributors?

    The only example I can imagine is one in which the designated Facilitator has a proven history of unreliability or deception. It seems unlikely that such an individual would be nominated by a responsible Community Group, so in practice I doubt this clause will ever be exercised.

  • According to the Draft Constitution, "non-public discussion related to the Community Group, such as in-person meetings or private communication, shall not be considered part of the Community Group activities unless or until a record of such discussion is made available via the normal meeting mechanism." In your opinion, in the context of a Community Group like DTrace -- where a majority of the Core Contributors spend eight to ten hours together every work day -- what does this mean? Specifically, what does it mean to be (or not to be) "considered part of the Community Group activities"? And in your opinion, what role does the OGB have in auditing a Community Group's activities?

    I choose to interpret this as a Blue Sky provision, requiring that important decisions be undertaken in public with the opportunity to participate for all those whose input might be considered useful. Since the Constitution provides no definition of "Community Group activities" other than voting, by implication this works in the same way as similar provisions in Municipal charters.

    In the context of the DTrace Community Group, I take it to mean that matters which require a Community Group to vote must be presented on a public list with reasonable opportunity for comment before such a vote is taken.

    Outside of bootstrapping activities around organising and rationalising Community Groups, I see little proactive role for the OGB in auditing CG activities. The OGB should generally handle only conflicts which cannot be resolved within one or more CGs, and then only when requested by a party to the conflict. The Constitution does preclude the OGB from interfering with a CG's internal governance.

  • Historically, binary compatibility has been very important to Solaris, having been viewed as a constraint on the evolution of technology. However, some believe that OpenSolaris should not have such constraints, and should be free to disregard binary compatibility. What is your opinion?

    Those people are wrong. Binary compatibility is a great strength, one which can in nearly all cases be preserved without retarding progress. To the extent that binary compatibility requires deeper thought on the part of engineers, it also directly enhances the quality of new work. Solaris customers praise and appreciate this engineering philosophy and the results it offers them; we should offer the same benefits to customers of other distributions as well by maintaining compatibility and architectural consistency within all recognised consolidations. Naturally, consumers of OpenSolaris are free to incorporate the technology into their own products in whatever manner they choose, including the introduction of changes that violate these constraints. Such activities are outside the scope of the OGB to regulate.

  • If a third-party were to use and modify DTrace in a non-CDDL'd system, whose responsibility is it to assure that those modifications are made public? To put it bluntly: is enforcing the CDDL an OGB issue?

    The answer to the first question is "No one." Neither use nor modification triggers the requirement that those modifications be distributed in source form (and additions, in particular, need not be distributed at all). Only distribution triggers this requirement, and it is extended only to those to whom binaries are provided. If such a party did distribute the binaries containing DTrace, it is that party's responsibility to ensure its own compliance with the license terms.

    Enforcement of the CDDL is not an OGB consideration. The OGB does not hold any copyrights and has not issued any licenses. If the OGB is notified of a license violation, it should (as a group of good citizens) pass the information along to the copyright holder, if his/her/its identity is known. For much of the code in OpenSolaris including DTrace, that copyright holder is Sun Microsystems, Inc. Further action is at the discretion of the copyright holder.

    It may well be within the scope of the OGB's activities to help educate contributors about the terms of the CDDL, but such a campaign would require the OGB to obtain legal counsel.

  • Do you have an opinion on the patentability of software? In particular, what is the role of the OGB -- if any -- if Sun were to initiate legal proceedings to protect a part of its software patent portfolio that is represented in OpenSolaris?

    The OGB does not own software patents (or any other property), and I have no position on the patentability of software in general. Sun has the right to enforce its property rights under the laws of the countries in which it operates, and the OGB has no authority to interfere with that enforcement. Since community members who adhere to the terms of the licenses offered for OpenSolaris have limited (but adequate for all uses permitted under the CDDL) licenses to patents represented within that body of code, there is no reason for the OGB to worry about this. If such an event were to occur, the OGB might profitably offer a simple statement to this effect, clarifying the facts of the situation and denying incorrect rumours. Whether such an action would be necessary or appropriate would depend on the specific circumstances.

  • When you give public presentations, do you run OpenSolaris on your laptop? Have you ever given a public demonstration of OpenSolaris technology?

    Yes, I use OpenSolaris exclusively with the exception of interoperability testing. Yes, I have demonstrated new technology in Solaris 10 (now in OpenSolaris as well) at OSCON in 2004 and 2005, and the early OpenSolaris build system technology at SVOSUG in 2005.

  • And an extra credit question: Have you ever used DTrace? When did you most recently use it, and why? The answers "just now" and "to answer this question" will be awarded no points. ;)

    Yes, I've used DTrace. I most recently used it earlier this week while diagnosing the behaviour of two machines in an HA cluster. I've also written a (never-integrated) System V IPC provider for OpenSolaris and introduced USDT probes to enhance the observability of several aspects of daemon behaviour.

Thursday Mar 08, 2007

OGB Election

OpenSolaris Governing Board elections begin next week. In addition, a single question will be presented to the voters: Shall the proposed Constitution be ratified? Please take the time to read this important document and learn about the issues being debated by the candidates. As a candidate for an OGB seat, I can help you right here and now with the latter task. I'd appreciate five minutes of your time to learn where I stand on some of these issues. I welcome questions; you can send mail to all candidates to ask your questions. I'll be posting here my answers to any questions I receive in this fashion.

  • The Constitution

    VOTE YES

    I've pointed out a number of issues with the Constitution (see the 'constitutional limitations' thread) and continue to believe that the proposal as written positions us poorly to achieve independence from Sun, accomplish useful technical work, and provide leadership. Nevertheless, the alternative (last paragraph) is unlikely to be any better, thanks to some unfortunate decisions made by Sun. Therefore I support ratification and urge you to vote in favour.

  • Community Structure

    We need a FULL-SCALE OVERHAUL

    One of the biggest gaps in the Constitution is how the existing codebases are to be managed, controlled, and led. Indeed, the document does not even acknowledge their existence, despite the fact that they are the primary purpose for and value in OpenSolaris's existence. One of the OGB's most important tasks will be to rationalise the Community Group structure into one which will allow meaningful self-government. The centerpiece of my plan for doing this is construction of Consolidation-Sponsoring Community Groups (CSCGs). Each of these groups will be given control over an existing consolidation. This structure is not unlike that which exists today in the misnamed Nevada Community, representing ON. But that Community does not govern openly, and other consolidations are entirely missing structure under which they can be governed legitimately. Since the Constitution provides for the Community Group as the unit of independent government, each consolidation requires one to oversee its progress. The CSCGs will be responsible both for controlling the content of their codebases and for providing guidance and leadership to project teams desirous of integration. They will be required to adopt a set of rules (harmonised but not necessarily identical across all CSCGs) for integration and apply these rules fairly.

  • Projects

    MINOR CHANGES are needed here.

    The bar for project creation is very low today: if two Members believe a Project ought to exist, it does. This benefits everyone by allowing virtually unrestricted exploration of new spaces and approaches, but it also encourages duplication of effort and expenditure of effort on projects which are not positioned to be successful. I would like to see this approach altered: instead of directing project creation requests to a giant unmoderated mailing list (see more on this below), I would prefer to see them directed to one or more Community Groups, including (when relevant) the CSCG to which the project is targeted for integration. During a one-week initial review period, members of those Community Groups would be expected to provide feedback on the proposed project, informing its backers of related or conflicting ongoing work, the need for inclusion of additional or alternate Community Groups in the review, and risks and opportunities the project would offer. Just as importantly, this is an opportunity for Community Groups to inform the project's backers of the actions and choices the project team would need to make in order to secure those Groups' endorsements. It is expected that, by the time a project seeks integration into a consolidation, it will have secured the endorsements of all relevant Community Groups; this process will give the project team a leg up on understanding what will be required to do so, and help them make contacts and forge working relationships within those Groups. At the end of the initial review period, the project team will be required to indicate to the OGB's project-creation delegate whether, in light of the feedback received, it wishes to proceed. This decision cannot be vetoed, but a project which fails to secure the endorsement of relevant Groups will have much more work to do later if integration is desired. It is worth noting that integration need not be a project team's goal: some projects may be worthwhile on their own, may eventually lead to the formation of new consolidations, or may be intended solely as exploratory efforts that may yield innovative work later used elsewhere. We must not discourage these teams nor should we send them elsewhere to do their work. At the same time, we should provide a framework in which project teams desiring integration can learn early what will be required and work continuously throughout the life of the project with the technical leaders of relevant Community Groups.

  • Dual- or re-licensing

    I am OPPOSED to either of these steps at this time.

    It's important to note that the OGB does not control the offered licenses to OpenSolaris source because it does not hold the copyrights. Only Sun can offer additional or alternate licenses. Therefore, this position is relevant only to the extent that Sun seeks the OGB's guidance on the matter. The arguments for and against changes to the licensing regime have been discussed at length; I will not repeat them here. I have two main observations: First, licensing changes appear to be a solution in search of a problem. No proponent of such changes has articulated clearly the problem(s) which such a change would solve. Given the risks and costs, I would expect a clear and convincing case to be made that license changes are necessary; that threshold has not been met. Second, the main benefits posited by advocates of licensing change center around an increase in the size and stature of our community. Unfortunately, we are ill-positioned for growth; our institutions and infrastructure are in dismal shape. Any large influx of contributors would lead to more complaints and flames but little additional useful work. If we desire to grow, we must first position ourselves to leverage fully our existing contributor base. Until then, a focus on growth makes no sense. Similarly, I have little concern for our 'stature' in the broader Free Software community. If the FSF or a similar organisation would like us to change our licensing to better suit their interests, or to form a partnership to deliver interesting and useful products, we should remain open to such offers if they would benefit all parties. Since no such offer has been made, and made openly, there is little reason to consider hypothetical partnerships as a key benefit of a licensing change.

  • Infrastructure

    The OGB and the Tools Community must exert leadership; BLIND RELIANCE ON SUN IS NOT THE ANSWER

    The OGB must formulate a plan with dates and milestones for opening defect tracking to community participation, establishing review, approval, and archival mechanisms for change submissions, and increasing the transparency and utility of the ARC process. The OGB must also establish rules that Community Groups will be expected to follow regarding acceptance and integration of opaquely-managed projects (namely, that non-grandfathered projects of this type must not be permitted to integrate until, at minimum, a sufficient period for public review has elapsed). Since Sun currently has a variety of tools for managing these processes, it would of course be nice if they would make those tools available to us. However, Sun's resources for doing so are limited, and in some cases the tools are poorly designed for use outside a LAN. The most important such example is the Bugtraq2/Bugster defect tracking system. Lack of access to this system is a major roadblock to open development, and Sun has not offered a plan to address this problem. The OGB must seek a firm commitment from Sun to open access to this system in an acceptable way, and must hold Sun to agreed-upon milestones in that plan. If Sun declines to offer an acceptable plan for doing so, or fails to uphold its agreement, the OGB must assist the Community Groups, notably but not exclusively Tools, in designing and constructing suitable replacements. I would like the OGB to issue a Call for Proposals for solving the defect tracking problem with a deadline of May 20. Sun is especially invited to submit a proposal. The OGB would then evaluate the proposals, giving special weight to one which would allow access to the existing body of information in Bugtraq2, and establish and monitor progress toward a chosen proposal. Other infrastructure problems (code review and archival, ACLs and Wiki-like features, RTI handling, etc.) should be handled in a similar fashion. This general framework is proving itself effective in the SCM project and we should not hesitate to use it in the future rather than expecting Sun to "do something" "someday."

  • Communication

    The OGB MUST DO MORE to improve the signal-to-noise ratio, and to communicate its own activities more clearly

    Several have complained about communication of information about the election, and with good reason. The OGB has at times communicated poorly with the other members. I would like to see the OGB use opensolaris-announce (a read-only list containing all members) more heavily to communicate information of universal interest. Correspondingly, opensolaris-discuss should never be used to convey any official information, nor to seek input or feedback from all members. Instead, the OGB should provide a set of mailing lists open to all in which topics related to governance can be discussed. When input or feedback are desired on a particular issue, the OGB should announce a Call For Discussion via opensolaris-announce, pointing interested members to the appropriate topical list. Naturally, the traffic on -announce should be kept low, but neither should we be afraid to use it when appropriate: it is a highly effective way to reach all members without requiring them to subscribe to a largely useless list with minimal signal and excessive noise. I will recommend that the OGB adopt a policy that its members not subscribe to -discuss, so as to force the board to communicate with all members on an equal footing. In short, subscription to such a high-volume, low-S/N list wastes time and resources that could be better spent working on real problems in more focused venues. The OGB should strongly encourage the use of appropriate topical, project-, or Community Group-sponsored lists for technical questions, proposals, and announcements. The general discussion list may well be reserved for flames, offtopic "water-cooler" conversation, and sophomoric hand-wringing over OpenSolaris's future. No one who does useful work should have to filter such tripe in order to keep up with important news.

  • Culture and Leadership

    A QUALITY-CENTRIC ENGINEERING CULTURE is one of our greatest assets; the OGB must encourage and strengthen that culture.

    The OGB is not intended to make technical decisions; these are to be made by Community Groups. Nevertheless, the OGB must position these Groups to enforce sound engineering philosophy, and provide them with the tools and support needed to do so. There is far too often a perception that the "movers and shakers," those who want to "cut red tape" and "just solve problems," are the community's true leaders. At times, this is indeed true. But engineering also requires a sober, cautious approach to new problems, especially those which are poorly understood. The existence of process and review is neither an accident nor red tape. Instead, these tools help us make the right decisions - decisions that will remain with us for many years. The OGB should urge, and where appropriate force, its Community Groups to keep this in mind as they evaluate proposals and requests. Expressions of enthusiasm and a can-do spirit are welcome, but should not be confused with commitment or full agreement. It can take weeks or months of work to validate or discredit a particular approach to a problem. The Community Groups which do not commit to a particular approach until that time has passed will be the most successful.

  • The role of Sun

    Sun's engineers are IMPORTANT CONTRIBUTORS but Sun Microsystems, Inc. is JUST ANOTHER DISTRIBUTOR of our technology and enjoys NO SPECIAL STANDING.

    One of the largest challenges the OGB will face is encouraging the formation of decision-making bodies that operate openly and are independent of Sun, while still ensuring that the interests of Sun and other distributors are well-served. Far too much of our activity today takes place entirely within Sun in a largely opaque fashion. For example, the Solaris PAC, an entity mentioned nowhere in the Charter or Constitution, still believes it has the authority to set integration rules for each build. And, in part because no alternate framework exists for making these decisions, SPAC in fact does - improperly - exercise this authority. The OGB is responsible for taking over these functions with respect to OpenSolaris and providing a framework in which these actions can be taken openly. None of this should be taken to imply that the OGB exercises control over Solaris (Sun's distribution); like any other distributor, Sun remains free to ship whatever products it wishes without regard for the OGB or any other action of the OpenSolaris community. But to the extent that it wishes to undertake actions which conflict with openly-established policies, it must branch or fork in order to do so. If we make our decisions properly, with input from all stakeholders and with adequate transparency, then Sun's or another distributor's choice to do so will be both healthy and desirable. It may not always be possible to meet the needs of every possible member of our community, and sometimes Sun's corporate interests may be the ones we cannot serve. For now, however, our focus must be on building credible and authoritative institutions which are independent but not ignorant of Sun.

    I should note that I work for Sun, although not for the business unit responsible for Solaris. However, I am running as an independent individual, not a representative of Sun or any other entity. I have in the past expressed skepticism and disagreement with Sun's (and Sun executives') positions on various issues of interest to our community, and I will continue to do so in the future when appropriate. The OGB is not beholden to Sun or anyone else, and its members are expected to act accordingly. Neither corporations nor corporate representatives are permitted to serve - by design. I ask for your confidence in my ability and determination to act independently for the common good.

Wednesday Feb 08, 2006

A louder voice for the fault manager

The Solaris reference implementation of the fault manager recently got a boost in its ability to report faults with the introduction of a two-part SNMP agent. This agent makes it easy to integrate the Solaris fault manager into existing SNMP-based monitoring infrastructure.

Background

The fault manager has always been able to report faults to the system log and console(s), and to provide a wealth of status information via fmadm(1M) and fmdump(1M). But these reporting mechanisms leave much to be desired; syslog messages must be parsed, and a busy central log host can easily lose important messages in the noise. Worse still, a privileged user must log into the affected system and run administrative commands to get information they need that isn't contained in the message.

SNMP is a natural choice for extending the reach of the fault manager's voice; it's widely used to facilitate centralised monitoring of events throughout and even across administrative domains. The basic model is simple and extensible; information can be pushed from any device to one or more network management stations (NMSs), or pulled by an administrator or automated utility from a particular device of interest. Managed devices - in this case, a Solaris system - signify events using traps (also called notifications in SNMPv2), which provide a limited amount of information to designated NMSs. They also provide access to a management information base (MIB) on demand. Generally, the MIB provides access to a much greater breadth and depth of information than is transmitted with a trap or notification. An NMS can be configured to retrieve additional data from the MIB upon receipt of a trap if desired.

Availability

The technology described here is available in Solaris Nevada builds 33 and later. OpenSolaris offers access to the sources. A prerequisite for building or using these applications is the installation of the SMA packages provided by the SFW consolidation; BFUing newer ON bits is not sufficient. If you have SWAN access, you can run /ws/onnv-gate/public/bin/update_sma to get the necessary packages; otherwise see the OpenSolaris download center for the packages.

A Note on NMS Configuration

If you use the Net-SNMP-based NMS software delivered in Solaris, as I do below, you will want to tell the client utilities to use the fault management MIB to encode and decode OIDs. The easiest way to do this is to add MIBS=+ALL to your environment. You can also make this permanent by creating (or adding to) /etc/sma/snmp/snmp.conf the line:

    mibs +ALL
See snmp.conf(4) for more information on MIB searching and importing. If you use a different NMS, consult your vendor's documentation to learn how to import a new MIB.
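You can also set the variable for a single command while you experiment; a sketch only, using the example host, community string, and MIB objects that appear later in this entry:

    nms$ MIBS=+ALL snmpget -c public -v 2c stomper sunFmFaultCount.0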

snmp-trapgen: an SNMP plugin for fmd(1M)

The trap or notification generator component is snmp-trapgen. This is a very simple fault manager plugin similar to that which logs fault information to the system log and console. Instead of writing formatted text to a log device, however, this plugin generates SNMPv1 traps and/or SNMPv2 notifications, one for each destination configured in the systemwide snmpd.conf(4). No additional configuration is required; if you have already configured a system to send traps to one or more NMSs, you don't need to do anything else to be notified upon fault diagnosis. If not, you'll want to add v1 or v2 trap destinations to /etc/sma/snmp/snmpd.conf. The hostnames or addresses you use will need to be configured to receive and act upon SNMP traps or notifications. If you don't have an NMS on your network, you can use the snmptrapd(1M) server included with Solaris.
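If you haven't configured any destinations yet, the relevant directives look roughly like this - a sketch only, with a made-up NMS hostname and the usual example community string:

    # send SNMPv1 traps and SNMPv2 notifications, respectively, to this NMS
    trapsink   nms.example.com  public
    trap2sink  nms.example.com  public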

A fault diagnosis trap (sunFmProblemTrap) includes a limited subset of the information contained in the syslog message associated with the fault. Specifically, the diagnosis's UUID, diagnostic code, and reference URL are included. The object identifiers (OIDs) for these data are defined by the fault management MIB, SUN-FM-MIB, installed in /etc/sma/snmp/mibs/. The same information is delivered to both SNMPv1 and SNMPv2 trap sinks. At present, this is the only trap defined by the fault management MIB, but others may be generated in the future. Here's an example of an SNMPv2 notification as decoded by snmptrapd(1M):

2006-02-07 16:36:34 stomper [192.xx.xx.xx]:

        DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (2266748911) 262 days, 8:31:29.11
        SNMPv2-MIB::snmpTrapOID.0 = OID: SUN-FM-MIB::sunFmProblemTrap
        SUN-FM-MIB::sunFmProblemUUID."a58aa105-4fab-6e16-8557-ab7687113de7" = STRING: "a58aa105-4fab-6e16-8557-ab7687113de7"
        SUN-FM-MIB::sunFmProblemCode."a58aa105-4fab-6e16-8557-ab7687113de7" = STRING: SUN4U-8000-KA
        SUN-FM-MIB::sunFmProblemURL."a58aa105-4fab-6e16-8557-ab7687113de7" = STRING: http://sun.com/msg/SUN4U-8000-KA
The diagnostic code and URL can be used to find knowledge base articles describing the fault and suggested corrective action. The diagnosis UUID can be used to get further detail from fmdump(1M), or from the MIB, as seen in the next section.
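If you'd rather chase the diagnosis from the affected host itself, the same UUID can be passed to fmdump(1M); a quick sketch, run as a privileged user on the example system (the UUID is the one from the notification above):

    stomper# fmdump -v -u a58aa105-4fab-6e16-8557-ab7687113de7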

libfmd_snmp: a MIB plugin for the System Management Agent (SMA)

Knowing that a fault has been diagnosed is important, but the amount of information delivered with the trap or notification may not be enough to provide an administrator with a complete understanding of the problem. The fault management MIB defines a wealth of detail, and this detail is made available via SMA by libfmd_snmp. In addition to fault diagnosis detail, this MIB also offers information about faulty components and the configuration of the fault manager itself, similar to that offered by fmadm(1M).

Enabling the plugin requires configuring the master SNMP agent on each server you wish to query. Adding the architecture-dependent line

    dlmod sunFM /usr/lib/fm/sparcv9/libfmd_snmp.so.1
to /etc/sma/snmp/snmpd.conf will cause the MIB plugin to be automatically loaded and initialised the next time the master agent is started, such as via /etc/init.d/init.sma. In the future, SMA will be managed via SMF; see 6349499[0].

No further configuration is necessary, although the usual snmpd.conf(4) directives will allow you to restrict access to the MIB, which may be important to you since some of the information it provides is ordinarily restricted to privileged users.
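As a minimal sketch of such a restriction (the community string and management station address below are made up; snmpd.conf(4) describes the full view-based access control machinery if you want to expose only part of the MIB):

    # read-only access, and only from the management station at 192.0.2.10
    rocommunity  fmread  192.0.2.10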

The fault management MIB provides 4 tables and a single scalar, in addition to the trap/notification described above. sunFmProblemTable and sunFmFaultEventTable are logically two pieces of the same table; they are separated only because MIBs do not support nested tables. The problem table contains the scalar information about each diagnosis, while the fault event table contains lists of the events associated with each diagnosis. Both tables are indexed by diagnosis UUID; the fault event table utilises a second scalar index to distinguish between multiple events associated with a diagnosis. In response to the trap above, you might want to know which Automated System Recovery Unit(s) (ASRU(s)) the fault manager believes may have caused the fault. This is just a fancy way of saying we want to know what broke to trigger the diagnosis. Because each ASRU is associated with a fault event, we'll first need to know how many fault events were associated with this diagnosis so that we can then look up each one's ASRU in the fault event table. To do this, we'll use snmpget(1M), delivered by Solaris in /usr/sfw/bin. Of course, you can use any NMS software.

    nms$ snmpget -c public -v 2c stomper \
        sunFmProblemSuspectCount.\"a58aa105-4fab-6e16-8557-ab7687113de7\"
    SUN-FM-MIB::sunFmProblemSuspectCount."a58aa105-4fab-6e16-8557-ab7687113de7" = Gauge32: 1
This diagnosis has only one fault event associated with it. To look up the ASRU, we'll look in the fault event table entry indexed by the UUID and the fault index. Since fault events are indexed starting from 1, we'll need to do:
    nms$ snmpget -c public -v 2c stomper \
        sunFmFaultEventASRU.\"a58aa105-4fab-6e16-8557-ab7687113de7\".1
    SUN-FM-MIB::sunFmFaultEventASRU."a58aa105-4fab-6e16-8557-ab7687113de7".1
    = STRING: cpu:///cpuid=4/serial=23EBEC1505
Most NMSs offer scripting facilities that allow you to perform actions similar to these in response to a trap. Alternately, you could poll the data on a regular basis. Many implementations do both, using polling to offset the risk of losing traps, which, like all SNMP datagrams, do not offer reliable transmission. SNMPv3 informs, also known as acknowledged notifications, offer only a partial remedy to this problem, and are not supported by snmp-trapgen at this time.
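If you're using snmptrapd(1M) itself as a simple trap sink, its traphandle directive is one way to automate that kind of follow-up. A sketch only - the handler script is invented for illustration, and I'm assuming the SMA configuration directory for the file's location:

    # in /etc/sma/snmp/snmptrapd.conf: run a script for each sunFmProblemTrap;
    # snmptrapd passes the sender and the trap's variable bindings on stdin
    traphandle SUN-FM-MIB::sunFmProblemTrap /usr/local/bin/fm-followup.sh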

A polling NMS may wish to poll the systemwide faulty component count, provided by the MIB as sunFmFaultCount. An increase in this gauge without a corresponding problem trap is a good indication that the trap has been lost. More details about devices the fault manager believes to be in degraded or faulted states are available via the sunFmResourceTable; walking this table provides a ready - and remote - answer to the common question "What's broken on that machine?" For this, we use the snmpwalk(1M) utility:

    nms$ snmpwalk -c public -v 2c stomper sunFmResourceTable
    SUN-FM-MIB::sunFmResourceFMRI.1 = STRING: cpu:///cpuid=4/serial=23EBEC1505
    SUN-FM-MIB::sunFmResourceStatus.1 = INTEGER: degraded(3)
    SUN-FM-MIB::sunFmResourceDiagnosisUUID.1 = STRING:
        "a58aa105-4fab-6e16-8557-ab7687113de7"
Finally, the sunFmConfigTable offers remote access to the same information provided by fmadm(1M)'s config subcommand; like the other tables, it can be accessed using snmpget(1M), snmpwalk(1M), or any other SNMP-compatible NMS implementation. You can find the complete fault management MIB at the Fault Management community site, and in build 33 and later at /etc/sma/snmp/mibs/SUN-FM-MIB.mib.

[0] The bug should be visible, but it isn't. This is itself a bug, which the SFW team is working to fix.

Friday Jan 27, 2006

More on Drivers

For those who attended the SVOSUG meeting last night and are looking for boilerplate code similar to that Max presented, you can find it in the Device Driver Tutorial. This gentle introduction also includes a trivial but functional pseudo device implementation.

Monday Dec 05, 2005

GCC inline assembly, part 2

Long ago, I promised to write more about gcc inline assembly, in particular a few cases that are tricky to get right. Here, somewhat belatedly, are those cases. These examples are taken from libc, but the concepts apply to any inline assembly fragments you write for gcc. As I mentioned previously, these concerns apply only to gcc-style inlines; the Studio-style inline format doesn't require that you use this same level of caution.

gcc expects you to write assembly fragments (even in a "separate" inline function) as if they are logically a part of the caller. That is, the compiler will allocate registers or other appropriate storage locations to each of the input and output C variables. This requires that you instruct the compiler very carefully as to your use of each variable, and the variables' relationships to one another. The advantage is much better register allocation; the compiler is free to allocate whatever registers it wishes to your input and output variables in a manner that is transparent to you.

Instead, Studio requires that you code the fragment as if it were a leaf function, so the compiler does not do any register allocation for you. You are permitted to use the caller-saved registers any way you wish, and even to use the caller's stack as if you are in a leaf function. Arguments and return values are stored in their ABI-defined locations. Depending on the optimization level you use, this can be wasteful of registers (though the peephole optimizer can often clean up some of this waste) and can also make writing the fragment much more difficult. In exchange, however, you don't have to be nearly as careful to express the fragment's operation to the compiler.

Inputs, Outputs, and Clobbers (oh my!)

Each assembly fragment may have any or all of outputs, inputs, and clobbers. Each input and output maps a C variable or literal to a string suitable for use as an assembly operand. These operands can then be referenced as %0, %1, %2, etc. These are ordered beginning from 0 with the first output, followed by the inputs. Alternately, newer versions of gcc allow the use of symbolic names for each input and output. Clobbers are somewhat different; they express the set of registers and/or memory whose values are changed by the fragment but are not expressed in the outputs. Inputs which are also changed must be listed as outputs, not clobbers. Normally, the clobbers include explicit registers used by certain instructions, but may also include "cc" to indicate that the condition code registers are modified and/or "memory" to indicate that arbitrary memory addresses have had their contents altered.
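To make the bookkeeping concrete, here's a minimal sketch (not taken from libc; the function name is invented, and the fragment is SPARC like most of the examples below) showing how the operands are numbered; the constraint strings themselves ("=r", "r") are the subject of the next section:

static inline int
add3(int a, int b)
{
	int sum;

	/*
	 * %0 is the first output (sum); %1 and %2 are the inputs (a and b).
	 * With symbolic names the template could instead read
	 * "add %[a], %[b], %[sum]", with the operands written as
	 * [sum] "=r" (sum), [a] "r" (a), [b] "r" (b).
	 * SPARC's add does not modify the condition codes, so no clobbers
	 * are needed; an addcc here would require "cc" in the clobber list.
	 */
	__asm__("add	%1, %2, %0"
	    : "=r" (sum)
	    : "r" (a), "r" (b));

	return (sum);
}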

Constraints

Outputs and inputs are expressed as constraints, in a language specifying the type of operand that will contain the value of a variable. Common constraints include "r", indicating that a general register should be allocated, and "m" indicating that some type of memory location should be used. The complete list of constraints is found in the gcc documentation. These constraints may contain modifiers, which give gcc more information about how the operand will be used. The most common modifiers are "=", "+", and "&". The "=" modifier is used to indicate that the operand is output-only; it may appear only in the constraint for an output variable. Even if the constraint is applied to a variable containing an existing value in your program, there is no guarantee that it will contain that value when your assembly fragment is executed. If you need that, you must use the "+" modifier instead of "="; this tells the compiler that this operand is both an input and an output. Nevertheless, the variable with this constraint is provided only in the outputs section of the fragment's specification. An alternate way to express the same thing is provided in the documentation. Note that providing the same variable as both an input and an output does not guarantee you that the same location (register, address, etc.) will be used for both of them. Thus the following is generally incorrect:

static inline int
add(int var1, int var2)
{
	__asm__(
		"add	%2, %0"
	: "=r" (var1)
	: "r" (var1), "r" (var2));

	return (var1);
}
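For reference, a correct version using the "+" modifier described above - a sketch only, keeping the two-operand add mnemonic of the example - looks something like this:

static inline int
add(int var1, int var2)
{
	/* var1 is a single read-write operand; var2 is read-only */
	__asm__(
		"add	%1, %0"
	: "+r" (var1)
	: "r" (var2));

	return (var1);
}

Because var1 is now one operand rather than two unrelated ones, the compiler knows that the register it allocates for %0 must hold var1's value when the fragment begins.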
The "&" modifier is used on an output operand whose value is overwritten before all the input operands are consumed. This exists to prevent gcc from using the same register for both the input and output operands. For example, for swap32() (see also the Studio inline function), we might think to write:
extern __inline__ uint32_t
swap32(volatile uint32_t *__memory, uint32_t __value)
{
	...
	uint32_t __tmp1, __tmp2;
	__asm__ __volatile__(
		"ld [%3], %1\\n\\t"
		"1:\\n\\t"
		"mov %0, %2\\n\\t"
		"cas [%3], %1, %2\\n\\t"
		"cmp %1, %2\\n\\t"
		"bne,a,pn %%icc, 1b\\n\\t"
		"  mov %2, %1"
		: "+r" (__value), "=r" (__tmp1), "=r" (__tmp2)
		: "r" (__memory)
		: "cc");
	return (__tmp2);
}

But suppose gcc decided to allocate o0 to both __tmp1 and __memory. This is allowable, because the "=r" constraint implies that the corresponding register is set only after all input-only operands are no longer needed (input/output operands obviously don't have this problem). In the case above, the first load would clobber o0 and the cas would operate on an arbitrary location. Instead, we must write "=&r" for both __tmp1 and __tmp2; neither variable may safely be allocated the same register as the input operand.

Bugs caused by omitting the earlyclobber are painful to track down because they often appear and disappear from one compilation to the next as entirely unrelated code changes cause increases or decreases in register pressure.

This is not an academic concern. Consider this example program:

#include <inttypes.h>

static __inline__ void
incr32(volatile uint32_t *__memory)
{
        uint32_t __tmp1, __tmp2;
        __asm__ __volatile__(
        "ld [%2], %0\\n\\t"
        "1:\\n\\t"
        "add %0, 1, %1\\n\\t"
        "cas [%2], %0, %1\\n\\t"
        "cmp %0, %1\\n\\t"
        "bne,a,pn %%icc, 1b\\n\\t"
        "  mov %1, %0"
        : "=r" (__tmp1), "=r" (__tmp2)
        : "r" (__memory)
        : "cc");
}

uint32_t
func(uint32_t x)
{
        uint32_t y = 4;
        uint32_t z = x + y;

        incr32(&y);

        z = x + y;

        return (z);
}
gcc compiles this (use -O2 -mcpu=v9 -mv8plus) into:
func()
    func:                   9c 03 bf 88  add          %sp, -0x78, %sp
    func+0x4:               9a 10 20 04  mov          0x4, %o5
    func+0x8:               90 02 20 04  add          %o0, 0x4, %o0
    func+0xc:               da 23 a0 64  st           %o5, [%sp + 0x64]
    func+0x10:              82 03 a0 64  add          %sp, 0x64, %g1
    func+0x14:              c2 00 40 00  ld           [%g1], %g1	<===
    func+0x18:              9a 00 60 01  add          %g1, 0x1, %o5
    func+0x1c:              db e0 50 01  cas          [%g1] , %g1, %o5	<= SEGV
    func+0x20:              80 a0 40 0d  cmp          %g1, %o5
    func+0x24:              32 47 ff fd  bne,a,pn     %icc, func+0x18
    func+0x28:              82 10 00 0d  mov          %o5, %g1
    func+0x2c:              81 c3 e0 08  retl         
    func+0x30:              9c 23 bf 88  sub          %sp, -0x78, %sp

In this case, gcc has allocated g1 to both __tmp1 and __memory, and o5 to __tmp2. Note the highlighted instructions: the initial load destroys the value of g1, and the subsequent cas will attempt to operate on whatever address was stored at *__memory when the fragment began. In this example, that value will be 4 (g1 is assigned sp+0x64, which is simply the address of y). This program is compiled incorrectly due to improper constraints, and will cause a segmentation fault if the code in question is executed.

If instead we use "=&r" for both __tmp1 and __tmp2, gcc generates the following code:

func()
    func:                   9c 03 bf 88  add          %sp, -0x78, %sp
    func+0x4:               9a 10 20 04  mov          0x4, %o5
    func+0x8:               90 02 20 04  add          %o0, 0x4, %o0
    func+0xc:               da 23 a0 64  st           %o5, [%sp + 0x64]
    func+0x10:              82 03 a0 64  add          %sp, 0x64, %g1
    func+0x14:              d8 00 40 00  ld           [%g1], %o4	<===
    func+0x18:              9a 03 20 01  add          %o4, 0x1, %o5
    func+0x1c:              db e0 50 0c  cas          [%g1] , %o4, %o5	<= OK
    func+0x20:              80 a3 00 0d  cmp          %o4, %o5
    func+0x24:              32 47 ff fd  bne,a,pn     %icc, func+0x18
    func+0x28:              98 10 00 0d  mov          %o5, %o4
    func+0x2c:              81 c3 e0 08  retl         
    func+0x30:              9c 23 bf 88  sub          %sp, -0x78, %sp

This code now assigns o4 to __tmp1, which eliminates the problem described above. This function, however, still does not do the right thing. Why not?

Reloading

Compilers keep track of where each live variable in the program can be found; many variables can be found both at some memory location and in a register. Sometimes, the compiler chooses to use a register for a different variable, and stores the value back to its memory location (if it has changed) before doing so. Later, if this value is needed, the value must be loaded back into a register before being used. This is known as reloading. Other reasons reloading may be required include a variable's declaration as volatile and the case that concerns us here, a variable's modification via side effects.

Returning to our example, incr32() is actually operating on a memory address, not a register. So why did we assign __memory the "r" constraint instead of more correctly expressing the constraint as "+m" (*__memory)? It turns out that the "m" constraint allows a variety of possible addressing modes. On SPARC, this includes the register/offset mode (such as [%sp+0x64]). This is fine for instructions like ld and st, but the cas instruction is special: it allows no offset. No constraint exists to describe this condition. The "V" constraint looks similar but is not correct: a bare register ([%g1]) is an offsettable address, so "V" would actually exclude the case we want. Conversely, "o", the inverse constraint of "V", includes the register/offset addressing mode we specifically wish to exclude. So the only way to express this constraint is "r". But this does nothing to capture the fact that although the pointer itself is not modified, the value at *__memory is altered by the assembly fragment. Is this a problem? Let's look at the assembly generated for func() a little more closely:

func()
    func:                   9c 03 bf 88  add          %sp, -0x78, %sp
    func+0x4:               9a 10 20 04  mov          0x4, %o5
    func+0x8:               90 02 20 04  add          %o0, 0x4, %o0	<===
    func+0xc:               da 23 a0 64  st           %o5, [%sp + 0x64]
    func+0x10:              82 03 a0 64  add          %sp, 0x64, %g1
    func+0x14:              d8 00 40 00  ld           [%g1], %o4
    func+0x18:              9a 03 20 01  add          %o4, 0x1, %o5
    func+0x1c:              db e0 50 0c  cas          [%g1] , %o4, %o5
    func+0x20:              80 a3 00 0d  cmp          %o4, %o5
    func+0x24:              32 47 ff fd  bne,a,pn     %icc, func+0x18
    func+0x28:              98 10 00 0d  mov          %o5, %o4
    func+0x2c:              81 c3 e0 08  retl         			<===
    func+0x30:              9c 23 bf 88  sub          %sp, -0x78, %sp

We see that gcc has assigned z the o0 register, which is not surprising given that it's the return value. But after o0 is set to x + 4 at the beginning of the function, it's never set again. The line z = x + y has been discarded by the compiler! This is because it does not know that our inline assembly modified the value of y, so it did not reload the value and recalculate z.

There are two ways we can correct this problem: (a) add a "+m" output operand for *__memory, or (b) add "memory" to the list of clobbers. The latter is a special clobber that tells gcc not to trust the values in any registers it would otherwise believe to hold the current values of variables stored in memory. In short, this clobber tells gcc that any such value must be reloaded from memory before its next use. This is somewhat inefficient when we know exactly which piece of memory has been touched, so (a) is preferable for better performance. Whichever solution we choose, gcc now compiles our code to:

func()
    func:                   9c 03 bf 88  add          %sp, -0x78, %sp
    func+0x4:               9a 10 20 04  mov          0x4, %o5
    func+0x8:               98 10 00 08  mov          %o0, %o4
    func+0xc:               da 23 a0 64  st           %o5, [%sp + 0x64]
    func+0x10:              82 03 a0 64  add          %sp, 0x64, %g1
    func+0x14:              d6 00 40 00  ld           [%g1], %o3
    func+0x18:              9a 02 e0 01  add          %o3, 0x1, %o5
    func+0x1c:              db e0 50 0b  cas          [%g1] , %o3, %o5
    func+0x20:              80 a2 c0 0d  cmp          %o3, %o5
    func+0x24:              32 47 ff fd  bne,a,pn     %icc, func+0x18
    func+0x28:              96 10 00 0d  mov          %o5, %o3
    func+0x2c:              d0 03 a0 64  ld           [%sp + 0x64], %o0	<===
    func+0x30:              90 03 00 08  add          %o4, %o0, %o0	<===
    func+0x34:              81 c3 e0 08  retl         
    func+0x38:              9c 23 bf 88  sub          %sp, -0x78, %sp

Note the reload, which will now return the correct result. There are actually two other ways to correct this, although the use of "+m" is the most correct. First, we could declare y to be volatile in func(). This would force gcc to reload y from memory any time its value is required, including for the second computation of z. Use of the volatile keyword is mainly useful when some external thread (or hardware) may change the value at any time; using it as a substitute for correct constraints will cause unnecessary reloading, degrading performance. Second, and perhaps best of all, the compiler could be modified to accept a SPARC-specific constraint for use with the cas instruction, one which requires the address of the operand to be stored in a general register.
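
To tie the pieces together, here is a sketch of incr32() with the corrected constraints; it follows approach (a) and is not necessarily character-for-character identical to the fragment shown earlier in the article. Adding "+m" (*__memory) as a third output operand renumbers the remaining operands, so the template must now reference the address register as %3 rather than %2; the "+m" operand itself is never referenced in the template and exists only to tell gcc that the memory behind __memory is both read and written.

#include <inttypes.h>

static inline void
incr32(uint32_t *__memory)
{
        uint32_t __tmp1, __tmp2;

        /*
         * Corrected constraints: earlyclobber temporaries, "+m" so gcc
         * knows *__memory changes, and the address still passed in a
         * register ("r") because cas accepts no offset.
         */
        __asm__ __volatile__(
        "ld [%3], %0\n\t"
        "1:\n\t"
        "add %0, 1, %1\n\t"
        "cas [%3], %0, %1\n\t"
        "cmp %0, %1\n\t"
        "bne,a,pn %%icc, 1b\n\t"
        "  mov %1, %0"
        : "=&r" (__tmp1), "=&r" (__tmp2), "+m" (*__memory)
        : "r" (__memory)
        : "cc");
}

Compiled with the same flags, this version should produce code along the lines of the final listing above, including the reload of y.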

You can find more inline assembly examples illustrating these concepts in libc (the math functions), in the MD5 acceleration code, and in the kernel. Be sure to read and understand the documentation completely before writing your own inline assembly for gcc, and always test your understanding by constructing and compiling simple test programs like these.

Tuesday Aug 16, 2005

Broken allocators and paleolithic debugging strategies

Not so long ago I was looking through Solaris's shells for memory allocators - functions that perform tasks similar to malloc(3C). These functions often store the size of the allocated block at the beginning of each block; if that size is stored as a 4-byte value, the return value from the allocator may not be aligned on an 8-byte boundary. This is a major problem on SPARC, because it's not uncommon to allocate structs or unions containing types that require 8-byte alignment, especially long long. As it turns out, gcc correctly assumes that long long variables are aligned on 8-byte boundaries and uses the ldd and std instructions to access them. Our Studio compiler makes no such assumption; it always issues two ld or st instructions. The result is that programs using this kind of allocator can crash when built with gcc but not with Studio - not a pleasant situation.
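
To make the hazard concrete, here is a hypothetical allocator of the shape described - it is not taken from any Solaris shell, just a sketch of the pattern:

#include <stddef.h>
#include <stdint.h>

static char arena[4096];        /* assume the arena itself is 8-byte aligned */
static size_t used;

/*
 * A 4-byte size header precedes each block, so the pointer handed back
 * is not necessarily 8-byte aligned.  A long long (or a struct
 * containing one) stored there may be accessed with ldd/std by gcc and
 * trap with an alignment fault on SPARC.
 */
void *
shell_alloc(size_t len)
{
        char *blk = arena + used;

        len = (len + 3) & ~(size_t)3;           /* keep headers 4-byte aligned */
        *(uint32_t *)blk = (uint32_t)len;       /* 4-byte size header */
        used += sizeof (uint32_t) + len;        /* no exhaustion check; sketch only */

        return (blk + sizeof (uint32_t));       /* only 4-byte alignment guaranteed */
}

An allocator like this can be cured by making the header eight bytes wide, or otherwise rounding each block, header included, up to an 8-byte multiple, so that the returned pointer is always 8-byte aligned.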

As part of my search, I found that, indeed, the Bourne and Korn shells have some alignment problems. Though these are bugs, we've decided that there's no reliable way to find all possible bugs of this type, so we worked around them in the compiler as well as fixing the ones we found. This is, if nothing else, a good argument against compilers that "help" programmers by covering up this kind of error. But the best prize of all wasn't the kind of problem I was looking for, but rather this gem from the C shell:

        showall(av);
        printf("i=%d: Out of memory\n", i);
        chdir("/usr/bill/cshcore");
        abort();

This is the systems programming equivalent of finding a live woolly mammoth contentedly smoking a cigar in your recliner. Unfortunately, there's no way to trigger this behaviour, as it's protected by the "debug" preprocessor symbol, which we never set in a normal build. Nevertheless, thanks to OpenSolaris, you can see it for yourself.

We harp incessantly on the need to be able to debug production code, with no recompilation needed, and there are a number of better ways to debug this particular condition. For example, you could use the DTrace pid provider to stop a csh process when nomem() is called, and even provide a backtrace. If that weren't enough, you could then use mdb(1) to debug the problem in greater detail, or gcore(1) to produce a core dump. But the best part, the real joy, if you'll pardon the pun, is the chdir call. Clearly the purpose was to drop core in a predictable location for later analysis by the author. I think you'll find that coreadm(1M), along with other corefile improvements, offers a far more flexible and powerful way to accomplish this - and it complements nicely the other debugging strategies I mentioned above.

Tuesday Aug 02, 2005

Premium Drinks

Tuesday and Wednesday nights (after the extravaganza on Tuesday and the OpenSolaris BOF on Wednesday) we'll be convening for potent beverages, good food, and unique and amusing company. I'll be at the Lloyd Center DoubleTree in downtown Portland, OR, room 1560. Expect other OpenSolaris personalities to be present. Laura tells me that souvenir shot glasses are among the after-party swag collection, so don't miss out.

Monday Aug 01, 2005

OpenSolaris at OSCON

Those of you in or near Portland, Oregon are encouraged to come and see us at OSCON this week. Most of the conference is at the Convention Center this year (use the helpfully-named Convention Center train stop). Sun will have a booth in the exhibit hall starting Wednesday, and we're giving a few talks as well. In particular, join Bryan and me for a free tutorial on building, installing, and developing with OpenSolaris using DTrace, mdb, and more. That will be held Tuesday at 1:30pm in room D140. Then on Wednesday, I'll be giving a short talk on the status of OpenSolaris at 2:35pm in Portland/255, and we'll have a BOF at 8:30pm. Thursday, don't miss Bryan's short talk on DTrace at 4:30pm.

Even if you can't make the conference, you're welcome to join me for a beer. Send me mail at wesolows at eng dot sun dot com if you're interested, or leave a message for me at the 5th Avenue Suites.

Tuesday Jun 14, 2005

The First OpenSolaris Project: GCC Support

OpenSolaris is (finally) available. I've been working on this every day I've been with Sun, though others have spent years on the effort, and it's an amazing milestone. Unlike most launches, though, this is the beginning of a new effort rather than the end of one. As much as we've done already, there's far more left to be done before OpenSolaris can fulfil all our promises and achieve all our goals.

One promise we have fulfilled today is our commitment to make OpenSolaris accessible to people without the money or desire to buy compilers. Since most of Solaris is normally built with the Sun Studio compilers, which are also required to build the OpenSolaris sources, this meant we'd need either to provide those compilers on the same terms as Solaris itself, or to modify the sources to build and work with the GNU C compiler, which is available with source and free of charge under the terms of the GNU GPL. For reasons more illustrative of bureaucracy and human nature than of technological difficulties, we were unsure almost until the moment of launch whether we would be able to provide the Studio compilers under acceptable terms; therefore, another engineer and I have spent the last two and a half months porting OpenSolaris to gcc.

At this point I had a nice writeup on inline assembly differences between the Studio and GNU compilers. But it relies on source code that isn't available yet - namely, the gcc-specific inline assembly files. So instead I'll talk about why it happened that way and why it's actually a good thing. I'll also talk about some straight-up bugs we found in the process of porting.

We received word that a final Studio license had been agreed upon on June 3 - just 11 days ago! The license is free-as-in-beer and, although somewhat vague, seems reasonable enough. Of course, I prefer using only Free Software and promoting it whenever possible (as we're doing with OpenSolaris), so I'd really rather use gcc. Our plan of record was to make a merged workspace available as "official" OpenSolaris. There were three sets of changes that needed to be merged together in the last three days leading up to launch: the gcc changes, which edit about 2500 files (mostly to fix compiler warnings); a large wad of renames to support the separation of code we're releasing now from that which we're hoping to release later (thousands of renames); and the coup de grace, the addition of the CDDL license block to over 24,000 files. In the end, this gigantic 3-way merge proved impractical: there were over 1700 conflicts to resolve. Most were trivial and could easily be automerged by TeamWare, our revision control system, but the sheer volume and the shortened schedule would have made adequate testing impossible.

Instead of the three-way merge, then, we elected to take the minimum amount of change we could: the addition of the CDDL blocks and the separation of released from unreleasable source. That meant gcc support would not ship in the "official" sources - but it could still be made available to the developer community. This is important for several reasons. First, it illustrates an important principle: FCS quality all the time. That is, if it's not good enough for a customer, it's not good enough to be putback. Since there was no doubt in anyone's mind that the gcc work was not ready for either, it also wasn't good enough to call OpenSolaris. Second, it offers us an opportunity to provide a glimpse into the way projects work. One of the most common questions we get is "so, if the gate always has to be golden, how does any major work ever get done?" Like most people, we do major work in "branches" off the trunk. TeamWare supports children of children and merging of independent workspaces with common ancestry, so no complicated branching apparatus like CVS's is needed. What will be available on the gcc project page is that project's gate. You're invited to participate - there are over 300 mostly very small bugs to fix.

One of the most significant kinds of bug we found was programs writing into string constants, confirming Osborne's Law. These programs ordinarily work properly because the Studio compilers place string constants in the .data section or some other writable data section. The flag -xstrconst changes this behaviour, placing the strings in .rodata or a similar read-only segment and thus also allowing them to be shared. This reduces runtime memory usage but comes at a cost: buggy programs that attempt to write to the constant strings will trigger a segmentation violation and normally die. gcc acts as if this flag were always on, and applies the same treatment to other const data as well. The end result is greater enforcement of correctness at the cost of crashes.

Fortunately, fixing these is very easy. For example, I fixed bug number 6281909 (you're supposed to be able to see bugs, too, but that doesn't yet seem to include the bugs of interest) by fixing the selector function not to assume it can write '=' and '\0' into its arguments. Note that the correct use of 'const' can help prevent this kind of problem.
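
The general shape of such a fix looks something like this - a hypothetical before-and-after in the same spirit, not the actual code from 6281909:

#include <stdlib.h>
#include <string.h>

/*
 * Broken: if a caller passes a string literal, arg may point into a
 * read-only segment, and the store faults when built with gcc (or with
 * Studio and -xstrconst).
 */
void
set_pair_broken(char *arg)
{
        char *sep = strchr(arg, ':');

        if (sep != NULL)
                *sep = '=';             /* may write into a string constant */
}

/*
 * Fixed: take the argument as const and modify a private, writable copy.
 */
char *
set_pair_fixed(const char *arg)
{
        char *copy = strdup(arg);
        char *sep;

        if (copy == NULL)
                return (NULL);
        if ((sep = strchr(copy, ':')) != NULL)
                *sep = '=';
        return (copy);                  /* caller frees */
}

Declaring the parameter const both documents the intent and lets the compiler catch any attempt inside the function to modify the caller's string.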

The original article on inline assembly will appear when the source it references appears - and you can help make that happen sooner: check out the gcc project page.


Friday Apr 15, 2005

DTrace is part of this complete operating system

Earlier this week, Mr. Vaughan-Nichols at eWeek wrote a largely inaccurate and needlessly hostile article about the CDDL, and our own Andy Tucker called him on a few points. Without bothering to correct that article or respond, he went back at it again on Wednesday, this time giving air time to SCO and their blessing of the OpenSolaris program. Why Mr. McBride of SCO felt the need to give this "blessing" is unclear; Sun obviously believes it has the rights needed to make the sources to nearly all of Solaris available under whatever license(s) we choose. Without those rights, no blessing would be sufficient; with them, none is necessary. I'll chalk this up to SCO taking whatever opportunity it can to appear relevant, especially as they continue to struggle in both the marketplace and the courtroom.

Enough of that. Instead, I'd like to focus on the most obvious and significant error in this article: the assertion that

"To date, though, the only released components of OpenSolaris are programs, such as DTrace, which aren't parts of the operating system."

We don't need to be too picky about what constitutes an operating system; even the most pedantic would surely agree that a component which spans the system from user applications to the heart of the kernel is part of the operating system. Under even an extremely narrow definition, DTrace is very much a part of the Solaris operating system - and therefore also of OpenSolaris technology. Our release of DTrace includes the sources for not just the standalone program dtrace(1M), but also all of the following:

  • The userland library libdtrace(3LIB) which provides most of dtrace(1M)'s functionality
  • Three other userland programs: lockstat(1M), plockstat(1M), and intrstat(1M), which are implemented using DTrace
  • Several kernel modules: dtrace(7D), fasttrap(7D), fbt(7D), lockstat(7D), profile(7D), sdt(7D), and systrace(7D); these implement the kernel portions of DTrace
  • Code added to the kernel itself to support dtrace, such as usr/src/uts/common/os/dtrace_subr.c
  • Two additional private user libraries which provide access to Compact C Type Format (CTF) data and the proc(4) filesystem
  • Small programs demonstrating the D language and DTrace functionality
  • A variety of headers and glue

It should be apparent that this is far more complex a subsystem than just one standalone user program. In fact, the source to dtrace(1M) is a single file out of 345 we released, and constitutes only 1431 of 102,163 lines of code (about 1.4%) in this initial release. If dtrace(1M) were simply an ordinary user program, it would not require over 100,000 lines of additional code - including over 32,000 in the kernel - to make it work.

As a final example, observe this comment block from usr/src/uts/common/os/dtrace_subr.c:

/*
 * Making available adjustable high-resolution time in DTrace is regrettably
 * more complicated than one might think it should be.  The problem is that
 * the variables related to adjusted high-resolution time (hrestime,
 * hrestime_adj and friends) are adjusted under hres_lock -- and this lock may
 * be held when we enter probe context.  One might think that we could address
 * this by having a single snapshot copy that is stored under a different lock
 * from hres_tick(), using the snapshot iff hres_lock is locked in probe
 * context.  Unfortunately, this too won't work:  because hres_lock is grabbed
 * in more than just hres_tick() context, we could enter probe context
 * concurrently on two different CPUs with both locks (hres_lock and the
 * snapshot lock) held.  As this implies, the fundamental problem is that we
 * need to have access to a snapshot of these variables that we _know_ will
 * not be locked in probe context.  To effect this, we have two snapshots
 * protected by two different locks, and we mandate that these snapshots are
 * recorded in succession by a single thread calling dtrace_hres_tick().  (We
 * assure this by calling it out of the same CY_HIGH_LEVEL cyclic that calls
 * hres_tick().)  A single thread can't be in two places at once:  one of the
 * snapshot locks is guaranteed to be unheld at all times.  The
 * dtrace_gethrestime() algorithm is thus to check first one snapshot and then
 * the other to find the unlocked snapshot.
 */

This comment, while arcane, is clear by itself, so I will not attempt to add to it. I will only point out that if DTrace were not a part of the operating system, it would not need to concern itself with the locking rules for updates to the high-resolution system timers. Further examples of DTrace's intimate association with core features of the Solaris kernel and userland libraries can easily be found by examining the sources.

Sun's DTrace experts have written extensively about their creation and have provided a highly detailed reference manual. While much of this material may not be in a format which is accessible to the layman, even a cursory overview of the source we are offering and the breadth and depth of publications on the topic should be sufficient to satisfy anyone that DTrace is very much a part of the operating system. Perhaps Mr. Vaughan-Nichols was simply unfamiliar with the offering; in that case I would invite him to download the sources and inspect them himself, and to seek the opinions of expert engineers before making further claims of this sort. DTrace is very much a part of Solaris, and while we have much more to do, releasing it as open source was no trivial step.

Thursday Dec 23, 2004

Linus on Solaris

Most people have probably read the recent Linus interview, in which he has a number of things to say about Linux, Solaris, and software development. Like any interview, it contains some interesting assertions, some obvious filler, and some real head-scratchers. Many in the Solaris community have expressed dismay or anger over some of his remarks, but rather than add to that, I'd like to examine some internal contradictions in Linus's statements and try better to understand why he's made them. As we ready OpenSolaris for public consumption and contribution, it's important to observe how similar development systems work and take steps to avoid difficulties encountered by other projects. Linus's comments indicate that, indeed, the structures and processes in place to serve Linux development are imperfect. We will be well-served to learn from this.

One of the head-scratchers is his assertion that he's not interested in Solaris because he feels it offers nothing of value that isn't already in Linux. This conclusion might be less baffling, though no less disappointing, if he'd actually examined the code, the feature set, and then made up his mind. But he admitted openly that he probably won't even look at the code, and instead will rely on others to tell him if it contains ideas worth considering. I really have to wonder about this approach, especially given his later comments concerning the reason for adding a feature to a system. We certainly agree with him that system design is about solving problems, not just doing something new and different for its own sake. Features don't get added to Solaris if they don't serve some useful purpose, fill some hole for developers, users, or both. It's difficult to believe that Solaris developers and users have problems to solve that differ greatly from those of Linux developers and users. In fact, as a long-time Linux developer myself, I can say with some confidence that the challenges are the same. So why does Solaris offer tools like kmdb, dtrace, and crash dumps, while Linus either refuses to integrate similar functionality or claims he hasn't heard of the problems these tools help to solve?

One possible reason is that distributions sometimes provide parts of these feature sets, so that users never even realize their absence in Linux proper. Linus talked about the distributors serving a valuable function, buffering developers from customers. But perhaps in that process, valuable information is not making its way back to Linus. The Linux development community would be well-served by talking to ordinary systems administrators now and then. Another possibility is that users and administrators can't, won't, or don't effectively communicate the problems they are trying to solve. But why don't Solaris users seem to have this problem? Do Linux distributors simply not listen? Or perhaps these decisions are really based on ideology, as so many Linux detractors claim. Regardless, a sober assessment of users' real-world needs might well reveal that Linus and others still have much work to do (as do Solaris developers), and that some of the changes they would do well to consider have already been made in other systems. The solutions Linus might choose may well be quite different from those chosen by Sun, but disregarding or remaining ignorant of the challenges is a lost opportunity to innovate and improve. What kind of engineer willingly passes up that opportunity?

If NIH is in fact "a disease" - a point which ought to command universal agreement - I'm left to wonder why Linus would pass up an opportunity to examine the works of other engineers. If he does in fact rely on others to tell him about valuable features in similar systems, something in that process is broken. If he wants to make sure Linux can solve all the problems Solaris can, I'd suggest he look closely at what's been done here. The code isn't even needed for this - a quick glance at public white papers would be sufficient to understand many of the problems Solaris engineers have been working to solve. If he doesn't believe these problems exist, a reality check is in order.

There are lessons here, of course. One of them is that systems developers must not lose touch with the problems they're supposed to solve. It pays to listen. Another lesson is that a process which prevents useful features from being implemented is broken, and someone has to be willing to recognize and correct such a process. If distributions take on the work of making a usable system and interacting with customers, engineers risk losing sight of appropriate goals. This is avoidable, but that it appears to be occurring implies that the relationship among Linux (the codebase), its distributors, and its developers (many of whom work for distributors) is defective in some way.

I'm cheered by the prospects for OpenSolaris to avoid these pitfalls, especially if we recognize them and take proper action. I hope we as a community will remain cognizant that they have hindered other large projects before ours, even those with leaders of Linus's stature.

Friday Aug 13, 2004

A Sense of Entitlement

I've finally decided to write a bit about a topic that has bothered me for many years as a participant in the Free Software community (it applies equally well to Open Source if you prefer): User Entitlement.

Some of you out there know what I mean. You maintain an application in your spare time as a volunteer. You field trouble reports and RFEs and do your best to implement, at minimum, the suggestions that matter to you, all while holding down a job and meeting your personal and family obligations. But for a minority of users, that's not enough; they expect you to implement features that don't interest you and fix bugs you can't reproduce. In short, they expect you to provide support. While one tries never to be rude, at some point the urge to point out the obvious becomes overwhelming: you have the source, you obviously care a lot about this, and nobody else has the time or inclination to do anything about it! Instead of repeatedly asking when I'm going to implement your change, why not implement it yourself and send me a patch?

Of course, the inevitable response to this suggestion is that the user in question is not a programmer. This is a subtle but important fact that has changed the way the community functions over the years; in the beginning, we were all programmers. Now programmers are a minority of Free Software users, just as we are a minority of software users in general. The commons model breaks down under these conditions; many users have little to offer the community as a whole. Bug reports and testing are valuable services, true, but some users are just that - users. Not testers. Not contributors. Not developers. Just users; they use the software, expect (rightly) that it will work as advertised, and become unhappy and demanding if it does not. This looks a lot more like a customer than the fellow co-op shareholder the model would suggest.

I don't mean to suggest that this behaviour is representative, but it certainly has increased as the pool of users has expanded. How will Free Software projects in the future deal with the influx of Users? Much work has been done, mostly in economics, on the subject of managing cooperatives and commons; I believe this work is directly relevant to the Free Software community. I'll get more into some of that work in my next post.
