OpenSolaris Development Process
Version 1 [DRAFT]
John Beck, Rich Teer, Al Hopper, David Comay, Stephen Hahn, Ed Hunter, Joe Kowalski, Keith Wesolowski, Casper Dik, and Bill Sommerfeld
Abstract. This document outlines the OpenSolaris development process: its intent, its general attributes, and its high-level process. The discussion identifies specific changes to effect the extension of existing processes to collaborative development. These recommendations are intended to give all contributors an equitable opportunity to participate in OpenSolaris development without regard for familiarity with or access to existing Sun tools and processes, and to preserve and strengthen existing technical standards. The discussion intentionally avoids addressing implementation issues; the process is independent of the procedures used to select individuals for specific roles and the tools and infrastructure implementing each process step.
About this Document
We have organized the materials in this document from general to specific. If you are seeking a broad understanding of OpenSolaris and its development processes, you need only read the introductory materials. Step-by-step process overviews follow in subsequent sections, with appendices containing deeper details relevant to particular steps in the process. The specific, detailed steps, and the tools involved, are covered in the OpenSolaris Developer's Guide. In all cases, the processes are described from the implementer's point of view; however, responsibilities of individuals acting in other roles are also described. The glossary contains a list of definitions for commonly used terms. These terms have a wide array of inconsistent and even conflicting uses throughout the software engineering industry and in various Open Source communities. The definitions are intended to provide an anchor for consistency throughout the rest of the document.
The first chapter, "Fundamentals", provides background information, and the rationale for a rigorous and methodical development process. We then describe at a high level the scope and purpose of the OpenSolaris co-development effort, and the relationship between the Solaris Operating System and that effort. Finally, we present a set of high-level technical design and process criteria which have been established in the OpenSolaris code base over a period of years. Design and implementation of software consistent with these principles is the key purpose of OpenSolaris development. Following the background information, we describe the processes involved in managing OpenSolaris code bases. This chapter, "Release Management", includes release naming and numbering and change management rules. Additionally, we describe branch management strategy and release criteria. This discussion is intended to be independent of any particular source code management regime. In the third chapter, "Development Process", we present process flows and narratives from the perspective of an implementer. This process closely mirrors the Software Development Framework (SDF) as currently implemented within the Solaris organization at Sun, modified to incorporate contributions by individuals and teams working without continuous and direct access to one another. It is therefore intended to be appropriate to development dispersed both geographically and organizationally. Additional detail is provided in the areas of testing and quality assurance for contributed implementations. The appendices offer detailed information about each of the steps in the development process, and are referenced liberally throughout the rest of the document. 
Fundamentals
This chapter provides an overview of what OpenSolaris is, its relationship to the Solaris product from Sun Microsystems, Inc., and a description of the design principles to be used by contributors who wish to provide changes back into a release branch of an OpenSolaris consolidation. Finally, it describes a set of principles to be applied when making changes to the development process itself.
The OpenSolaris Project
Note: the following section indicates our original thinking when drafting all these ideas in the summer and fall of 2005. The subsequent section indicates our evolved plans as of October 2008.
Design Principles
The source code which forms the basis of OpenSolaris has been developed for many years using a set of design principles and core values which have been applied earnestly in order to provide users with a rich, coherent and stable platform for developing and running applications. The attributes of this platform and the technologies that make it up are a consequence of these design principles.
Software is inherently something which evolves over time, and improvements are almost always possible. Indeed, there are parts of OpenSolaris today which do not embody these principles to the extent that they should. Nevertheless, these principles are essential to the character of OpenSolaris, and contributors are expected to operate with them in mind when making even the smallest changes.
These principles are not unique to OpenSolaris, of course, but they do serve as a cornerstone of first-quality software design. Software is only as usable as its fit with the rest of the operating system and with the user's expectations of that operating system. For example, new functionality that is delivered rapidly but with no thought given to how it will be maintained is likely to be software that does not evolve and eventually falls into disrepair. Another example is a new feature that is novel in its own right, but which cannot be observed or debugged. Such software presents a larger maintenance burden, not only for its original author but also for every other contributor or user who comes in contact with it at some point.
Reliability
Availability
Serviceability
Security
Performance
Manageability
Compatibility
Maintainability
Platform Neutrality
Process Principles
Contributing to OpenSolaris is more than just understanding the design principles of the project. There are also development processes in place to steer, encourage and ultimately enforce future development to adhere to the stated design principles. These development processes must be understood by contributors to the project.
The development processes themselves were derived with a number of values in mind, including the ability to scale from a project spanning many different layers and subsystems down to a simple fix that addresses a very localized defect. Each of these values or process principles is meant to guide the creation of any new processes, with the overriding goal of the continuing evolution of OpenSolaris, developed as efficiently as possible, but with the highest quality in mind.
Open
Shrink To Fit
Fair
Deliberate Intent
Policies
All OpenSolaris consolidations must conform to the policies below unless a specific exception has been granted by the Governing Board or its designate and documented in the meeting minutes or an approved fast track. These policies each distill the positions of many stakeholders into a minimal requirement for project teams and consolidations in each of the technical areas. It is expected that consolidations, individually or in concert, will reiterate these policies in greater detail for the benefit of contributors. With respect to external code bases, such as open source software generated by third parties, consolidations may choose to relax certain of these policies. For instance, the ON consolidation does not expect integrated open source software to meet the internationalization policy requirements.
Accessibility
Architecture Review Committee (ARC)
Architecture Neutral
End of Feature (EOF)
Internationalization (I18N)/Localization (L10N)
IPv6
Consolidation Bug Policy
Licensing of contributions
Release Management
Release Types
The core OpenSolaris value of compatibility and the processes designed to achieve that goal may seem daunting and a hindrance to innovation. This section is intended to provide a very high level overview of the processes and what can be done within them. Pointers to reference documents will be provided for those interested in the details.
One should first note that Solaris 10 is a very different beast than Solaris 2.0 was. All the innovation which appeared in the interim was accomplished within the same compatibility constraints in place today. Indeed, these compatibility constraints can serve as a facilitator for innovation, rather than a hindrance. This is referred to as "the Freedom of Constraints": If the interfaces upon which your project depends are constrained, you are free to concentrate on developing your project rather than constantly adapting it to a changing environment.
To understand and work within the processes designed to facilitate compatibility, one needs to understand a few basic concepts. It is common throughout the industry to classify product releases as Major, Minor or Micro and reflect this in the "dot" notation we are all familiar with. It is also common throughout the industry that these terms imply little more than a value judgment by someone as to whether a release is small, medium or large. It is just marketing splash. In Solaris, and now OpenSolaris, additional constraints are applied as to which interfaces can change incompatibly in a given type of release. In a Major release, any interface might change incompatibly. In a Minor release, most interfaces will not change incompatibly and in a Micro release even fewer can change incompatibly. Which interfaces are which is made clear by assigning an Interface Taxonomy (reference) level to each interface. More on Interface Taxonomy levels later.
Note that in the second paragraph, the references are "Solaris 2.0" and
"Solaris 10". "Solaris 10" would actually be "Solaris 2.10" if one expects
the common Major.Minor notation. However, since Sun made the decision to
not do another Major release of Solaris for the foreseeable future, the '2'
became silent. Note further, that the output of
This brings us to the topic of interface stability and the Interface Taxonomy. The Interface Taxonomy document is in the process of being modified. The intent is to simplify it, partly in response to the demands of working in an open environment. Although that document is not yet available in its final form, the substantive changes are known and the following overview reflects those changes.
The Interface Taxonomy divides interfaces into two broad categories: Public and Private. Public interfaces are available for anybody to use, while the supported use of Private interfaces is limited to a subset of possible consumers. There are several subclasses of both Public and Private interfaces. For Public interfaces, these subclasses reflect the type of release in which the interface may change incompatibly. For Private interfaces, the subclasses reflect the domain of supported users. It is important to emphasize that Private does not mean secret and, conversely, visible does not imply Public. Indeed, many Private interfaces are easily discovered by simply examining the header files.
The following are the (new) relevant Public taxonomy levels:
The precise terms for the public taxonomy levels are still under discussion, as part of the update to the Interface Taxonomy document. Note that the ability to make an incompatible change in a given release vehicle does not make that a requirement. For example, most interfaces controlled by someone other than the OpenSolaris community are currently classified as Volatile, but synchronization with major incompatibilities introduced by those communities is often deferred until a Minor release is available. Exceptions are made to these rules, but they require an exceptionally good reason. The three generally accepted reasons are:
Other reasons may be accepted on a case-by-case basis. For example, we
broke compatibility in
It should be noted that OpenSolaris (and particularly the core ON/SunOS consolidation) makes little use of the Uncommitted taxonomy level; it is used primarily by other Sun products which work in a less compatibility-constrained market. However, its use is expected to increase for two reasons:
The following are the relevant Private taxonomy levels:
How does this affect the OpenSolaris developer? On the very positive side, you can make informed decisions as to what interfaces you can consume (or even if you are allowed to consume them). This gives you tremendous control over the impact other projects can have on your project. The benefit of this over time cannot be overstated, even though it might seem to be a short-term inconvenience. On the negative side, you may have to jump through a few hoops to maintain compatibility when you need to modify an interface provided by your project.
The classic UNIX example is the
Branch Management
OpenSolaris governing policies presume the existence of multiple concurrent lines of development. A given line of development is associated with some codebase, typically but not always a consolidation. Different consolidations may have related lines of development but need not complete their releases at the same time. These assumptions hold for the existing Solaris development model and many open source development efforts. The terminology used to describe parallel development often depends on the particular source code control model in use; however, underlying work flows tend to be similar. To avoid confusion, see the glossary for definitions of terms.
Before a consolidation makes its current release final (the last official snapshot of that release is delivered), the trunk will split into two release branches: the trunk, on which development will be integrated for the next release, and the current release's branch, on which the current release will be completed and made final. At the time of splitting, any version information in the source base will be modified in the trunk to reflect the next release and left unchanged in the previous release branch. A release team is associated with a versioned release; that is, this step may be seen as the creation of a new trunk with a new release team. The existing release team remains responsible for the previous release branch until it is made final.
All release branches are subject to the quality criteria outlined later in this document. Despite this, the release team does not bless arbitrary snapshots selected at random. While these snapshots can be used by distributions, only the final snapshot within a release is considered the official publication of the consolidation's contents. Often, given human fallibility, there will be some number of defects in any incomplete release branch which should not be included in an official release.
Therefore, when a release is being finalized it is necessary to reduce the rate of change so those defects can be squeezed out. On any branch, in order to manage risk, the release team has the responsibility and the authority to require that projects integrate only when both the project is ready to integrate and the branch is ready to receive the project. This may mean that a project targeting integration late in a release may be deferred until the next release opens. Exactly when to "split" in the development cycle is a matter of balancing conflicting concerns. Splitting near the beginning of the new release increases the period of time during which two Minor release branches are in active development, which divides resources and requires more merging. Splitting near the time when the release is to be completed usually implies that the next release does not open until the previous one is completed, increasing delays in project integration.
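
The split described above can be pictured as a small data model. The following is only a minimal sketch under stated assumptions: the class names `Branch` and `Consolidation`, the `split` and `finalize` methods, and the version strings are all hypothetical illustrations, not part of any OpenSolaris tool or source code management system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Branch:
    """A release branch (hypothetical model, not an OpenSolaris tool)."""
    name: str
    version: str          # version information recorded in the source base
    final: bool = False   # True once the last official snapshot is delivered


@dataclass
class Consolidation:
    """A consolidation with one trunk and zero or more pending release branches."""
    name: str
    trunk: Branch
    release_branches: List[Branch] = field(default_factory=list)

    def split(self, next_version: str) -> Branch:
        """Split the trunk before the current release is made final.

        The pending release branch keeps the current version unchanged;
        the trunk's version is modified to reflect the next release.
        """
        pending = Branch(name=f"{self.name}-{self.trunk.version}",
                         version=self.trunk.version)
        self.release_branches.append(pending)
        self.trunk = Branch(name=f"{self.name}-trunk", version=next_version)
        return pending

    def finalize(self, branch: Branch) -> None:
        """Mark a pending release branch final (its last snapshot is official)."""
        branch.final = True


# Illustrative usage with made-up names and versions.
cons = Consolidation("example", Branch("example-trunk", "1.0"))
pending = cons.split(next_version="1.1")
```

After the split in this sketch, `pending.version` is still "1.0" while `cons.trunk.version` is "1.1", mirroring the rule that version information changes only in the trunk.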
Active branches and equivalent releases. The trunks of all consolidations officially comprising OpenSolaris will normally operate under Minor release binding rules at all times, and under the control of their respective release teams. After the split, the pending release branch continues under Minor release binding rules until the release is completed, but integrations should occur at a reduced rate. New development should be targeted at the next release, and the release team for the pending release may refuse integrations for risk management reasons.
Optionally, branches may be created from a final release and opened to continuing development. If this occurs, the children are update releases. Each update release has its own release team and operates under Micro release binding rules. Although each update release team may establish its own integration criteria, in general all changes targeting an update release must first integrate into the trunk, and must be backported (rather than merged) to the update release branch. Each of these steps requires independent testing. Most update release teams will require some amount of "soak time" in the trunk; the exact amount of soak time required may depend on the type or scope of change. The release team always has the authority to impose any additional requirements it deems appropriate. Note that in the common case most of the materials generated for a change's integration into the trunk can be reused during the process of integrating into an update. A central concern of the update integration process is ensuring that the change is appropriate for the update release.
Release Bindings
The release team defines the appropriate release binding for changes to its branch. This is typically Micro for an update release branch and Minor for all other release branches.
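
As a rough illustration of how a release binding constrains changes, the sketch below combines the typical bindings just stated with a mapping from Public taxonomy level to the smallest release type in which an incompatible change is permitted. The mapping is an assumption: this document names only the Volatile and Uncommitted levels and notes that the precise terms are still under discussion, so the level name "Committed" and the level-to-release assignments are illustrative, as are all function and variable names.

```python
from enum import IntEnum


class Binding(IntEnum):
    """Release types, ordered from most to least constrained."""
    MICRO = 1
    MINOR = 2
    MAJOR = 3


def default_binding(branch_kind: str) -> Binding:
    """Typical binding: Micro for an update release branch, Minor otherwise.

    `branch_kind` is a hypothetical label such as "update" or "trunk";
    a release team may define a different binding for its branch.
    """
    return Binding.MICRO if branch_kind == "update" else Binding.MINOR


# ASSUMPTION: smallest release type in which a Public interface at each
# level may change incompatibly. Names and assignments are illustrative.
MIN_BINDING_FOR_INCOMPATIBLE_CHANGE = {
    "Committed": Binding.MAJOR,
    "Uncommitted": Binding.MINOR,
    "Volatile": Binding.MICRO,
}


def incompatible_change_allowed(level: str, binding: Binding) -> bool:
    """Return True if an incompatible change to an interface at `level`
    is permitted under the given release binding (exceptions aside)."""
    return binding >= MIN_BINDING_FOR_INCOMPATIBLE_CHANGE[level]


def eligible_for_update(in_trunk: bool, soak_days: int, required_days: int) -> bool:
    """A change targeting an update release must first integrate into the
    trunk and, for most update release teams, soak there for some time.
    The day counts are placeholders; each team sets its own criteria."""
    return in_trunk and soak_days >= required_days
```

Being permitted to make an incompatible change does not make it required, and the release team can always impose additional requirements, so a check like this would be at most a first filter rather than an approval.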
A project branch normally operates under the same release binding rules as the release branch to which it is targeted; project teams not targeting integration may apply any release binding they wish to their branch(es). As the definition of release implies, compatibility rules are not in effect within a release, only between a completed release and its completed parent release. These rules do not apply from one snapshot of a release to the next. It is worth noting, however, that some defects are most easily fixed with an incompatible change. If a new interface is integrated into a release branch and then backported to another release branch before a defect of this type is discovered, getting the same incompatible change made to both branches before either one is completed will require substantial coordination.
Quality Criteria
OpenSolaris is expected to encompass a number of open source projects which will operate under the OpenSolaris governance model. It is expected that each project will have some freedom in its respective charter, which defines how it will operate, including in the area of quality assessments and defect management. However, at the time that a project within OpenSolaris wishes to publish its changes via a release branch of a consolidation, including a trunk, the proposed changes must meet certain quality criteria.
Regression Commit Criteria
A regression is an unintended change from correct, documented, or expected behavior. One type of regression occurs when the operating system behaves less in accordance with its documented and stated specification. These regressions may be as trivial as a new spurious error message being printed or as serious as the operating system suffering a catastrophic failure. No change made to a release branch should cause a functional regression of the branch from its current state.
Such regressions cause all projects based on OpenSolaris to spend time tracking down potential issues with their work, which wastes the time of a larger part of the community than just the contributor who introduced the regression. This in turn also slows down progress on future development. Functional regressions can be prevented in most cases through the execution of comprehensive and rigorous test plans prior to the commitment of a proposed change.
Another type of regression is one in the performance of the system as measured through one or more micro and/or macro benchmarks. Such regressions must be avoided if at all possible and can be prevented through the same testing methodology used to prevent functional regressions. As each release branch will have a set of performance criteria associated with it (which will be part of larger quality criteria for the release), a performance regression caused by a proposed change may be allowed by the release team depending on the criteria of the specific branch and the current state of the release with respect to those criteria.
Completeness Commit Criteria
Even if a newly introduced change into a release branch does not cause a regression, it is essential that the change does what it is expected to do. New functionality must meet the key requirements that were set out by the project team. This can be established by executing against a rigorous test plan which has been reviewed to be comprehensive and complete. Projects which are unable to show they have met their key requirements will not be permitted to publish their changes via the release branch of a consolidation.
In addition, whenever any type of change is made to a release branch, a corresponding change to the existing regression test suites may also be required. This is particularly true when new functionality is added so that future changes, most likely not by the original project, can be tested and shown not to regress the quality of OpenSolaris.
Even in the case of fixing existing defects, it is often useful and sometimes will be required to add additional test suites or assertions to existing suites. The key is to provide a robust test suite which is available to all OpenSolaris contributors in order to assure to the greatest extent possible that a proposed change is functionally complete and will not cause a regression.
Summary
The quality requirement of OpenSolaris is perhaps best stated as "Production Ready All The Time". The idea is that at any time, the release branch of a consolidation must be of sufficient quality so that it can be used as-is as the basis of a distribution or product. By using rigorous quality criteria, the community can ensure that OpenSolaris avoids as much as possible the Quality Death Spiral. This occurs when contributors and users alike hear that a branch is broken and, as a result, avoid incorporating the most recent changes into their own project area or even stop using the branch itself. This causes less real-world testing to take place, additional bugs are not discovered, and the quality of the branch declines even further. When the Quality Death Spiral occurs, it can be difficult to reverse and the time it takes to recover can be very long.
Development Process
The development process flows are too large to fit on a single flow chart, so we split them up into four phases: Idea, Design, Implementation and Integration. As mentioned earlier, one process development criterion for OpenSolaris has been "shrink to fit", meaning that various steps in the process are only applied as needed and skipped if not needed. Note also that developers are allowed to "skip ahead" and do steps out of order, but that would significantly increase the risk of rework. Finally, nothing can replace good judgment in deciding what is and is not needed.
Notes on the colors on the charts below, which indicate scale:
Idea
First, someone has an idea for an enhancement or has a gripe about a defect. The first thing to be done is to see if a bug/RFE has been filed, and to file one if not. The next thing is to announce it somewhere to precipitate discussion, which should help determine the complexity of the proposed change(s), gauge community interest, and identify potential team members. This is where developers find out if they have the support they will need to move forward. If the set of changes is sufficiently large, a team may need to be formed, a project may need to be defined, and design reviewers may need to be identified. Once these are all done to satisfaction, things progress to the Design phase.
Design
The Design phase has several things going on in parallel. The first is deciding whether or not a formal design review is even needed; the general answer is No for small RFEs and bug fixes and Yes for medium and large RFEs and projects, but as with much of the process, this is a judgment call. If a formal review is needed, reviewers will need to be identified, a design to review created, etc. If needed, architectural review should occur as well. If there are dependencies, the appropriate stakeholders must be convinced that the proposed change is appropriate. A test plan, however minimal or complex, must also be created and approved. And if needed, a schedule detailing resources etc. should be produced.
Implementation
The Implementation phase also has several things going on in parallel. The first and often foremost among these is the writing of the actual code, along with making sure that code adheres to applicable policies, and passes various unit and pre-integration tests. Along with this are writing the documentation and writing the test suites. And in preparation for the Integration phase, code reviewers should be identified.
Integration
The Integration phase is to make sure everything that was supposed to be done has in fact been done, which means a lot of review, including code, documentation and completeness. Note that the "Review for Completeness" step is conducted by the Final Approvers; in the Solaris model, these are the C-team and/or the CRT. Who exactly they are in the OpenSolaris model is not specified, although they are expected to act in an equivalent role for their consolidation. Once all reviews have been completed, and permission to integrate has been granted, integration can occur. Finally, the change needs to be communicated as needed: heads-up and/or flag-day messages to appropriate communities, and possibly a transfer of information to a support organization.
Glossary
Application Binary Interface (ABI)
Application Programming Interface (API)
Branch
C-Team
Change Review Team (CRT)
Command Line Interface (CLI)
Community
Consolidation
Contributor (a person)
Distribution
Documentation
Fast Track
Flag day
Interface
Interface Taxonomy
Member (a person)
Parent/Child Branches
Product
Project
Project Branch
Project Team
Proto Area
Release Branch
Release
Stable
Trunk
User (a person)
Appendix A. Interface Taxonomy
(Incorporate PSARC/2005/220 once complete and opinion available.)