A Framework for Software Product Line Practice, Version 5.0
Software System Integration
Software system integration refers to the practice of combining individually tested software components into an integrated whole. Software is integrated when components are combined into subsystems or when subsystems are combined into products. Components may be integrated only after all of them have been implemented and tested, as in a waterfall model or a "big bang" approach. In either case, software system integration appears as a discrete step toward the end of the development life cycle, between component development and integration testing. Continuous integration is a much less risky approach in which components and subsystems are integrated as they are developed into multiple working mini-versions of the system. Object technologists were early proponents of incremental development, and object-oriented development methods, such as the Unified Process [Alhir 2002a], are based on the principle of ongoing integration.
Continuous integration offers several advantages over a waterfall, "big bang" approach. Developers split a product into several builds, or partial products, that can be integrated individually. These builds may be chunked into "vertical" increments, covering subsystems, or increments may cross subsystem boundaries to produce a partial end-to-end product. In both cases, integration is iterative in that a sequence of incremental builds yields results successively closer to the desired product. Customers gain an advantage in seeing a working product early in the development, and developers can verify performance or other quality factors in a working environment, rather than relying on models or simulations. Continuous integration also decreases integration risk, because developers can identify integration problems, often the most complex to address, during early iterations of software integration.
Integration is bound up in the concept of component interfaces. Recall from the "Architecture Definition" practice area that an interface between two components is the set of assumptions that the programmers of each component can safely make about the other component [Parnas 1972a]. These assumptions include the component's behavior, the resources it consumes, how it acts in the face of an error, and other assumptions that should be documented as part of a component's interface. This definition is in stark contrast to the simplistic (and quite insufficient) notion of interface that merely refers to the "signature" or syntactic interface, which includes only the component's name and parameter types. The syntactic notion may let two components compile together successfully, but only the Parnas definition (which subsumes the simpler one) defines how two components work together correctly. When interfaces are defined thoughtfully and documented carefully, integration proceeds much more smoothly, because the interfaces define how the components will connect to and work with each other.
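The gap between a syntactic interface and an interface in the Parnas sense can be illustrated with a minimal sketch. All names here (lookup_strict, lookup_lenient, total) are hypothetical and chosen only for illustration. Both lookup functions have the same signature, yet they differ in how they act in the face of an error, so a caller that integrates cleanly with one fails with the other.

```python
def lookup_strict(table: dict, key: str) -> int:
    """Semantic interface: raises KeyError when the key is missing."""
    return table[key]


def lookup_lenient(table: dict, key: str) -> int:
    """Semantic interface: returns 0 when the key is missing."""
    return table.get(key, 0)


def total(lookup, table: dict, keys: list) -> int:
    # The caller assumes that a missing key counts as zero. That is an
    # assumption about error behavior, which the signature does not record.
    return sum(lookup(table, key) for key in keys)


if __name__ == "__main__":
    inventory = {"a": 1}
    print(total(lookup_lenient, inventory, ["a", "b"]))  # works as the caller assumes
    try:
        total(lookup_strict, inventory, ["a", "b"])
    except KeyError as missing:
        print("integration error, missing key:", missing)
```

Either function satisfies a signature check; only documentation of the error behavior tells the integrator which one the caller can safely use.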
Aspects Peculiar to Product Lines
In a product line effort, we identify two stages of software system integration:
Core assets are integrated into the core asset base as part of core asset development, including review or test: This merging may take the form of "pre-integration" and cover more than software components. As assets are developed and reviewed or tested, they migrate to the asset base. Under pre-integration (a continuous integration approach), the assets may be used to support other asset development or partial product development. The production strategy (the overall approach for realizing core assets) guides this activity.
Core assets are integrated during the building of an individual product: Here, the production plan guides the process, defining how developers will select the appropriate core assets from all the available ones and construct a product. All assets such as requirements, architecture, processes, and testing assets contribute to the construction and must be integrated into a delivered product. Integration may involve tailoring assets according to planned variations or developing product-unique software components, requirements, tests, and so forth.
For the core asset base, pre-integrating as many of the software core assets as you can makes product building a much more economical operation [Clements 2002c, p. 118]. This pre-integration can yield a "virtual" or test product that mirrors an actual, deliverable product. In both core asset base and product integration, you need to consider integration early on in the development of the production plan and architecture for the entire product line. The goal is to make software system integration straightforward and predictable.
In a product line, the effort involved in software system integration lies along a spectrum. At one end, the effort is almost zero. If you know all the products' potential variations in advance, you can produce an integrated parameterized template of a generic system with formal parameters. You can then generate final products by supplying the actual parameters specific to the individual product requirements and launching the construction tool (along the lines of the UNIX Make utility). In this case, each product consists entirely of core components; no product-specific code exists. This is the "system generation" end of the integration spectrum.
At the other end of the spectrum, considerable coding may be involved to bring the right core components together into a cohesive whole. Perhaps the components need to be wrapped, or perhaps new components need to be designed and implemented especially for the product. In these situations, the integration more closely resembles that of a single-system project.
Most software product lines occupy a middle point on the spectrum. Obviously, the closer to the generation side of the spectrum you can align your production approach, the easier integration will be and the more products you will be able to turn out in a short period of time.
However, circumstances may prevent you from achieving pure generation. Perhaps a new product has features you didn't consider when developing the asset base, your application area prevents you from knowing all the variations up front, or the variations are so numerous or complex or interact with each other in such complicated ways that building the construction tool will be too expensive. Or perhaps you prefer to produce fewer products over a long time period rather than turn out many products in a short amount of time. In that case, the construction tool may be less appealing.
In software system integration for product lines, the cost of integration is amortized over many products. Once the product line scope, core assets, and production plan have been established in the core asset base, and a few systems have been produced from that base, most of the work to support software system integration for the product line is complete. The interfaces have been defined, and they work predictably. They have been tested. Components work with one another. In subsequent variations and adaptations of the product, there is relatively little software system integration effort when the variations and adaptations occur within components. Even when new components are being added with new interfaces, the models from previous interfaces can and should be followed, thus minimizing the work and the risk of integration. So, in a very real sense, products (after the first one or two) tend to be "pre-integrated" such that there are few surprises when a system comes together.
Application to Core Asset Development
When core assets are developed, acquired, or mined, remember to take integration into account during planning and budgeting. Evaluate any components you buy, mine, or commission for their integrability and granularity. A component is "integrable" if its interfaces (in the Parnas sense) are well defined and well documented. Such a component may be integrated with other components directly through application programming interfaces (APIs) or potentially through wrapping. Finally, remember that it is generally easier to build a system from small numbers of large, pre-integrated pieces than from large numbers of small, unintegrated components.
Application to Product Development
Software system integration is, or should be, the primary activity in product development within a product line. The core asset base consists of a relatively small number of large-grained assets covering requirements, architecture, components, test plans, test cases, and so forth, along with their respective attached processes. The core assets are engineered to work together in accordance with the product line architecture but still require tailoring and integration to build a product. The attached process guides tailoring and integration at the core asset level. The production plan provides guidance for developing a whole product.
A big benefit of product line practice is that integration during product development becomes a very predictable activity. In the product line, integration is based on the specific tailoring and integration guidance defined by the production plan. This generic production plan guides product developers in the specific steps they must take to tailor the full range of core assets needed in the production of their individual product. Included in this guidance is how to tailor the generic production plan itself for the individual product. Another benefit of product line practice is that software system integration costs tend to decrease for each of the subsequent products in the product line. If the production plan for a specific product calls for the addition of components or internal changes in components, some integration may be required depending on the nature of the changes. These changes are known up front and can be planned along with core asset integration. Finally, in the system generation case, integration becomes a matter of providing values for the parameters and launching the construction tool. The key in all these cases is that the integration occurs according to a preordained and tested scheme.
Interface languages: Languages such as the Interface Description Language (IDL), Object Constraint Language (OCL), and Web Services Description Language (WSDL) allow you to define interfaces in a way that can be automatically checked for consistency and completeness. Programming languages such as Java allow you to define a compilable specification separate from the body. Java programmers have found that keeping a continuously integrated system using full specifications and stubbed bodies decreases the integration time and costs dramatically. These languages and others do not allow the specification of the full semantic interfaces of components, but the integration bugs they allow you to catch early make using them pay off.
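The idea of a full specification with a stubbed body can be sketched in a few lines. The sketch below uses Python's abc module rather than any of the languages named above, and the names Tracker, StubTracker, and display_line are hypothetical. The point is only that client code can be integrated and exercised against the specification before the real body exists.

```python
from abc import ABC, abstractmethod


class Tracker(ABC):
    """Hypothetical component specification; clients depend only on this."""

    @abstractmethod
    def position(self, target_id: int) -> tuple:
        """Return (x, y) for the target; raise KeyError if it is unknown."""


class StubTracker(Tracker):
    """Stubbed body: just enough behavior to keep the system integrated."""

    def position(self, target_id: int) -> tuple:
        if target_id < 0:
            raise KeyError(target_id)
        return (0.0, 0.0)


def display_line(tracker: Tracker, target_id: int) -> str:
    # Client code written against the specification, not against a body.
    x, y = tracker.position(target_id)
    return f"target {target_id} at ({x:.1f}, {y:.1f})"


if __name__ == "__main__":
    print(display_line(StubTracker(), 7))
```

When the real tracker body is ready, it replaces StubTracker without any change to display_line, provided it honors the same documented behavior.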
Wrapping: Wrapping, described as a specific practice in the "Mining Existing Assets" practice area, involves writing a small piece of software to mediate between the interface that a component user expects and the interface that the used component comes with. Wrapping is a technique for integrating components whose interfaces you do not control, such as components that you mined or acquired from a third party [Seacord 2001a].
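A minimal sketch of a wrapper follows, assuming a made-up mined component (LegacyThermometer) that reports Fahrenheit while the component user expects Celsius. Both class names and the fixed reading are hypothetical.

```python
class LegacyThermometer:
    """Hypothetical mined component whose interface we do not control."""

    def read_fahrenheit(self) -> float:
        return 212.0  # fixed reading, for illustration only


class CelsiusSensorWrapper:
    """Wrapper: mediates between the expected and the provided interface."""

    def __init__(self, wrapped: LegacyThermometer) -> None:
        self._wrapped = wrapped

    def read_celsius(self) -> float:
        # Translate the call and convert units; the wrapped component
        # itself is left untouched.
        return (self._wrapped.read_fahrenheit() - 32.0) * 5.0 / 9.0


if __name__ == "__main__":
    sensor = CelsiusSensorWrapper(LegacyThermometer())
    print(sensor.read_celsius())
```

The wrapper stays small because it only translates; any behavior the user relies on beyond the translation still has to be documented as part of the wrapped component's interface.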
Middleware: An especially integrable kind of architecture employs a specific class of software products to be the intermediaries between user interfaces on the one hand and the data generators and repositories on the other. Such software is called middleware and is used in connection with Distributed Object Technology (DOT) [Wallnau 1997a]. There are several prominent examples of middleware standards and technology, such as .NET, Microsoft's middleware to support distributed Web applications. Another collection of proprietary middleware solutions includes those that have grown around the Java programming language, such as Java 2 Enterprise Edition (J2EE). Web services and service-oriented architectures make up another distributed object computing environment that product line integration must deal with. Middleware is discussed in more detail in the "Architecture Definition" practice area.
System generation: In cases in which all (or most) of the product line variability is known in advance, a new product in a product line can be produced with no software system integration at all. In these cases, it may be possible to have a template system from which a computer program produces the new products in the product line simply by specifying variabilities as actual parameters. Such a program is called a "system generator." One example of such a family of products would be an operating system in which all the variabilities of the system are known ahead of time. Then, to generate the operating system, the "sysgen" program is simply provided with a list of system parameters (such as processor, disk, peripheral types, and their performance characteristics), and the program produces a tailored operating system rather than integrating all the components of an operating system.
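As a rough sketch of the template-plus-parameters idea, the following uses Python's string.Template as a stand-in for a real system generator. The template contents and the parameter names (processor, disk, peripherals) are assumptions made for illustration and do not describe any actual "sysgen" program.

```python
from string import Template

# Hypothetical generic system description with formal parameters.
GENERIC_SYSTEM = Template(
    "processor=$processor\n"
    "disk=$disk\n"
    "peripherals=$peripherals\n"
)


def generate_product(actual_parameters: dict) -> str:
    # substitute() raises KeyError if a formal parameter is left unbound,
    # so an incomplete product description is rejected before anything
    # is generated.
    return GENERIC_SYSTEM.substitute(actual_parameters)


if __name__ == "__main__":
    print(generate_product(
        {"processor": "cpu-a", "disk": "ssd", "peripherals": "printer"}
    ))
```

No product-specific code appears anywhere: a new product is only a new set of actual parameters.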
FAST generators: Weiss and Lai describe a process for building families of systems using generator technology [Weiss 1999a]. The Family-Oriented Abstraction, Specification, and Translation (FAST) process begins by explicitly identifying specific commonalities and variabilities among potential family members and then designing a small special-purpose language to express both. The language is used as the basis for building a generator. Turning out a new family member (product) is then simply a matter of describing the product in the language and "compiling" that description to produce the product.
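The describe-then-compile step can be sketched with a toy language of "name = value" lines. This is not the FAST process itself, which is considerably more elaborate; the family, its commonalities and variabilities, and the function compile_description are all hypothetical.

```python
# Hypothetical commonalities and variabilities for a made-up product family.
COMMONALITIES = {"logging": "on"}  # shared by every family member
VARIABILITIES = {
    "sensor": {"radar", "sonar"},
    "display": {"text", "graphic"},
}


def compile_description(description: str) -> dict:
    """Translate a product description into a product configuration."""
    product = dict(COMMONALITIES)
    for line in description.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        name, _, value = (part.strip() for part in line.partition("="))
        if value not in VARIABILITIES.get(name, set()):
            raise ValueError(f"invalid setting: {line.strip()!r}")
        product[name] = value
    missing = sorted(set(VARIABILITIES) - set(product))
    if missing:
        raise ValueError(f"undecided variabilities: {missing}")
    return product


if __name__ == "__main__":
    print(compile_description("sensor = radar\ndisplay = text"))
```

Because the language can express only the planned variabilities, a description that compiles is by construction a member of the family.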
The major risks associated with software system integration include
natural-language interface documentation: Relying too heavily on natural language for system interface documentation and not relying heavily enough on the automated checking of system interfaces will lead to integration errors. Natural-language interfaces are imprecise, incomplete, and error prone. As in single-system development, undetected interface errors increase the overall cost of integration. Errors in core assets that remain undetected until integration time also lead to significant repair costs, especially since an asset may be used in multiple systems. Automated tools, however, are more oriented to syntactic checking and less effective at checking race conditions, semantic mismatch, fidelity mismatch, and so forth. Some interface specifications must still be done largely with natural language and are still error prone.
component granularity: There is a risk in trying to integrate components that are too small. The cost of integration is directly proportional to the number and size of the interfaces. If the components are small, the number of interfaces increases proportionally, if not geometrically, depending on the connections they have to each other. This leads to greatly increased testing time. One of the lessons of the CelsiusTech case study was that "CelsiusTech found it economically infeasible to integrate large systems at the Ada-unit level" [Brownsword 1996a]. Although the component granularity is dictated by the architecture, we capture the risk here, because this is where the consequence will make itself known.
variation support: There is a risk in trying to make variations and adaptations that are too large or too different from existing components. When new components or subsystems are added, they must be integrated. Variations and adaptations within components are relatively inexpensive as far as system integration is concerned, but new components may cause architectural changes that structure the product in ways that cause integration problems.
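The component granularity risk above depends on how the number of interfaces grows as components get smaller. The small calculation below compares two assumed connection patterns, a simple chain and a fully connected design; real architectures fall somewhere in between, and the function names are hypothetical.

```python
def interfaces_in_chain(components: int) -> int:
    """Each component connects only to the next one: growth is linear."""
    return max(components - 1, 0)


def interfaces_fully_connected(components: int) -> int:
    """Every pair of components shares an interface: growth is quadratic."""
    return components * (components - 1) // 2


if __name__ == "__main__":
    # Assumed scenario: 10 large components are each split into 4 smaller ones.
    for count in (10, 40):
        print(count, interfaces_in_chain(count), interfaces_fully_connected(count))
```

Under these assumptions, splitting 10 components into 40 raises the interface count from 9 to 39 in the chain case and from 45 to 780 in the fully connected case, which is why testing time climbs so quickly at fine granularity.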
Alhir provides an overview of the Unified Process and its relationship to UML.
Wallnau, Weiderman, and Northrop provide a nicely digestible overview of middleware.
Weiss and Lai describe the FAST process, which includes a generator-building step that essentially obviates the integration phase of product development.