NEWS AT SEI
This article was originally published in News at SEI on: September 1, 2000
In my Spring 2000 COTS Spot column I explained the role of technology competence in the design of COTS-based systems, and described how to obtain this competence quickly by building toys and model solutions to model problems. In this article I will take the next step and describe how to put model problems into action by using the 3-Rs of design risk reduction in component-based systems: Realize model solutions, Reflect on their utility and risk, and Repair the risks. I will then relate the 3-Rs (denoted typographically as R3) to increasingly popular iterative software development processes.
The R3 Process
The characteristics of COTS software components—what those components do and how they do it—can and do influence the design activity. This influence might be felt as early as in the conception phase of a software project. For example, it might be known that a component provides a capability that would be difficult or costly to implement as a custom solution. In this situation the component capability could very well wind up as a requirement for the system. In other words, the component capability would effectively contribute to defining the scope of a system, and the act of defining this scope would effectively lead to a de facto component selection decision. The same is often true later in the design activity, and quite often (I would go so far as to say usually) this involves not just the characteristics of single components but ensembles of components acting in concert to provide some service.
It is irrelevant whether we view the component as having scoped the system or the system as having defined requirements that led to the selection of a component: in either case competency about component capabilities is essential. Where there are gaps in our competence, and competence gaps are inevitable as the number of commercial components used in a system increases, there is risk. As I discussed in the May COTS Spot, toys and model problems can be used to generate this competence on the fly. But where does the idea for a particular model problem come from? How do we know which model problems to solve and, having solved them, what to do with the solutions? Answering these questions is what the R3 process is all about. An outline of this process is depicted in Figure 1, which includes (naturally) three key steps:
- Realize a model solution. R3 begins with assumptions about system needs and a component ensemble that is believed to satisfy those needs. The designers sketch the workings of an ensemble, perhaps using component-interaction blackboards. Inevitably, questions will arise about how an ensemble works, or the manner in which the ensemble satisfies a need. If the need to be satisfied is critical, then it is essential to increase the level of understanding of the ensemble. The unknown property will itself suggest what kind of toy to build; previous design commitments (for example, component selections) will constrain how the toy is built, and the needs will define the evaluation criteria.
- Reflect on the qualities of the model solution. Model solutions are implementations that must be evaluated against criteria. Did the solution satisfy the criteria? Were additional evaluation criteria discovered that must be considered and, if so, how did the model solution stack up to these new criteria? (The discovery of new criteria usually heralds some sort of failure.) Answering these questions may involve benchmarking, snooping, or other invasive "black box visibility" techniques. In any event, one of two possibilities arises from this reflection: the model solution passes muster, in which case it becomes part of the design baseline, or it fails in some way to satisfy the evaluation criteria. Failure is not necessarily fatal to an ensemble’s prospects.
- Repair the ensemble. It pays to be an optimist—or at least to be doggedly persistent—when developing COTS-based systems. Ensembles can be repaired by introducing new components, by using alternative components or component versions, by developing wrappers, or by any number of other strategies. Indeed, there are often several possible repair strategies for each deficiency detected. This has led us to develop evaluation techniques such as risk/misfit (the topic of a future column) to help structure component selection decisions that are dominated (or complicated) by the presence of multiple repair options. In any case, non-trivial repairs are hypotheses that must be tested, triggering a new iteration of R3.
Figure 1: The R3 Process for Design Risk Reduction
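The three steps can be read as a loop. What follows is a minimal Python sketch of that loop, not tooling from the project described in this column. Every name in it (`Ensemble`, `R3Result`, `run_r3`, the criterion and repair callables) is hypothetical. In practice, realizing and evaluating a model solution is hands-on engineering work; here each evaluation criterion is reduced to a function that reports whether the ensemble satisfies it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Ensemble:
    """A candidate set of components (hypothetical representation)."""
    components: tuple


@dataclass
class R3Result:
    accepted: bool      # True if the ensemble joins the design baseline
    ensemble: Ensemble  # the last ensemble examined
    iterations: int     # number of Realize/Reflect passes performed
    unmet: List[str] = field(default_factory=list)  # criteria still failing


# A criterion stands in for building a model solution and judging the outcome.
Criterion = Callable[[Ensemble], bool]
# A repair returns a modified ensemble, or None when no repair is available.
Repair = Callable[[Ensemble, List[str]], Optional[Ensemble]]


def run_r3(ensemble: Ensemble,
           criteria: Dict[str, Criterion],
           repair: Repair,
           max_iterations: int = 5) -> R3Result:
    """Iterate Realize -> Reflect -> Repair until the ensemble passes or we give up."""
    unmet: List[str] = []
    for iteration in range(1, max_iterations + 1):
        # Realize and Reflect: evaluate the model solution against every criterion.
        unmet = [name for name, check in criteria.items() if not check(ensemble)]
        if not unmet:
            return R3Result(True, ensemble, iteration)
        # Repair: a repaired ensemble is only a hypothesis, so it is re-tested
        # on the next pass through the loop.
        repaired = repair(ensemble, unmet)
        if repaired is None:
            return R3Result(False, ensemble, iteration, unmet)
        ensemble = repaired
    return R3Result(False, ensemble, max_iterations, unmet)


if __name__ == "__main__":
    # Illustrative criterion and repair only; the component names are made up.
    criteria = {"has_wrapper": lambda e: "wrapper" in e.components}

    def add_wrapper(e: Ensemble, unmet: List[str]) -> Optional[Ensemble]:
        if "wrapper" in e.components:
            return None
        return Ensemble(e.components + ("wrapper",))

    print(run_r3(Ensemble(("browser", "orb")), criteria, add_wrapper))
```

A result with `accepted=False` and a non-empty `unmet` list corresponds to the case discussed next: the ensemble cannot be repaired as it stands, so the remaining option is to revisit the needs that gave rise to it.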
What happens if, despite all optimism and doggedness, an ensemble simply will not pass muster? In this case all is still not lost for the ensemble, but salvaging the situation may require a different repair strategy—one which involves altering the requirements that gave rise to the ensemble, rather than changing the ensemble itself. But before we tackle this issue (which will involve us with the theory of iterative development), a practical illustration of the R3 process from our own case book will be useful.
R3 in Action
Once upon a time my colleagues (Scott Hissam and Robert Seacord) and I were helping a Defense Department program migrate a large, custom legacy system to an open, COTS-based system. The program had made an early commitment to a Web-based solution, and was considering whether to repackage legacy services as CORBA-based services and provide thin-client access to these services via Java applets running in Web browsers. Even three years ago this ensemble was known to be feasible (Web browser and server, CORBA servers, Java applets). The real question, however, was whether this ensemble could, in a practical way, be made to work securely.
To support identification, authentication, authorization, and confidentiality, the ensemble illustrated in Figure 2 was proposed. (You can guess the age of the example from the component version numbers—they were current releases at the time.) The Web server would identify and authenticate (I&A) users accessing the system from their Web browsers. An authenticated user would receive a CORBA interoperable object reference (IOR) from the Web server. This CORBA object would contain the user’s permissions on the system for a particular session. Each operation performed by the user would be tagged with this session IOR, allowing back-end services to check a user’s authorization to invoke a service. Using CORBA interceptors would allow us to do this session-tagging transparently to applet developers. Last, all interactions between the requesting applet and back-end services would be transmitted using the CORBA IIOP protocol over a secure socket layer (SSL) connection, thus guaranteeing confidentiality.
Figure 2: A Secure Web Ensemble
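The session-tagging idea in this ensemble can be illustrated without any CORBA machinery. The sketch below is plain Python and does not use the VisiBroker or CORBA interceptor APIs. `SessionRegistry`, `make_tagging_interceptor`, `BackEndService`, and the opaque session reference string are all hypothetical stand-ins for the session IOR, the interceptor, and a back-end service. It shows only the flow of control: a session reference is issued after authentication, attached to each request transparently, and checked by the service before the operation runs.

```python
import secrets
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class Request:
    operation: str
    session_ref: str = ""  # stand-in for the session IOR; empty means untagged


class SessionRegistry:
    """Hypothetical stand-in for the per-session objects issued after I&A."""

    def __init__(self) -> None:
        self._permissions: Dict[str, FrozenSet[str]] = {}

    def open_session(self, permissions: Iterable[str]) -> str:
        """Record a session's permissions and return an opaque reference to it."""
        ref = secrets.token_hex(8)
        self._permissions[ref] = frozenset(permissions)
        return ref

    def is_authorized(self, ref: str, operation: str) -> bool:
        # Unknown or missing references carry no permissions.
        return operation in self._permissions.get(ref, frozenset())


def make_tagging_interceptor(session_ref: str) -> Callable[[Request], Request]:
    """Return a function that tags outgoing requests with the session reference.

    Applet code builds plain requests; the tagging happens here, out of its sight.
    """
    def intercept(request: Request) -> Request:
        return Request(request.operation, session_ref)
    return intercept


class BackEndService:
    """Hypothetical back-end service that checks authorization on every call."""

    def __init__(self, registry: SessionRegistry) -> None:
        self._registry = registry

    def invoke(self, request: Request) -> str:
        if not self._registry.is_authorized(request.session_ref, request.operation):
            raise PermissionError(f"session not authorized for {request.operation!r}")
        return f"{request.operation} ok"


if __name__ == "__main__":
    registry = SessionRegistry()
    tag = make_tagging_interceptor(registry.open_session({"read"}))
    service = BackEndService(registry)
    print(service.invoke(tag(Request("read"))))
```

Confidentiality is deliberately absent from this sketch. In the proposed ensemble it depended on running IIOP over SSL, which is a property of the transport rather than of the tagging logic.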
Recall that model problems are toys with design context and evaluation criteria. In this ensemble the design context included all of the concrete component versions in the proposed ensemble. The evaluation criteria were simple. First, it had to work. That is, the sequence of steps numbered 1-5 should occur once when an applet is downloaded to the browser, and steps 6-8 should occur whenever the applet requests a service. Second, we wanted the security services for authorization and confidentiality (all of the interactions among the shaded components) to use the same digital certificate information used by Netscape to do its identification and authentication in steps 1-2. This was important because we did not want to have duplicate security infrastructures, and we did not want to add any client-side administrative burdens beyond those already required to use a browser with digital certificates.
What was the result? Problems arose when we tried to get the applet to establish an SSL connection for secure IIOP between the applet and storage manager. VisiBroker required the applet to supply a private key to configure the client side of the SSL connection. VisiBroker would accept the private key from the public/private key pair associated with the user’s digital certificate (the same certificate used to authenticate the user in steps 1 and 2 of Figure 2), but there was no way to programmatically retrieve this key from the Netscape browser. We suspected that Netscape feared that exposing the key would put it in violation of export control laws concerning key management technology. Regardless of the reason, the end result was that the ensemble provided all of the required security attributes except confidentiality.
Both of these repair options became design contingencies. Both of these contingencies required further investigation including, ultimately, the development of model problems to validate their feasibility. Readers interested in discovering the results of these investigations might wish to read "Into the Black Box: A Case Study in Obtaining Visibility into Commercial Software" and "COTS in the Real World: A Case Study in Risk Discovery and Repair." But the real point of the illustration is not to provide the specific technical details of the model problems (most of which have been suppressed), but rather to reinforce the message that design risk reduction in COTS-based systems involves very detailed, implementation-level investigations. Any area of uncertainty in how a component or ensemble works to satisfy a key need is an area of risk that must be resolved as quickly as possible.
In this illustration we were able to expose the design risk and find a workaround based on our discovery about how Netscape manages its public and private keys. As I mentioned already, we had a fallback in case the ensemble failed. Sometimes, however, a fallback is not available, especially if, as is frequently the case, system requirements were influenced by the perceived capabilities of an ensemble. What happens if these perceived capabilities turn out to fall short of the mark? This brings us to the topic of iterative development, and "wheels within wheels."
R3 and Iterative Development
Sometimes an ensemble will not work, no matter how hard you try to repair it, or just as likely, the repairs prove to be too expensive to justify. You can see from Figure 1 that the R3 process yields two results: a design (or, rather, the ensemble’s portion of a design), and the properties of that ensemble. If we assume that an ensemble cannot be repaired and there are no fallback positions, then there is only one thing left to do: reassess the original needs. In some cases, Mohammed must go to the mountain. But this involves us in another level of process iteration beyond that of R3. Thus, we assume to a large extent that R3 works within an iterative development process, and that it is, in effect, a small wheel within a larger wheel.
This should not be surprising, and I suspect that anyone who has read this far is already familiar with, if not comfortably experienced with, iterative software development processes. But be on guard: COTS software requires a form of process iteration that is unique to COTS, and is not to be found in the development of custom systems. To appreciate this dire warning we examine the currently popular Rational Unified Process (RUP), an iterative software development process championed by Rational Software. RUP is depicted in Figure 3. Please be aware that the discussion that follows is not a critique or criticism of RUP, but rather an elaboration of how a naïve interpretation of RUP can cause trouble in systems that are COTS-software intensive.
Figure 3: The Rational Unified Process
One of the key ideas in RUP is that an iterative development process is partitioned into four discrete phases: inception, elaboration, construction, and transition. This partitioning is quite useful because it answers one difficult question posed by iterative development: where are we going? Because each phase has its own milestones, it is possible to focus each iteration in a constructive way. So, for example, the purpose of the inception phase is to produce the life-cycle objectives milestone, which describes the scope of a system, its business case, and how the most serious risks were mitigated (among other things). The elaboration phase ends with a description of the system architecture and mitigation of second-tier risks.
Note, however, that in RUP the inception phase terminates with a definition of system scope, while the elaboration phase terminates with a definition of system architecture. But we know from our own experience that the selection of COTS software components and their assembly into ensembles—key architecture-defining activities in COTS-based systems—can greatly influence how we define the scope of a system. Indeed, in the R3 illustration we showed how an ensemble failure might require an adjustment of system requirements (i.e., system scope) in order to effect a repair.
When I mentioned that COTS requires a special form of iteration, this is what I was referring to. Put another way, decisions about system scope and component and ensemble selection are co-dependent: a designer must often make tentative decisions about system scope in light of what is thought to be known about COTS component capabilities, and then revisit those scope decisions as more is learned about the components. Conversely, component and ensemble selection decisions may be informed by system requirements, but those selection decisions may be revisited as more is learned about the requirements.
On the surface this co-dependence of component and ensemble selection decisions and system scope decisions seems to be a problem, because RUP allows no iteration among life-cycle phases—that is, elaboration begins with a defined system scope. However, the folks at Rational Software are smart, and the ideas underlying RUP are sufficiently sound and flexible to accommodate the development of systems that are COTS-software intensive. There are at least two ways to do this:
- You will observe that in Figure 3 the inception phase includes some design and implementation effort. This effort refers to risk-reduction prototyping as needed to scope the system and identify and mitigate the most serious risks. Given this, it might be useful to concentrate prototyping efforts in the inception phase to support component and ensemble selection decisions as a necessary adjunct to defining life-cycle objectives.
- Requirements and analysis work continues (and in fact accelerates) in the elaboration phase, even after the basic scope of the system has been defined. Given this, it may be possible to consider the life-cycle objectives as a flexible rather than rigid milestone. This would allow the discovery of COTS component capabilities and liabilities to change the scope of a system, even if the life-cycle objectives milestone has been satisfied.
The first option is fine for projects that do not use state-of-the-art COTS components. Projects that use cutting-edge components must make a conscious tradeoff between advanced capabilities and component stability. In these situations the designer may be required to manage multiple design contingencies (system scope and architecture) far into the development process. The second option is fine if the stakeholders are willing to be flexible about their requirements and can tolerate some degree of uncertainty about two very fundamental decisions: what the system will do and which COTS components will be used to do it.
The best solution may be to combine elements of both options. That is, do as much risk reduction as possible during system inception, where more stable COTS components are used, and then work with the stakeholders to manage their expectations about system capabilities that depend upon more cutting-edge but unstable components during system elaboration. In either case the iterative R3 process for obtaining component competence and proving the feasibility of component ensembles will be useful in identifying and resolving risks related to the use of COTS software.
I’ve mentioned several times that the designer may need to manage multiple design contingencies—specifically, contingencies that reflect ensemble repair options. This presents the designer with a variety of new challenges. What is the design of a system at any point in time if there are multiple contingencies being explored? How much should a project invest in the "just-in-time competency" attending the exploration of any particular contingency? And, at what point should this investment be terminated and a contingency foreclosed? I will take up these issues of "contingency management" in the next issue of The COTS Spot.
About the Author
Kurt Wallnau is a senior member of the technical staff in the Dynamic Systems Program at the SEI, where he is co-lead of the COTS-Based Systems Initiative. Before that he was a project member in the SEI Computer-Aided Software Engineering (CASE) integration project. Prior to coming to the SEI, Wallnau was the Lockheed Martin system architect for the Air Force CARDS program, a project focused on "reusing" COTS software and standard architectures.
The views expressed in this article are the author's only and do not represent directly or imply any official position or view of the Software Engineering Institute or Carnegie Mellon University. This article is intended to stimulate further discussion about this topic.