NEWS AT SEI
This article was originally published in News at SEI on: March 1, 2003
Recently, I attended a tutorial on software evolution by Emeritus Professor Manny Lehman from Middlesex University. While the overall tutorial was very valuable to our work in COTS-based systems and legacy system modernization, Lehman’s remark that most of the defects discovered in existing systems were probably caused by invalid or changed assumptions struck a chord. When I was a programmer at the Data Systems Division of IBM, we always included a section called “Assumptions” in our design documentation. Writing this section was always a reason for reflection. What were the assumptions upon which the design was based? And what would happen if these assumptions were wrong, or were changed? These design documents were inevitably scrutinized by an inspection team, and the assumptions would be reviewed, discussed, and possibly refined. Although we took some time to consider assumptions at this point in the process, there was no further formal consideration of assumptions through the remainder of the process. This does not mean that assumptions no longer play a role in the construction and test of the software.
Of course, assumptions have a much longer history. Parnas [Parnas 71] characterized interfaces as the assumptions that elements could make about each other, and most of his software engineering contributions involve that observation one way or another. Garlan’s architectural mismatch paper [Garlan 95] essentially rediscovered this, but nevertheless brought assumptions into the modern vocabulary.
Why Do We Have Assumptions?
For most large information systems the operational domain is essentially unbounded. No matter how many observations or properties are identified and associated with the domain, it is always possible to add more. The software, on the other hand, having human creators, is finite. The software system is, therefore, intrinsically incomplete. The resultant gap between the system and its operational domain is bridged by assumptions, explicit and implicit [Lehman 00]. These assumptions fill in the gaps between the system and the documented and validated requirements of the operational domain.
Additionally, the real-world domain and the application itself are always changing. Even supposing that the initial assumption set was valid, individual assumptions will, as time goes on, become invalid with unpredictable results or, at best, with operation that is not totally satisfactory [Parnas 94].
Requirements and Assumptions
An argument can be made that nothing in a software system should be assumed, and that everything should be stated as requirements. Even if this were the case, it would certainly be true that the requirements were assumed to be valid.
In reality, requirements are only a top-level statement of need, and the road between requirements and code is paved with assumptions. These assumptions can be validated and verified along the way, but because we must always deal with some degree of ambiguity, our confidence in the validity of the assumptions may vary considerably.
In one way or another, assumptions are reflected in the software. In my experience at IBM, assumptions were tracked, recorded during design, and reviewed by an inspection team. The inspection team could evaluate assumptions in each design to make sure that they were valid and consistent with system-level assumptions. Unfortunately, the assumption management process typically stopped at this point, because no model or infrastructure was in place to support the tracking of assumptions through implementation and test.
As a programmer of many years, I believe that any programmer would agree that assumptions are an inherent part of software implementation. Every time a decision is made—about how to design an interface, how to implement an algorithm, if and how to encapsulate an external dependency—assumptions are made concerning how the software will be used, how it will evolve, and what environments it will operate in. The unfortunate aspect of software implementation today is that these assumptions are seldom if ever recorded, although they are instrumental in determining the form the software product takes. Also, because these assumptions are not recorded, they are seldom communicated or reviewed. As a result, some assumptions may be incompatible with assumptions made elsewhere in the code, or incompatible with design- or system-level assumptions. These incompatibilities may lead to the insertion of defects or, even worse, post-deployment failures. Furthermore, as was already pointed out, it is likely that individual assumptions will become invalid as a result of changes in the operational domain over the life of the system.
When performing change analysis to determine which changes to accept and which to reject, the configuration control board has no way of knowing which assumptions are built into the software. For example, there may be a number of complex modules that assume a particular hardware configuration. A configuration control board may approve a change, not understanding that this change invalidates these embedded assumptions. The effort to implement the change may result in a major rewrite of significant portions of the system. The least that can be said here is that a lack of assumption management certainly does not lend itself to predictable schedules and costs.
Is there a feasible solution to the problem of assumption management? I believe that a solution may in fact exist, although it will require some additional infrastructure and a slight culture shift.
From an infrastructure perspective, programmers are unlikely to manage assumptions independent of source code. The failure of programmers to keep design and other documentation consistent with evolving source code is an established and well-known phenomenon. Sun Microsystems has developed a rather ingenious solution to this problem. One of the unique features of Java is that it supports embedded documentation comments, which are used to generate the API documentation. While parsing the source code to create class files (object files), the compiler converts the declarations and doc comments into HTML documentation [Friendly 95]. This is a user-friendly mechanism for programmers to update documentation by updating the structured comments within their source code. This process can be easily performed in a manner that is not disruptive to the coding process.
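To make the doc-comment mechanism concrete, the following is a minimal sketch of a documented Java method. The class name `TemperatureConverter` and its method are hypothetical and exist only for illustration; the `@param`, `@return`, and `@throws` tags are standard doc-comment tags.

```java
/**
 * Converts temperatures between scales.
 * Hypothetical class, used only to illustrate doc comments.
 */
class TemperatureConverter {

    /**
     * Converts a temperature from degrees Celsius to degrees Fahrenheit.
     *
     * @param celsius the temperature in degrees Celsius
     * @return the equivalent temperature in degrees Fahrenheit
     * @throws IllegalArgumentException if {@code celsius} is below absolute zero
     */
    static double toFahrenheit(double celsius) {
        if (celsius < -273.15) {
            throw new IllegalArgumentException("below absolute zero: " + celsius);
        }
        return celsius * 9.0 / 5.0 + 32.0;
    }
}
```

Because the description lives directly above the declaration it documents, a programmer who changes the method is looking at the text that needs to change with it. The class is left package-private here so the snippet does not depend on a file name; the javadoc tool documents only public and protected members by default, so its `-package` option would be needed to include this class in the generated HTML.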
Perhaps even more closely related to assumption management is the programmatic use of assertions. In its most basic form, an assertion is simply a procedure that takes a boolean parameter and reports to the programmer if the boolean is false [Lewis 97]. Assertions are a form of assumption management in which the assumption can be checked at runtime. Exceptions, which are used in many modern programming languages, including C++, Java, and Eiffel, reflect another kind of assumption.
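The two mechanisms described above might look as follows in Java. This is a sketch under stated assumptions: `Check.that` is a hypothetical hand-written helper standing in for the "procedure that takes a boolean parameter", and `Average` is an invented example class. Java's built-in `assert` statement could be used instead, but it is disabled unless the JVM is started with `-ea`, so an explicit helper keeps the behavior visible.

```java
/** Hypothetical helper: the most basic form of an assertion. */
final class Check {
    private Check() {}

    /** Reports a violated assumption to the programmer by throwing AssertionError. */
    static void that(boolean condition, String assumption) {
        if (!condition) {
            throw new AssertionError("Assumption violated: " + assumption);
        }
    }
}

/** Hypothetical example class showing an assertion next to an exception. */
class Average {
    static double of(int[] samples) {
        // Exception: the caller is assumed to pass a real array.
        if (samples == null) {
            throw new IllegalArgumentException("samples must not be null");
        }
        // Assertion: the design assumes at least one sample exists.
        Check.that(samples.length > 0, "at least one sample is available");

        long sum = 0;
        for (int s : samples) {
            sum += s;
        }
        return (double) sum / samples.length;
    }
}
```

In both cases the assumption is written down as an executable condition, which is what limits it: only assumptions that can be phrased as a boolean over program state can be recorded this way.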
Assumption management is a little bit like assertions on steroids. Beyond simple boolean conditions that can be evaluated at runtime, assumption management allows programmers to record a vast range of assumptions in structured English. These assumptions can be easily recorded as part of the implementation, and then extracted from the source code using a pre-processor or an aspect-oriented programming language [Elrad 01]. Recording assumptions in source code alone might prove invaluable, but extracting them into a searchable repository should allow system architects and lead designers to more easily review the assumptions of individual programmers to determine if they are consistent with design and system assumptions. Such a repository can later be used by configuration control boards in change analysis to more accurately determine the impact of a proposed change.
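One way to picture recording and extracting assumptions is sketched below. It is not the pre-processor or aspect-oriented approach named above: it swaps in Java annotations and reflection, a later language feature, purely because they make a short runnable illustration. The `@Assumption` annotation, its `category` and `text` fields, and the `SensorReader` and `AssumptionExtractor` classes are all hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical annotation: one assumption, stated in structured English. */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface Assumption {
    String category();
    String text();
}

/** Hypothetical module whose implementation embeds two assumptions. */
@Assumption(category = "hardware", text = "The host has a single serial port.")
class SensorReader {
    @Assumption(category = "environment", text = "Readings arrive at most once per second.")
    int read() {
        return 0; // placeholder body; only the recorded assumptions matter here
    }
}

/** Collects recorded assumptions so they can be loaded into a repository. */
class AssumptionExtractor {
    static List<String> extract(Class<?> type) {
        List<String> found = new ArrayList<>();
        // Assumption recorded on the class itself, if any.
        Assumption onType = type.getAnnotation(Assumption.class);
        if (onType != null) {
            found.add(describe(type.getSimpleName(), onType));
        }
        // Assumptions recorded on the methods the class declares.
        for (Method method : type.getDeclaredMethods()) {
            Assumption onMethod = method.getAnnotation(Assumption.class);
            if (onMethod != null) {
                found.add(describe(type.getSimpleName() + "." + method.getName(), onMethod));
            }
        }
        return found;
    }

    private static String describe(String location, Assumption a) {
        return location + " [" + a.category() + "]: " + a.text();
    }
}
```

Calling `AssumptionExtractor.extract(SensorReader.class)` yields one line per recorded assumption, each tagged with where it was made and a category. A reviewer or a configuration control board could then search such output, for example for every "hardware" assumption, before approving a change to the hardware configuration.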
Assumption management entails a slight shift in software development culture in that the assumptions that are being made as part of the development process must also be recorded in source code and other software artifacts. However, the potential benefit resulting from this practice may be tremendous in the earlier identification and elimination of defects, and in improved change analysis for more predictable and cost-effective software evolution.
References
Lehman, M. M. & Ramil, J. F. “Software Evolution in the Age of Component Based Software Engineering,” 249–255. IEE Proceedings Software 147, 6 (December 2000).
Elrad, Tzilla; Filman, Robert E.; & Bader, Atef. “Aspect-Oriented Programming.” Communications of the ACM 44, 10 (October 2001).
Friendly, Lisa. “The Design of Distributed Hyperlinked Programming Documentation.” Proceedings of the International Workshop on Hypermedia Design '95 (IWHD '95).
Garlan, David; Allen, Robert; & Ockerbloom, John. “Architectural Mismatch: or Why It's Hard to Build Systems Out of Existing Parts.” Proceedings of the International Conference on Software Engineering. Seattle, 1995.
Parnas, D. “Information Distribution Aspects of Design Methodology.” Proceedings 1971 IFIP Congress, North Holland Publishing Company.
Parnas, D. L. “Software Aging,” 279–287. Proceedings of the 16th International Conference on Software Engineering. Sorrento, Italy, May 16–21, 1994. IEEE Press.
Seacord, Robert; Plakosh, Daniel; & Lewis, Grace. Modernizing Legacy Systems: Software Technologies, Engineering Processes and Business Practices. New York, NY: Addison-Wesley, 2003.
Lewis, Peter N. “Using Assert().” MacTech Magazine 13, 12 (1997).
Wallnau, Kurt; Hissam, Scott; & Seacord, Robert. Building Systems from Commercial Components. New York, NY: Addison-Wesley, 2001.
Lehman, M. M. & Belady, L.A. Program Evolution: Processes of Software Change. London: Academic Press, 1985.
About the Author
Robert C. Seacord is a senior member of the technical staff at the SEI and an eclectic technologist. He is coauthor of the book Building Systems from Commercial Components as well as more than 40 papers on component-based software engineering, Web-based system design, legacy system modernization, component repositories, search engines, security, and user interface design and development.
The views expressed in this article are the author's only and do not represent directly or imply any official position or view of the Software Engineering Institute or Carnegie Mellon University. This article is intended to stimulate further discussion about this topic.