General Navigation Buttons - Home | Search | Contact Us | Site Map | Whats New
products graphic
white space
products
Software Technology Roadmap
What's New
Background & Overview
Technology Descriptions
Defining Software Technology
Technology Categories
Template for Technology Descriptions
Taxonomies
Glossary & Indexes
Feedback & Participation
Software Engineering Information Repository (SEIR)
white space
About SEI|Mgt|Eng|Acq|Collaboration|Prod.& Services|Pubs
pixel
Rollover Popup Hints for Topic Navigation Buttons above
pixel
Algorithm Formalization


Status

Advanced

Purpose and Origin

In an effort to better understand computer algorithms, researchers in this area began to formally characterize the properties of various classes of algorithms. Initially, research centered on divide-and-conquer and global search algorithms. This initial research proved that these formal algorithm characterizations, called algorithm theories, could be used to synthesize implementations (code) for well-defined functions. Used in program generation or synthesis systems, the purpose of algorithm formalization is two-fold:

  • The synthesis of consistent, highly CPU efficient algorithms for well-defined functions.
  • The formal characterization of algorithm theory notions [Smith 93b]. A by-product of this formalization is the creation of a taxonomy of algorithm theories in which relationships between algorithm theories are formally characterized. These formal characterizations allow a developer to exploit more effectively the structure inherent in the problem space, and thereby allow him to derive or synthesize more efficient implementations.
To synthesize an algorithm for a problem using this technology, the essence of the problem and its associated problem domain must be captured in a collection of formal specifications.

Technical Detail

Algorithm synthesis is an emerging correct-by-construction methodology in which algorithm theories are refined to satisfy the constraints represented in an algebraic specification of the problem space [Smith 90]. These algorithm theories represent the structure common to a class of algorithms and abstract out concerns about the specific problem to be solved, the control strategy, the target language and style (e.g., functional versus imperative), and the target architecture. Because theorem provers are used to refine the algorithm theories, the resulting synthesized algorithm is guaranteed to be consistent with the problem specification. In other words, the synthesized algorithm is guaranteed to find solutions to the specified problem provided such solutions exist. If multiple solutions are possible, an algorithm can be synthesized to return one, some, or all of them.

Synthesis systems incorporating formal algorithm theories operate as follows. The developer supplies a formal specification of the problem for which an algorithm is needed, and supplies formal specifications for the operations referenced in the problem specification (i.e., the domain theory). These specifications must be in a prescribed format and language. Using syntactic information drawn from the problem specification, the synthesis system selects candidate algorithm theories from a library of such theories. The developer selects one of these for refinement. The synthesis system then uses the semantic information provided by the problem and domain theories and- using a theorem prover- completes the refinement process. After the algorithm is generated, a developer will typically apply several computer assisted optimizations to the algorithm before compilation.

Coupling a theorem prover to the algorithm synthesis environment enables computer management of the inherent complexity of the problem and solution spaces, permitting computer management of complex code optimizations. For example, a synthesized algorithm (or implementation) is modified by a user-requested optimization only if the theorem prover is able to verify the consistency of the resulting code. For example, simplification of conditionals, function unfolding (inline expansion), and finite differencing are all possible.

Usage Considerations

The use of this technology encourages reuse of codified knowledge. Specifically, once a domain theory has been developed, it can be used to help define additional problem specifications within that domain, or it can be combined with other domain theories to characterize larger domains. Note, however, that the characterization of large and/or complex domains is non-trivial and may take considerable effort. With respect to the synthesis system itself, a developer is free to add additional algorithm theories to its library. However, the development of such algorithm theories is complex and will require in-depth knowledge of that class of algorithm.

Synthesizing algorithms from formal specifications involves a paradigm shift from traditional programming practice. Because formal specifications are used, developers must formally characterize what the operations in the problem domain do rather than stating how they do it. In addition, maintenance is not performed on the synthesized code. Instead, the problem specification is modified to reflect the new requirement(s), and an implementation is rederived.

Synthesis of algorithms from formal specifications is independent of the target programming language. However, the synthesis environments themselves may need to be modified to support particular target languages, or code translators may be needed to translate the code generated by the synthesis environment to the desired target language.

Algorithms for non-real time, well-defined deterministic functions- such as sorting or complex scheduling- can be synthesized using this technology. However, additional work is required to determine whether this technology can be extended with notions state and nondeterminism.

Maturity

This technique, along with an algorithm synthesis prototype environment called Kestrel Interactive Development System (KIDS), was developed around 1986 [Smith 86, Smith 91]. Although it initially supported divide-and-conquer and global search algorithm theories, KIDS has been extended with more powerful algorithm theories and with more sophisticated constraint propagation mechanisms. KIDS has been used to synthesize a transportation scheduling algorithm used by US Transportation Command; this scheduling algorithm is able to schedule 10,000 movement requests in approximately one minute, versus hours for competitive scheduling algorithms [Smith 93c]. Ongoing research in this area includes a formalization of local search and formalizations of complex scheduling algorithms. Proof-of-concept scheduling algorithms have been synthesized for the nuclear power-plant domain in which

  • scheduled activities can have complex interactions
  • timing constraints are represented by earliest start/finish times
This technology is also being extended to address the synthesis of parallel algorithms [Smith 93b].

Costs and Limitations

Like all software development efforts, specification inconsistency may result in implementations that do not meet users' needs. However, the formal nature of problem specifications permits semi-automated investigation of problem specification properties. Adaptation of this technology requires knowledge of discrete mathematics at the level of first order logic and experience in developing formal specifications. Knowledge of constraint propagation, category theory, and resolution-based theorem proving is also required. In addition, formalization of various problem domains may be difficult; to effectively use this technology, special training may be required. However, there are currently no commercially-available, regularly-scheduled courses offered on this subject.

Dependencies

Constraint propagation, resolution-based theorem proving, finite differencing technology (used in verifiably correct optimizations), algebraic specification techniques, and specification construction techniques are enablers for this technology.

Alternatives

Other approaches to developing demonstrably correct algorithm implementations are based on formal verification or deductive synthesis. Software generation systems can be used to select and specialize an algorithm implementation from a library of implementations, to assemble an algorithm for a collection of reusable code fragments, or to generate algorithm implementation stubs (i.e., they can generate code for some parts of an algorithm using syntactic rather than semantic information), but generally such implementations are not be guaranteed to be consistent with the problem specification.

Complementary Technologies

Category-theoretic specification construction methodologies are useful for developing and refining algorithm, domain, and problem theories. In addition, various domain analysis technologies can be used to investigate the structure of the problem domain.

Index Categories

This technology is classified under the following categories. Select a category for a list of related topics.

Name of technology

Algorithm Formalization

Application category

Select or Develop Algorithms (AP.1.3.4)

Quality measures category

Consistency (QM.1.3.2),
Provably Correct (QM.1.3.4),
Throughput (QM.2.2.3)

Computing reviews category Algorithms (I.1.2),
Automatic Programming (D.1.2),
Numerical Algorithms and Problems (F.2.1),
Nonumerical Algorithms and Problems (F.2.2),
Specifying and Verifying and Reasoning about Programs (F.3.1)

References and Information Sources

[Gomes 96] Gomes, Carla P.; Smith, Douglas; & Westfold, Stephen. "Synthesis of Schedulers for Planned Shutdowns of Power Plants," 12-20. Proceedings of the Eleventh Knowledge-Based Software Engineering Conference. Syracuse, NY, September 25-28, 1996. Los Alamitos, CA: IEEE Computer Society Press, 1996.
[Smith 86] Smith, Douglas R. "Top-Down Synthesis of Divide-and-Conquer Algorithms," 35-61. Readings in Artificial Intelligence and Software Engineering. Palo Alto, CA: Morgan Kaufmann, 1986.
[Smith 90] Smith, Douglas R. & Lowry, Michael R. "Algorithm Theories and Design Tactics." Science of Computer Programming 14, 2-3 (1990): 305-321.
[Smith 91] Smith, Douglas R. "KIDS-A Knowledge-Based Software Development System," 483-514. Automating Software Design. Menlo Park, CA: AAAI Press, 1991.
[Smith 93a] Smith, Douglas R. Classification Approach to Design (KES.U.93.4). Palo Alto, CA: Kestrel Institute, 1993.
[Smith 93b] Smith, Douglas R. "Derivation of Parallel Sorting Algorithms," 55-69. Parallel Algorithm Derivation and Program Transformation. New York, NY: Kluwer Academic Publishers, 1993.
[Smith 93c] Smith, Douglas R. "Transformational Approach to Transportation Scheduling," 60-68. Proceedings of the Eighth Knowledge-Based Software Engineering Conference. Chicago, IL, September 20-23, 1993. Los Alamitos, CA: IEEE Computer Society Press, 1993.

Current Author/Maintainer

Mark Gerken, Air Force Rome Laboratory

Modifications

10 Jan 97 (original)


The Software Engineering Institute (SEI) is a federally funded research and development center sponsored by the U.S. Department of Defense and operated by Carnegie Mellon University.

Copyright 2007 by Carnegie Mellon University
Terms of Use
URL: http://www.sei.cmu.edu/str/descriptions/algorithm_body.html
Last Modified: 11 January 2007