Complete
Quantitative measurement of an operational system's maintainability
is desirable both as an instantaneous measure and as a predictor of
maintainability over time. Efforts to measure and track
maintainability are intended to help reduce or reverse a system's
tendency toward "code entropy" or degraded integrity, and to indicate
when it becomes cheaper and/or less risky to rewrite the code than to
change it. "Software Maintainability Metrics Models in Practice" is
the latest report from an ongoing, multi-year joint effort (involving
the Software Engineering Test Laboratory of the University of Idaho,
the Idaho National Engineering Laboratory, Hewlett-Packard, and other
companies) to quantify maintainability via a Maintainability Index
(MI) [Welker 95]. Measurement and use of the MI is a process technology,
facilitated by simple tools, that in implementation becomes part of
the overall development or maintenance process. These efforts also
indicate that MI measurement applied during software development can
help reduce lifecycle costs. The developer can track and control the
MI of code as it is developed, and then supply the measurement as
part of code delivery to aid in the transition to maintenance.
Other studies to define code maintainability in various
environments have been done [Peercy 81, Bennett 93], but the set of
reports leading to the MI measurement technique offered by Welker
[Welker 95] describes a method that appears to be very applicable to
today's Department of Defense (DoD) systems.
The literature of at least the last ten years shows that there
have been several efforts to characterize and quantify software
maintainability; "Maintenance of Operational Systems--An Overview"
provides a broad overview of software maintenance issues. In this
specific technology, a program's
maintainability is calculated using a combination of widely-used and
commonly-available measures to form a Maintainability Index (MI). The
basic MI of a set of programs is a polynomial of the following form
(all are based on average-per-code-module measurement):
MI = 171 - 5.2 * ln(aveV) - 0.23 * aveV(g') - 16.2 * ln(aveLOC) + 50 * sin(sqrt(2.4 * perCM))
The coefficients are derived from actual usage (see Usage
Considerations). The terms are defined as follows:
- aveV = average Halstead Volume V per module (see Halstead
  Complexity Measures)
- aveV(g') = average extended cyclomatic complexity per module (see
  Cyclomatic Complexity)
- aveLOC = average count of lines of code (LOC) per module
- perCM = average percent of lines of comments per module (optional
  term)
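As a minimal sketch, the expression above can be evaluated directly from per-module measurements. The function names, the dictionary keys (`volume`, `vg`, `loc`, `per_cm`), and the choice to pass perCM through unchanged are assumptions made for this illustration; the source does not say whether perCM is expressed on a 0-100 or 0-1 scale, so it should be supplied in whatever units the coefficients were calibrated with.

```python
import math
from statistics import mean


def maintainability_index(ave_v, ave_vg, ave_loc, per_cm=None):
    """Evaluate the MI expression from per-module averages.

    per_cm is optional, mirroring the optional comment term. It is used
    as given, so its units must match the calibration data (assumption).
    """
    if ave_v <= 0 or ave_loc <= 0:
        raise ValueError("aveV and aveLOC must be positive to take ln()")
    if per_cm is not None and per_cm < 0:
        raise ValueError("perCM must not be negative")
    mi = 171 - 5.2 * math.log(ave_v) - 0.23 * ave_vg - 16.2 * math.log(ave_loc)
    if per_cm is not None:
        mi += 50 * math.sin(math.sqrt(2.4 * per_cm))
    return mi


def system_mi(modules, use_comments=False):
    """Average each metric over all modules, then evaluate the MI.

    modules: list of dicts with keys 'volume', 'vg', 'loc' and, when
    use_comments is True, 'per_cm' (hypothetical key names).
    """
    if not modules:
        raise ValueError("at least one module is required")
    ave_v = mean(m["volume"] for m in modules)
    ave_vg = mean(m["vg"] for m in modules)
    ave_loc = mean(m["loc"] for m in modules)
    per_cm = mean(m["per_cm"] for m in modules) if use_comments else None
    return maintainability_index(ave_v, ave_vg, ave_loc, per_cm)
```

Averaging the inputs first, rather than averaging per-module MI values, follows the statement that all terms are based on average-per-code-module measurement.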
Oman develops the MI equation forms and their rationale
[Oman 92a]; the Oman study indicates that the above metrics are
good and sufficient predictors of maintainability. Oman builds
further on this work using a modification of the MI and describing
how it was calibrated for a specific large suite of industrial-use
operational code [Oman 94]. Oman describes a prototype tool that was
developed specifically to support capture and use of maintainability
measures for Pascal and C [Oman 91]. The aggregate strength of this
work and the underlying simplicity of the concept make the MI
technique potentially very useful for operational Department of
Defense (DoD) systems.
Calibration of the equations. The coefficients
shown in the equation are the result of calibration using data from
numerous software systems being maintained by Hewlett-Packard.
Detailed descriptions of how the MI equation was calibrated and used
appear in Coleman, Pearse, and Welker [Coleman 94, Coleman 95,
Pearse 95, Welker 95]. The authors claim that follow-on efforts show
that this form of the MI equation generally fits other
industrial-sized software systems [Oman 94, Welker 95], and the
breadth of the work tends to support this claim.
It is advisable to test the coefficients for proper fit with each
major system to which the MI is applied.
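One possible way to check fit, sketched here under stated assumptions, is to correlate computed MI values with independent maintainability ratings for the same code (for example, scores from a subjective evaluation). This is not the calibration procedure used in the cited studies. The function name `pearson` and the sample numbers are hypothetical and serve only to show the mechanics.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only, not measurements from any real system.
computed_mi = [33.5, 70.1, 69.6, 52.0]
expert_rating = [1.5, 4.0, 3.8, 3.0]
fit = pearson(computed_mi, expert_rating)
```

A coefficient close to 1 would suggest the published coefficients track the local notion of maintainability; a weak or negative value would be a signal to recalibrate before relying on the MI.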
Effects from comments in code. The user must
analyze comment content and quality in the specific system to decide
whether the comment term perCM is useful.
Ways of using MI
- The system can be checked periodically for maintainability,
which is also a way of calibrating the equations.
- It can be integrated into a development effort to screen code
quality as it is being built and modified; this could yield
potentially significant life cycle cost savings.
- It can be used to drive maintenance activities by evaluating
modules either selectively or globally to find high-risk
code.
- MI can be used to compare or evaluate systems: Comparing the
MIs of a known-quality system and a third-party system can provide
key information in a make-or-buy decision.
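The screening use described in the list above could look roughly like the following sketch, which scores each module with the MI expression (without the optional comment term) and lists those that fall below a cutoff, worst first. The cutoff value, the module names, and the metric key names are all hypothetical; the source does not prescribe a threshold, so any value used in practice would need to come from local calibration.

```python
import math


def module_mi(volume, vg, loc):
    """MI of a single module, omitting the optional comment term."""
    return 171 - 5.2 * math.log(volume) - 0.23 * vg - 16.2 * math.log(loc)


def rank_high_risk(modules, threshold):
    """Return (name, MI) pairs with MI below threshold, lowest MI first.

    modules maps a module name to a dict with keys 'volume', 'vg', 'loc'.
    """
    scored = [(name, module_mi(**metrics)) for name, metrics in modules.items()]
    flagged = [(name, mi) for name, mi in scored if mi < threshold]
    return sorted(flagged, key=lambda pair: pair[1])


# Hypothetical measurements for two modules.
measurements = {
    "parser": {"volume": 8000, "vg": 40, "loc": 600},
    "util": {"volume": 200, "vg": 3, "loc": 30},
}
candidates = rank_high_risk(measurements, threshold=65)  # 65 is an assumed cutoff
```

Running the same scan at each build or release gives the periodic check from the first list item, and the sorted output gives maintenance planners a starting order for review.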
Example of usage. Welker relates how a module
containing a routine with some "very ugly" code was assessed as
unmaintainable when expressed in terms of the MI (note that just
quantifying the problem is a step forward) [Welker 95]. The module
was first redesigned, and then functionally enhanced. The measured
results are shown in Table 7:
Table 7: Measured Results
| Measure | Initial Code | | Restructured Code | | After Enhancement | |
| Code Unit | Routine | Module | Routine | Module | Routine | Module |
| MI (larger MI = more maintainable) | 6.47 | 33.55 | 39.93 | 70.13 | 37.62 | 69.60 |
| Halstead Effort (1) | 2,216,499 | 2,233,072 | 182,216 | 480,261 | 201,429 | 499,474 |
| Extended Cyclomatic Complexity (2) | 45 | 49 | 18 | 64 | 21 | 67 |
| Lines of Code | 622 | 663 | 196 | 732 | 212 | 748 |
(1) Halstead Effort, rather than Halstead Volume, was used in this
case study. See Halstead Complexity Measures for more information
on both these measures. Generally, the lower a program's measure of
effort, the simpler a change to the program will be (because Halstead
measures are weighted toward measuring computational complexity, not
all programs will behave this way).
(2) Note that a low Cyclomatic Complexity is generally indicative of
a lower risk, hence more maintainable, program. In this case,
restructuring increased the module's complexity (from 49 to 64), but
reduced the "ugly" routine's complexity significantly (from 45 to
18). In both cases, the subsequent enhancement drove the complexity
slightly higher (to 67 for the module and 21 for the routine).
If the enhancement had been made without first doing the
restructuring, these figures indicate the change would have been much
riskier.
Coleman, Pearse, and Welker provide detailed descriptions of how
MI was calibrated and used at Hewlett-Packard [Coleman 94,
Coleman 95, Pearse 95, Welker 95].
Oman tested the MI approach by using production operational code
containing around 50 KLOC to determine the metric parameters, and by
checking the results against subjective data gathered using the 1989
AFOTEC maintainability evaluation questionnaire [AFOTEC 89, Oman 94].
Other production code of about half that size was used to check the
results, with apparent consistency.
Welker applied the results to analyses of a US Air Force (USAF)
system, the Improved Many-On-Many (IMOM) electronic combat modeling
system. The original IMOM (in FORTRAN) was translated to C and the C
version was later reengineered into Ada. The maintainability of both
newer versions was measured over time using the MI approach
[Welker 95]. Results were as follows:
- The reengineered version's MI was more than twice that of the
original code (larger MI = more maintainable), and declined only
slightly over time (note that the original code was not measured
over time for maintainability, so change in its MI could not be
measured).
- The translated baseline's MI was not significantly different from
that of the original. This is of special interest to those
considering translation, because one of the primary objectives of
translation is to reduce future maintenance costs. There was also
evidence that the MI of translated code deteriorates more quickly
than that of reengineered code.
Calculating the MI is generally simple and straightforward, given
that several commercially available programming environments contain
utilities to count code lines, comment lines, and even Cyclomatic
Complexity. Other than the tool described in Oman [Oman 91], tools
to calculate Halstead Complexity Measures are less common because the
measure is not used as widely. However, once conventions for the
counting have been established, it is generally not difficult to
write language-specific code scanners to count the Halstead
components (operators and operands) and calculate the E and V
measures. Pearse reports that removing unused code from a single
module did not affect the MI, which highlights that MI is a system
measurement: its parameters are average values [Pearse 95]. However,
measuring the MI of individual modules is useful because changes in
either structural or computational complexity are reflected in a
module's MI. A product/process measurement program not already
gathering the metrics used in MI could find them useful additions.
Those metrics already being gathered may be useful in constructing a
custom MI for the system. However, it would be advisable to consult
the references for their findings on the effectiveness of metrics,
other than Halstead E and V and cyclomatic complexity, in
determining maintainability.
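Once a scanner has classified tokens into operators and operands, the E and V calculation mentioned above is short. The sketch below uses the standard Halstead definitions (volume V = N * log2(n), difficulty D = (n1 / 2) * (N2 / n2), effort E = D * V); the function name and the token-list interface are assumptions for illustration, and the counting conventions themselves (what counts as an operator in a given language) are left to the scanner.

```python
import math


def halstead_measures(operators, operands):
    """Compute Halstead Volume (V) and Effort (E) from classified tokens.

    operators, operands: lists of token strings, one entry per occurrence,
    as produced by a language-specific scanner (not shown here).
    """
    n1 = len(set(operators))   # distinct operators
    n2 = len(set(operands))    # distinct operands
    big_n1 = len(operators)    # total operator occurrences
    big_n2 = len(operands)     # total operand occurrences
    if n1 == 0 or n2 == 0:
        raise ValueError("need at least one operator and one operand")
    vocabulary = n1 + n2
    length = big_n1 + big_n2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (big_n2 / n2)
    effort = difficulty * volume
    return {"volume": volume, "effort": effort}
```

For instance, a scanner that reads `a = a + b` might report the operators `["=", "+"]` and the operands `["a", "a", "b"]`, which can be passed straight to `halstead_measures`.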
The MI method depends on the use of Cyclomatic Complexity and
Halstead Complexity Measures. To realize the full benefit of MI, the
maintenance environment must allow the rewriting of a module when it
becomes measurably unmaintainable. The point of measuring the MI is
to identify risk; when unacceptably risky code is identified, it
should be rewritten.
The process described by Sittenauer is designed to assist in
deciding whether or not to reengineer a system [Sittenauer 92].
There are also many research and analytic efforts that deal with
maintainability as a function of program structure, design, and
content, but none was found that was as clearly appropriate as MI
to current DoD systems in the lifecycle phases described in
"Maintenance of Operational Systems--An Overview."
The test in Sittenauer is meant to verify generally the condition
of a system, and would be useful as a periodic check of a software
system and to compare to the MI [Sittenauer 92].
[AFOTEC 89] Software Maintainability Evaluation Guide 800-2,
Volume 3. Kirtland AFB, NM: HQ Air Force Operational Test and
Evaluation Center (AFOTEC), 1989.
[Ash 94] Ash, Dan, et al. "Using Software Maintainability Models to
Track Code Health," 154-160. Proceedings of the International
Conference on Software Maintenance. Victoria, BC, Canada, September
19-23, 1994. Los Alamitos, CA: IEEE Computer Society Press, 1994.
[Bennett 93] Bennett, Brad & Satterthwaite, Paul. "A Maintainability
Measure of Embedded Software," 560-565. Proceedings of the IEEE 1993
National Aerospace and Electronics Conference. Dayton, OH, May 24-28,
1993. New York, NY: IEEE, 1993.
[Coleman 94] Coleman, Don, et al. "Using Metrics to Evaluate Software
System Maintainability." Computer 27, 8 (August 1994): 44-49.
[Coleman 95] Coleman, Don; Lowther, Bruce; & Oman, Paul. "The
Application of Software Maintainability Models in Industrial Software
Systems." Journal of Systems and Software 29, 1 (April 1995): 3-16.
[Oman 91] Oman, P. HP-MAS: A Tool for Software Maintainability,
Software Engineering (#91-08-TR). Moscow, ID: Test Laboratory,
University of Idaho, 1991.
[Oman 92a] Oman, P. & Hagemeister, J. Construction and Validation of
Polynomials for Predicting Software Maintainability (92-01TR).
Moscow, ID: Software Engineering Test Lab, University of Idaho, 1992.
[Oman 92b] Oman, P. & Hagemeister, J. "Metrics for Assessing a
Software System's Maintainability," 337-344. Conference on Software
Maintenance 1992. Orlando, FL, November 9-12, 1992. Los Alamitos, CA:
IEEE Computer Society Press, 1992.
[Oman 94] Oman, P. & Hagemeister, J. "Constructing and Testing of
Polynomials Predicting Software Maintainability." Journal of Systems
and Software 24, 3 (March 1994): 251-266.
[Pearse 95] Pearse, Troy & Oman, Paul. "Maintainability Measurements
on Industrial Source Code Maintenance Activities," 295-303.
Proceedings of the International Conference on Software Maintenance.
Opio, France, October 17-20, 1995. Los Alamitos, CA: IEEE Computer
Society Press, 1995.
[Peercy 81] Peercy, David E. "A Software Maintainability Evaluation
Methodology." Transactions on Software Engineering 7, 7 (July 1981):
343-351.
[Sittenauer 92] Sittenauer, Chris & Olsem, Mike. "Time to
Reengineer?" Crosstalk, Journal of Defense Software Engineering 32
(March 1992): 7-10.
[Welker 95] Welker, Kurt D. & Oman, Paul W. "Software Maintainability
Metrics Models in Practice." Crosstalk, Journal of Defense Software
Engineering 8, 11 (November/December 1995): 19-23.
[Zhuo 93] Zhuo, Fang, et al. "Constructing and Testing Software
Maintainability Assessment Models," 61-70. Proceedings of the First
International Software Metrics Symposium. Baltimore, MD, May 21-22,
1993. Los Alamitos, CA: IEEE Computer Society Press, 1993.
Edmond VanDoren, Kaman Sciences, Colorado Springs
Paul W. Oman, Ph.D., Computer Science Department, University of
Idaho, Moscow, ID
Kurt Welker, Lockheed Martin, Idaho Falls, ID
10 Jan 97 (original)
12 Mar 02 Correction of Maintainability Index (MI) formula
The Software Engineering Institute (SEI) is a federally funded
research and development center sponsored by the U.S. Department of
Defense and operated by Carnegie Mellon University.
Copyright 2007 by Carnegie Mellon University
URL: http://www.sei.cmu.edu/str/descriptions/mitmpm_body.html
Last Modified: 11 January 2007