Statistical Process Control for Software


Status

Complete

Purpose and Origin

The demand for increased efficiency and effectiveness of our software processes places measurement demands on the software engineering community beyond those traditionally practiced. Statistical and process thinking principles lead to the use of statistical process control methods to determine the consistency and capability of the many processes used to develop software.

Technical Detail

Over the past decade, the concepts, methods, and practices associated with process management and continual improvement have gained wide acceptance in the software community. These concepts, methods, and practices embody a way of thinking, a way of acting, and a way of understanding the data generated by processes that collectively result in improved quality, increased productivity, and competitive products. The acceptance of this "process thinking" approach has motivated many to start measuring software processes in ways that are responsive to questions relating to process performance [Florac 99]. In that vein, the traditional software measurement and analysis practice of comparing "planned versus actual" is not sufficient for measuring or predicting process performance. The time has come to marry, if you will, "process thinking" with "statistical thinking."

"Statistical thinking" [Britz 97] embraces three principles:

  1. all work occurs in a system of interconnected processes
  2. variation exists in all processes
  3. understanding and reducing variation are keys to success

If we examine the basis for "process thinking" and "statistical thinking," we find that they are founded on the principles of statistical process control. These principles hold that by establishing and sustaining stable levels of variability, processes will yield predictable results. We can then say that the processes are under statistical control. Controlled processes are stable processes, and stable processes enable you to predict results. This in turn enables you to prepare achievable plans, meet cost estimates and scheduling commitments, and deliver required product functionality and quality with acceptable and reasonable consistency. If a controlled process is not capable of meeting customer requirements or other business objectives, the process must be improved or retargeted.

When we relate these notions of process and statistical thinking to the operational level, we realize that a key concern of process management is process performance: how is the process performing now (effectiveness, efficiency), and how can it be expected to perform in the future? To obtain quantified answers to these questions, we can break the question of process performance into three parts.

First, we should examine process performance in terms of compliance. For example, is the process being executed properly? Are the personnel trained? Are the right tools available? If the process is not in compliance, we know there is little chance of it performing satisfactorily.

If a process is compliant, the next question is: Is the process performance (execution) reasonably consistent over time? Are the effort, cost, and elapsed time consumed, and the delivery and quality produced, by executing the process consistent? Realizing that variation exists in all processes, is the variation in process performance predictable?

Finally, if the process performance is consistent, we ask the question: Is the process performing satisfactorily? Is it meeting the needs of interdependent processes and/or the needs of the customers? Is it effective and efficient?

Historically, software organizations have addressed the question of compliance by conducting assessments, such as comparing the organization's software process against a standard (e.g., the CMM). Such an assessment provides a picture of the process status at a point in time and indicates the organization's capacity to execute various software processes according to the standard's criteria. However, it does not follow that the process is executed consistently or efficiently merely because the assessment results satisfied all the criteria.

The questions of process consistency, effectiveness, and efficiency require a measurement of process behavior as it is executed over time. Other disciplines have addressed this issue by using statistical process control methods, specifically using Shewhart control charts. They have concluded that control charts provide the basis for making process decisions and predicting process behavior.

Successful use of control charts by other disciplines suggests it is time to examine how statistical process control techniques can help to address our software process issues. In so doing, we find that Shewhart's control charts provide a statistical method for distinguishing between variation caused by normal process operation and variation caused by anomalies in the process. Additionally, Shewhart's control charts provide an operational definition for determining process stability (that is, consistency and predictability), as well as for quantitatively establishing process capability to meet criteria for process effectiveness and efficiency.

We use the term software process to refer not just to an organization's overall software process, but to any process or subprocess used by a software project or organization. In fact, a good case can be made that it is only at subprocess levels that true process management and improvement can take place. Thus, we view the concept of software process as applying to any identifiable activity that is undertaken to produce or support a software product or service. This includes planning, estimating, designing, coding, testing, inspecting, reviewing, measuring, and controlling, as well as the subtasks and activities that comprise these undertakings.

Process Performance Variation

The basis for control charts is recognition of two types of variation: common cause variation and assignable cause variation.

Common cause variation is variation in process performance due to normal or inherent interaction among the process components (people, machines, material, environment, and methods). Common cause variation of process performance is characterized by a stable and consistent pattern over time, as illustrated in Figure 1. Variation in process performance due to common cause is thus random, but will vary within predictable bounds. When a process is stable, the random variations that we see all come from a constant system of chance causes. The variation in process performance is predictable, and unexpected results are extremely rare.


Figure 1: The Concept of Controlled Variation

The key word in the paragraph above is "predictable." Predictable is synonymous with "in control."

The other type of variation in process performance is due to assignable causes. Assignable cause variation has marked impacts on product characteristics and other measures of process performance. These impacts create significant changes in the patterns of variation. This is illustrated in Figure 2, which we have adapted from Wheeler and Chambers [Wheeler 92]. Assignable cause variations arise from events that are not part of the normal process. They represent sudden or persistent abnormal changes to one or more of the process components. These changes can be in things such as inputs to the process, the environment, the process steps themselves, or the way in which the process steps are executed. Examples of assignable causes of variation include shifts in the quality of raw materials, inadequately trained people, changes to work environments, tool failures, altered methods, failures to follow the process, and so forth.


Figure 2: The Concept of Uncontrolled or Assignable Cause Variation

When all assignable causes have been removed and prevented from reoccurring in the future so that only a single, constant system of chance causes remains, we have a stable and predictable process.

Stability of a process with respect to any given attribute is determined by measuring the attribute and tracking the results over time. If one or more measurements fall outside the range of chance variation, or if systematic patterns are apparent, the process may not be stable. We must then look for the causes of deviation, and remove any that we find, if we want to achieve a stable and predictable state of operation.
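The two signals named above (a measurement outside the range of chance variation, or a systematic pattern) can be sketched in code. This is a minimal illustration, not a method taken from this description: the function names are hypothetical, the limits are assumed to be supplied by the caller, and the "systematic pattern" checked here is one commonly used convention (a run of eight consecutive points on the same side of the center line), chosen as an assumption.

```python
def longest_run_one_side(values, center):
    """Length of the longest run of consecutive values strictly on one side of center."""
    longest = current = 0
    previous_side = 0
    for v in values:
        side = (v > center) - (v < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == previous_side:
            current += 1          # run continues on the same side
        elif side != 0:
            current = 1           # run restarts on the other side
        else:
            current = 0           # a point on the center line breaks the run
        previous_side = side
        longest = max(longest, current)
    return longest


def stability_signals(values, center, lower, upper, run_length=8):
    """Return (indexes outside the limits, whether a long one-sided run exists).

    run_length=8 is an assumed convention, not a value given in the text.
    """
    outside = [i for i, v in enumerate(values) if v < lower or v > upper]
    has_run = longest_run_one_side(values, center) >= run_length
    return outside, has_run


# Hypothetical measurements checked against hypothetical limits.
measurements = [21, 22, 23, 21, 22, 24, 21, 23, 19, 35]
outside, has_run = stability_signals(measurements, center=20, lower=8, upper=32)
```

Either kind of signal is only a prompt to look for a cause; the code cannot say what the cause is.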

When a process is stable, 99+% of process performance variation will fall within 3 sigma of the mean or average of the variation. When the process variation falls outside of the 3 sigma limits, the variation is very likely caused by an anomaly in the process.
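As a rough check on the "99+%" figure, the coverage of a 3 sigma band can be computed for one specific case. The sketch below assumes the variation is normally distributed, which is an assumption made here for illustration and not something the text requires; the function name is hypothetical.

```python
import math


def normal_coverage(k):
    """Probability that a normally distributed value lies within k sigma of its mean."""
    return math.erf(k / math.sqrt(2))


coverage_3_sigma = normal_coverage(3)
print(f"{coverage_3_sigma:.4%}")  # prints 99.7300%
```

Under that assumption, roughly 3 values in 1,000 would fall outside the limits by chance alone, which is why a point beyond them is treated as a likely anomaly.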

When a process is stable, or nearly so, the 3 sigma limits determine the amount of variation that is normal or natural to the process. This is the "voice of the process" or the process telling us what it is capable of doing. This may or may not be satisfactory to the customer: if it is, it is "capable"; if it is not, the process must be changed since we know that the remaining variation is due to the process itself.

Three Important Factors

Before we look at an example, there are three important notions that should be discussed:

  1. the importance of operational definitions
  2. homogeneity
  3. issues of rational subgrouping

The need for operational definitions is fundamental to any measurement activity. It is not enough to identify measures. Measures must be defined in such a way as to tell others exactly how each measure is obtained so that they can collect and interpret the values correctly.

The primary issue is not whether a definition for a measure is correct, but that everyone understands, completely, what the measured values represent. Only then can people be expected to collect values consistently and have others interpret and apply the results to reach valid conclusions.

Communicating clear and unambiguous definitions is not easy. Having structured methods for identifying all the rules that are used to make and record measurements can be very helpful in ensuring that important information does not go unmentioned. When designing methods for defining measures, one should keep in mind that things that do not matter to one user are often important to another. This means that measurement definitions (and structures for recording the definitions) often become larger and more encompassing than the definitions most organizations have traditionally used. This is all the more reason to have a well-organized approach. Definition focuses on details, and structured methods help ensure that all details get identified, addressed, and recorded. They also help in negotiating with people who believe that attention to detail is no longer their responsibility.

Operational definitions must satisfy two important criteria [Park 92]:

  1. communication. If someone uses the definition as a basis for measuring or describing a measurement result, will others know precisely what has been measured, how it was measured, and what has been included and excluded?
  2. repeatability. Could others, armed with the definition, repeat the measurements and get the same results?

These criteria are closely related. In fact, if you can't communicate exactly what was done to collect a set of data, you are in no position to tell someone else how to do it. Far too many organizations propose measurement definitions without first determining what users of the data will need to know about the measured values in order to use them intelligently. It is no surprise, then, that measurements are often collected inconsistently and at odds with users' needs. When it comes to implementation, rules such as "Count all noncomment, nonblank source statements" or "Count open problems" are open to far too many interpretations to provide repeatable results.

Although communicating measurement definitions in clear, unambiguous terms requires effort, there is good news as well. When someone can exactly describe what has been collected, it is easy to turn the process around and say, "Please do that again." Moreover, you can give the description to someone else and say, "Please use this as your definition, but with these changes." In short, when we can communicate clearly what we have measured, we have little trouble creating repeatable rules for collecting future data.
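To make the point about repeatable rules concrete, here is a minimal sketch of what an explicit counting rule might look like when written as code. The specific choices (physical lines as the unit, `#` as the comment marker, counting lines that carry a trailing comment) are assumptions made for illustration; they are not the definition proposed in [Park 92], and the function name is hypothetical.

```python
def count_source_lines(text):
    """Count physical lines that are neither blank nor whole-line comments.

    Hypothetical operational definition, spelled out so others can repeat it:
      - unit of count: physical line (not logical statement)
      - blank line: empty after stripping whitespace
      - comment line: first non-whitespace character is '#'
      - a code line followed by a trailing comment IS counted
    """
    count = 0
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


sample = """# header comment

x = 1  # trailing comment
y = 2
    # indented comment
"""
counted = count_source_lines(sample)  # 2 under the rules stated above
```

A different but equally explicit rule (for example, excluding lines with trailing comments) would give a different count. What matters for repeatability is that the rule in force is written down.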

Next, the notions of homogeneity and rational subgrouping need to be understood and addressed. Homogeneity and rational subgrouping go hand in hand. Because of the non-repetitive nature of software products and processes, some believe it is difficult to achieve homogeneity with software data. The idea is to understand the theoretical issues and, at the same time, work within some practical guidelines. We need to understand what conditions are necessary to consider the data homogeneous. When two or more data values are placed in a subgroup, we are making a judgement that these values are measurements taken under essentially the same conditions, and that any difference between them is due to natural or common variation. The primary purpose of homogeneity is to limit the amount of variability within the subgroup data. One way to satisfy the homogeneity principle is to measure the subgroup variables within a short time period. Since we are not talking about producing widgets but software products, the issue of homogeneity of subgroup data is a judgement call that must be made by someone with extensive knowledge of the process being measured.

The principle of homogeneously subgrouped data is important when we consider the idea of rational subgrouping. That is, when we want to estimate process variability, we try to group the data so that assignable causes are more likely to occur between subgroups than within them. When subgroups contain non-homogeneous data, control limits become wider and control charts become less sensitive to assignable causes. Creating rational subgroups that minimize variation within subgroups always takes precedence over issues of subgroup size.
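The effect of subgrouping on control limits can be seen in a small sketch. It assumes an X-bar chart with subgroups of size 2, for which the standard limit factor is A2 = 1.880. The data, the two "sources," and the function names are hypothetical. The same twelve values are grouped two ways: once so that each subgroup comes from a single source, and once so that each subgroup mixes both sources.

```python
from statistics import mean

A2_FOR_N2 = 1.880  # standard X-bar chart factor for subgroups of size 2


def xbar_limits(subgroups):
    """Return (LCL, center line, UCL) for an X-bar chart with subgroups of size 2."""
    if any(len(g) != 2 for g in subgroups):
        raise ValueError("this sketch only handles subgroups of size 2")
    grand_mean = mean(mean(g) for g in subgroups)
    average_range = mean(max(g) - min(g) for g in subgroups)
    half_width = A2_FOR_N2 * average_range
    return grand_mean - half_width, grand_mean, grand_mean + half_width


def signaling_subgroups(subgroups):
    """Indexes of subgroups whose average falls outside the control limits."""
    lcl, _, ucl = xbar_limits(subgroups)
    return [i for i, g in enumerate(subgroups) if not lcl <= mean(g) <= ucl]


# Hypothetical values from two sources with different levels (about 10 and about 30).
rational = [(10, 11), (9, 10), (11, 9), (30, 31), (29, 30), (31, 29)]  # one source per subgroup
mixed = [(10, 30), (11, 31), (9, 29), (10, 30), (11, 31), (9, 29)]     # both sources per subgroup

rational_signals = signaling_subgroups(rational)  # every subgroup signals
mixed_signals = signaling_subgroups(mixed)        # nothing signals
```

With one source per subgroup, the within-subgroup ranges are small, the limits are tight, and the difference between the sources shows up as out-of-limit subgroup averages. With mixed subgroups, the difference between sources inflates every range, the limits widen, and the chart shows nothing unusual.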

Using Control Charts

Now let's examine how control charts can be used to investigate process stability and lead to process improvement. There are a number of different kinds of control charts (please see [Florac 99] for a more detailed discussion of this and other topics). In software environments, measurements often occur only as individual values. As a result, there may be a preference for using individuals and moving range (XmR) charts to examine the time-sequenced behavior of process data.

For example, Figure 3 shows an XmR control chart for the number of reported but unresolved problems backlogged over the first 30 weeks of system testing. The chart indicates that the problem resolution process is stable, and that it is averaging about 20 backlogged problems (the center line, CL, equals 20.4), with an average change in backlog of 4.35 problems from week to week. The upper control limit (UCL) for backlogged problems is about 32, and the lower control limit (LCL) is about 8. If future backlogs were to exceed these limits or show other forms of nonrandom behavior, it would be likely that the process has become unstable. The causes should then be investigated. For instance, if the upper limit is exceeded at any point, this could be a signal that there are problems in the problem-resolution process. Perhaps a particularly thorny defect is consuming resources, causing problems to pile up. If so, corrective action must be taken if the process is to be returned to its original (characteristic) behavior.


Figure 3: Control Chart for the Backlog of Unresolved Problems
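For readers who want to see how limits of this kind are obtained, the following is a minimal sketch of the usual XmR calculation. It uses the standard scaling factors for individuals and moving range charts (2.66 for the individuals limits and 3.268 for the upper moving range limit). The function name and the sample data are hypothetical; the data are not the values behind Figure 3.

```python
from statistics import mean


def xmr_limits(values):
    """Compute center lines and control limits for an XmR chart.

    values: individual measurements in time order (at least two).
    """
    if len(values) < 2:
        raise ValueError("need at least two values to form a moving range")
    # Moving range: absolute change between successive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "x_cl": x_bar,                    # center line, individuals chart
        "x_ucl": x_bar + 2.66 * mr_bar,   # upper limit, individuals chart
        "x_lcl": x_bar - 2.66 * mr_bar,   # lower limit, individuals chart
        "mr_cl": mr_bar,                  # center line, moving range chart
        "mr_ucl": 3.268 * mr_bar,         # upper limit, moving range chart
    }


weekly_backlog = [20, 24, 18, 22, 16]  # hypothetical weekly counts
limits = xmr_limits(weekly_backlog)
```

For count data such as a backlog, a computed lower limit below zero would normally be replaced by zero or omitted; that adjustment is left out of the sketch.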

We must be careful not to misinterpret the limits on the individual observations and moving ranges that are shown in the control chart. These limits are estimates for the limits of the process, based on measurements of the process performance. The process limits together with the center lines are sometimes referred to as the "voice of the process."

The performance indicated by the voice of the process is not necessarily the performance that needs to be provided to meet the customer's requirements. If the variability and location of the measured results are such that the process does not meet the customer requirement or specification (e.g., produces too many nonconforming products), the process must be improved. This means reducing the process performance variability, moving the average, or both.
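The comparison between the voice of the process and the customer's requirement can be written as a simple check. This is only a sketch: it presumes the process has already been shown to be stable, the function name is hypothetical, and the requirement values used below are invented for illustration. The limits of about 8 and about 32 are the approximate figures quoted for the backlog example.

```python
def is_capable(natural_lower, natural_upper, spec_lower=None, spec_upper=None):
    """True if the natural process limits lie inside the specification limits.

    A specification limit of None means there is no requirement on that side.
    """
    if spec_lower is not None and natural_lower < spec_lower:
        return False
    if spec_upper is not None and natural_upper > spec_upper:
        return False
    return True


# Natural process limits of roughly 8 and 32 backlogged problems.
meets_loose_requirement = is_capable(8, 32, spec_upper=40)  # hypothetical: backlog must stay under 40
meets_tight_requirement = is_capable(8, 32, spec_upper=25)  # hypothetical: backlog must stay under 25
```

In the second case the process is stable but not capable, so the remedy is to change the process itself (reduce its variability, move its average, or both) rather than to hunt for assignable causes.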

Usage Considerations

1. When analyzing process performance data, all sources of variation in the process must be identified. If a conscious effort is not made to account for the potential sources of variation, variations that could help to improve the process might inadvertently be hidden or obscured. Even worse, it could lead to a faulty analysis. When data are aggregated, the results will be particularly susceptible to overlooked or hidden sources of variation. Overly aggregated data come about in many ways, but the most common causes are:

  • inadequately formulated operational definitions of product and process measures
  • inadequate description and recording of context information
  • lack of traceability from data back to the context from which it originated
  • working with data whose elements are combinations (mixtures) of values from non-homogeneous sources or different cause systems

Overly aggregated data easily lead to:

  • difficulty in identifying instabilities in process performance
  • difficulty in tracking instabilities to assignable causes
  • using results from unstable processes to draw inferences or make predictions about capability or performance
  • anomalous process behavior patterns

2. When measured values of continuous variables have insufficient granularity (i.e., are coarse and imprecise), the discreteness that results can mask the underlying process variation. Computations for the average and sigma can then be affected, and individual values that are rounded or truncated in the direction of the nearest control limit can easily give false out-of-control signals.

There are four main causes of coarse data: inadequate measurement instruments, imprecise reading of the instruments, rounding, and taking measurements at intervals that are too short to permit detectable variation to occur. When measurements are not obtained and recorded with sufficient precision to describe the underlying variability, digits that contain useful information will be lost. If the truncation or rounding reduces the precision in recorded results to only one or two digits that change, the running record of measured values will show only a few levels of possible outcomes.
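The loss of information from rounding can be illustrated with a few invented values. In this sketch (hypothetical data and function names), six measurements that differ in their first and second decimal places collapse to a single level when rounded to whole units, and the average moving range computed from the rounded record drops to zero.

```python
from statistics import mean


def distinct_levels(values):
    """Number of different values that appear in the running record."""
    return len(set(values))


def average_moving_range(values):
    """Average absolute change between successive observations."""
    return mean(abs(b - a) for a, b in zip(values, values[1:]))


raw = [10.12, 10.47, 9.88, 10.31, 9.95, 10.05]  # hypothetical measurements
rounded = [round(v) for v in raw]               # recorded with too little precision

raw_levels = distinct_levels(raw)               # 6 levels
rounded_levels = distinct_levels(rounded)       # 1 level
rounded_mr = average_moving_range(rounded)      # 0: the variation has vanished from the record
```

Limits computed from the rounded record would have no width at all, so the first later value that rounds to a different whole number would appear to be out of control even though the process had not changed.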

3. Control charts can be used to serve many different purposes. Control charts can be helpful for monitoring processes from release to release to compare overall performance. They can be used for making process adjustments to ensure that stability is maintained for a process on a daily or weekly basis. Most importantly, control charts may be used for continuous improvement of a process that is stable and capable. It is important to keep in mind, however, that control charts provide the most value to the people or team where the process knowledge resides.

Management can also help set the example of how not to use the control charts. While the control charts can be used to improve personal performance, management should not misuse this tool or the data. Management has to remember that the old saw "we will continue the beatings until morale improves" comes into play whenever measurements are used as part of the "beating." Clearly, dysfunctional behavior is likely to occur if employees perceive that measurements are being used in this way.

There is evidence that Shewhart's control charts can play a significant role in measuring process performance consistency and process predictability. Successful implementers of this technique recognize the importance of 1) understanding the concepts of variation, data homogeneity, common cause systems, and rational subgrouping, and 2) fully understanding the process and subprocesses being measured. Furthermore, they have used the control charts to measure process performance at the subprocess (and lower) level, realizing that there is far too much variation in the overall process to be helpful in identifying possible actions for improvement.

These software organizations have come to appreciate the value added when control charts are used to provide engineers and managers with quantitative insights into the behavior of their software development processes. In many ways the control chart is a form of instrumentation. Much like an oscilloscope, a temperature probe, or a pressure gauge, it provides data to guide decisions and judgements by process knowledgeable engineers and managers.

Maturity

While SPC is not a new technology (this technique has been applied in manufacturing for years), it has only recently been applied to software engineering improvement. Organizations are starting to become aware of SPC, getting appropriate training, and starting to apply SPC. To get started, many organizations are analyzing inspection data using SPC.

References and Information Sources

[Austin 96]

Robert D. Austin, Measuring and Managing Performance in Organizations, Dorset House Publishing, ISBN: 0-932633-36-6, New York, NY, 1996.

[Basili 92]

V.R. Basili, Software Modeling and Measurement: The Goal/Question/Metric Paradigm, University of Maryland, CS-TR-2956, UMIACS-TR-92-96, 1992.

[Brassard 94]

Michael Brassard and Diane Ritter, The Memory Jogger II, GOAL/QPC, Methuen, MA, 1994.

[Burr 96]

Adrian Burr and Mal Owen, Statistical Methods for Software Quality, ISBN 1-85032-171-X, International Thomson Computer Press, Boston, MA, 1996.

[Deming 86]

W. Edwards Deming, Out of the Crisis, MIT Center for Advanced Engineering Study, Cambridge, MA, 1986.

[Florac 99]

William A. Florac and Anita D. Carleton, Measuring the Software Process: Statistical Process Control for Software Process Improvement, Addison-Wesley, 1999.

[Hare 95]

Lynne B. Hare, Roger W. Hoerl, John D. Hromi, and Ronald D. Snee, The Role of Statistical Thinking in Management, ASQC Quality Progress, Vol. 28, No. 2, February 1995, pp. 53-60.

[Humphrey 95]

Watts S. Humphrey, A Discipline for Software Engineering, ISBN 0-201-54610-8, Addison-Wesley Publishing Company, Reading, MA, 1995.

[Ishikawa 86]

K. Ishikawa, Guide to Quality Control, Asian Productivity Organization, Tokyo, Japan, (available from Unipub - Kraus International Publications, White Plains, NY) 1986.

[Paulk 95]

Carnegie Mellon University, Software Engineering Institute (Principal Contributors and Editors: Mark C. Paulk, Charles V. Weber, Bill Curtis, and Mary Beth Chrissis), The Capability Maturity Model: Guidelines for Improving the Software Process, ISBN 0-201-54664-7, Addison-Wesley Publishing Company, Reading, MA, 1995.

[Wheeler 92]

Donald J. Wheeler and David S. Chambers, Understanding Statistical Process Control, Second Edition, SPC Press, Knoxville, TN, 1992.

[Wheeler 98]

Donald J. Wheeler and Sheila R. Poling, Building Continual Improvement: A Guide for Business, SPC Press, Knoxville, TN, 1998.

Current Author/Maintainer

Anita Carleton, SEI

External Reviewers

Bill Florac, SEI

Modifications

February 28, 2001: Original



The Software Engineering Institute (SEI) is a federally funded research and development center sponsored by the U.S. Department of Defense and operated by Carnegie Mellon University.

Copyright 2007 by Carnegie Mellon University
URL: http://www.sei.cmu.edu/str/descriptions/spc_body.html
Last Modified: 11 January 2007