Evaluating Distributed Systems Architectures for Fault-Tolerant Applications

The telecommunications industry has accumulated a large body of experience with fault-tolerant distributed systems architecture. This presentation focuses on key topics to consider when evaluating a proposed architecture for asynchronous, event-driven applications whose quality attributes include stringent requirements for availability, reliability, and evolvability. A representative list of such topics includes:

  • Processing Model
  • Interprocess Communication
  • Redundancy Model
  • Fault Management and Recovery
  • Graceful Degradation Under Load
  • Operational Management and Maintenance
  • System Debugging Environment

Architecture and design patterns derived from telecommunications industry best practices will be discussed to provide additional insight into the proven practices used in deployed fault-tolerant commercial systems. In addition, there will be discussion of how these topics and patterns can be applied within the SEI Architecture Tradeoff Analysis Method (ATAM) for software architecture evaluation.

PRESENTATION

Author

James Scott (Boeing Satellite Systems)

This presentation is related to the following area(s) of work:

Software Architecture

Software Engineering Institute
May 2008
