Verifying Distributed, Adaptive Real-Time (DART) Systems
Created September 2017
Distributed, adaptive real-time (DART) systems could transform the military, transportation, energy, and health care. But they must satisfy safety-critical requirements. We developed a method to verify DART systems and generate assured code.
Verifying Systems of Mixed Criticality Is a Challenge
DART systems are cyber-physical systems consisting of several agents that communicate and coordinate to achieve their goals. They self-adapt to their environment to improve their likelihood of success. This smart adaptation promises to help the Department of Defense (DoD) achieve its missions in dynamic and uncertain environments. The ability to deploy DART systems as teams of distributed, coordinating agents with high confidence in mission success enhances the DoD's capability significantly. DART systems also have the potential to revolutionize transportation, energy, and health care.
To realize its potential, DART software must be engineered to have high assurance and be certified to operate safely and effectively. Systems must satisfy guaranteed, safety-critical requirements, such as collision avoidance among autonomous unmanned vehicles. These requirements must be guaranteed because failure can have life-threatening consequences. They must also use self-adaptation to achieve mission-critical requirements in contested environments. For example, in video coverage during surveillance, a DART system can tolerate the small failure of missing a frame in a video stream by adapting to work around the missing frame.
To help target our research to DoD-relevant problems, we work with Dr. Stanley Bak of the Air Force Research Laboratory at Wright-Patterson Air Force Base.
Our Solution Verifies DART Systems and Generates Assured Code
We developed an engineering method for producing high-assurance software for cyber-physical systems composed of multiple agents (such as a team of robots) that communicate, coordinate, and adapt to uncertain environments to achieve safety-critical and mission-critical goals. We combined model-driven engineering with evidence-generating analysis, language design, verification, and self-adaptation techniques and tools.
In the architecture of a DART system, each node must have two types of threads: (1) a safety-critical or high-critical thread (HCT) with guaranteed objectives even in the presence of overloads and a mission-critical and (2) a low-critical thread (LCT) with objectives that are guaranteed in the absence of overloads. Each thread implements periodic, real-time software. HCTs are temporally isolated from LCTs so they do not miss deadlines, even under unpredictable overload conditions. We achieved this temporal isolation with mixed-criticality scheduling. This architecture maps to safety-integrity levels proposed by safety-critical software certification standards, such as DO-178B.
Verification with Automated Analyses
Our new mixed-criticality temporal protection mechanisms isolate HCTs from LCTs that share hardware resources. Isolation manages contention for shared resources and prevents criticality-inversion violations, so an LCT cannot cause an HCT to miss its deadline. This method not only protects HCTs from LCTs but also allows HCTs to steal resources from LCTs if needed.
An algorithm based on software model checking rigorously verifies that a DART system meets its high-critical objectives. This scalable algorithm guarantees that HCTs are schedulable. It verifies only HCTs and abstracts away their environment (including LCTs and commercial, off-the-shelf components) conservatively.
We also developed a novel self-adaptation approach that helps DART systems achieve their low-critical objectives by syncing them with changes in the environment. The approach includes a model of the current and predicted environment based on local and remote information, a model of the current and predicted system reflecting the expected outcome of adaptations, and a latency- and uncertainty-aware adaptation algorithm.
The quality of the adaptation approach is evaluated via statistical model checking, which uses Monte Carlo simulations with known precision to estimate the probability that the adaptation will lead to a successful mission. We developed a distributed statistical model checker, called DEMETER, to perform this analysis of scalability using high-performance computing infrastructure.
We developed an executable, domain-specific language called the DART Modeling and Programming Language (DMPL). With DMPL, users can program DART systems, verify safety and probabilistic properties, prove real-time schedulability, and then generate code. Using DMPL, we developed multi-agent scenarios that incorporate verified collision-avoidance protocols, mixed-criticality real-time scheduling, and proactive self-adaptation algorithms. See the DART GitHub page for more information.
Exhaustively testing critical safety properties of unmanned autonomous systems to ensure that such systems will behave as expected is cost prohibitive. We collaborate with specific groups at AFRL to ensure that our research is targeted to DoD-relevant problems. Led by Dr. Stanley Bak, staff from AFRL Wright-Patterson have collaborated with us to integrate our engineering approach and analysis techniques with AFRL's work on developing and verifying control algorithms to produce a software stack for a high-assurance, end-to-end, cyber-physical system.
Modeling, Verifying, and Generating Software for Distributed Cyber-Physical Systems using DMPL and AADL.
October 06, 2016 Conference Paper
This paper provides an end-to-end framework where DART systems can be designed, analyzed, and implemented within the same toolchain. In this talk, the authors present this toolchain and demonstrate it on a few representative examples.read
Verifying Cyber-Physical Systems by Combining Software Model Checking with Hybrid Systems Reachability
October 01, 2016 Conference Paper
This work proposes a bridge between two important verification methods, software model checking and hybrid systems reachability.read
October 22, 2015 Poster
This poster describes the authors' research efforts in verifying distributed adaptive real-time systems.read
October 16, 2015 Presentation
This 2015 Research Review presentation describes the authors' research efforts in verifying distributed adaptive real-time systems.read
September 07, 2015 Conference Paper
This short paper introduces our architecture and approach to engineering a DART system so that we achieve high assurance in its runtime behavior against a set of formally specified requirements.read