CERT-SEI

Zero-Slack Scheduling

In complex military systems such as airplanes and UAVs, there is an increasing need to consolidate more and more functions onto a single processor. This consolidation is challenging when the functions have different criticalities. For instance, one function may control the stability of the airplane while another processes video for a surveillance mission. The function that controls flight stability is safety-critical (i.e., a failure in this function can crash the airplane); the video-processing function is not. The difference in criticality complicates the verification, validation, and certification of whole systems because functions of different criticality are held to different certification standards. In particular, while a complex and expensive verification process may be used for safety-critical functions, a simpler and cheaper process must be applied to non-safety-critical functions in order to keep the cost under control.

Unfortunately, if functions of different criticality share a processor, a failure in a low-criticality function may propagate to a higher-criticality one. For example, the faulty function may execute longer than expected and delay the critical task beyond its tolerance, leaving it, for instance, unable to compensate for crosswinds in time. Given this possibility of failure propagation, preserving the quality of the verification of the safety-critical functions would require applying the same complex verification process to the lower-criticality functions. Fortunately, if we can ensure that high-criticality tasks are protected against failures in lower-criticality tasks, then we can still apply different verification processes to functions of different criticality.

The Air Force Research Laboratory (AFRL) has recognized these challenges and created an initiative called the "Mixed-Criticality Architecture Requirements" (MCAR) to investigate the technology required to implement these protection mechanisms.

Zero-Slack Mixed-Criticality Scheduling is a scheduler that provides temporal protection of high-criticality tasks against lower-criticality ones (i.e., it ensures that high-criticality tasks are not delayed past their deadlines). In particular, during an overload, we ensure that higher-criticality tasks are able to finish on time (meet their deadlines), even at the expense of lower-criticality ones.
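To make temporal protection concrete, the following is a minimal sketch in Python, not the scheduler's actual implementation. It assumes a single period with two tasks advancing in unit time steps, where the low-criticality task happens to have the higher scheduling priority. The runtime slack check used to decide when to suspend the low-criticality task is a simplification chosen for illustration. The function name `simulate`, its parameters, and all numbers are hypothetical.

```python
def simulate(deadline, high_work, low_work, protect):
    """Simulate one period with two tasks in unit time steps.

    Assumption: the low-criticality task has the higher scheduling
    priority, so it runs first whenever it has work pending.
    With protect=True, the low-criticality task is suspended as soon as
    the high-criticality task has no slack left, where
    slack = deadline - now - remaining high-criticality work.

    Returns the time at which the high-criticality task finishes,
    or None if it misses its deadline.
    """
    t = 0
    while t < deadline:
        slack = deadline - t - high_work
        low_runs = low_work > 0 and not (protect and slack <= 0)
        if low_runs:
            low_work -= 1
        elif high_work > 0:
            high_work -= 1
            if high_work == 0:
                return t + 1
        t += 1
    return None


# Hypothetical numbers: deadline of 10 time units, 4 units of critical work.
nominal = simulate(deadline=10, high_work=4, low_work=3, protect=False)
overrun_unprotected = simulate(deadline=10, high_work=4, low_work=8, protect=False)
overrun_protected = simulate(deadline=10, high_work=4, low_work=8, protect=True)

print(nominal)              # 7: no overload, the critical task finishes early
print(overrun_unprotected)  # None: the overrun makes the critical task miss
print(overrun_protected)    # 10: the critical task finishes at its deadline
```

In the protected run, the low-criticality task still executes for as long as the critical task has slack to spare, and it is cut short only at the last moment at which the critical task can still meet its deadline.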

Zero-Slack Q-RAM

Zero-slack scheduling is a scheduling framework for real-time systems of mixed criticality. Specifically, it targets systems where the utilization-based scheduling priorities are not aligned with the criticality of the tasks. With this framework, we implemented a family of schedulers, resource-allocation protocols, and synchronization protocols to support the scheduling of mixed-criticality systems.
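The misalignment between scheduling priority and criticality can be illustrated with a short Python sketch. It assumes rate-monotonic priority assignment (shorter period means higher priority) as one possible priority scheme. The task names, periods, criticality levels, and the helper functions are all hypothetical and are not taken from the framework itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    period_ms: int    # hypothetical period
    criticality: int  # larger value means more critical


def rate_monotonic_order(tasks):
    """Return tasks from highest to lowest priority (shortest period first)."""
    return sorted(tasks, key=lambda task: task.period_ms)


def criticality_inversions(tasks):
    """List (higher_priority, more_critical) pairs where priority and
    criticality disagree, i.e. a less critical task can preempt a more
    critical one."""
    order = rate_monotonic_order(tasks)
    return [
        (first.name, second.name)
        for i, first in enumerate(order)
        for second in order[i + 1:]
        if first.criticality < second.criticality
    ]


# Hypothetical task set: the video task has the shorter period, so it
# gets the higher rate-monotonic priority despite being less critical.
EXAMPLE_TASKS = [
    Task("flight_stability", period_ms=50, criticality=2),
    Task("video_processing", period_ms=20, criticality=1),
]

print(criticality_inversions(EXAMPLE_TASKS))
# [('video_processing', 'flight_stability')]
```

Any pair reported here is a case in which an overrun of the less critical task could delay the more critical one, which is the situation the framework is meant to handle.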

Zero-slack Q-RAM combines zero-slack rate-monotonic scheduling with the QoS resource-allocation model (Q-RAM) to enable overbooking, in which the same CPU cycles are allocated to more than one task. Zero-slack Q-RAM allows overbooking not only between tasks of different criticality but also among tasks with different utility to the mission of the system. In a given cycle, if a more critical task must execute, that task uses the cycle; otherwise, a task of lower criticality executes.
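The per-cycle rule can be sketched in Python as follows. This is a deliberately simplified illustration under stated assumptions: one period, two tasks, and a plain "more critical task first" choice in each cycle. It does not model utility-based selection or how the actual scheduler decides that a critical task must execute. The function `run_period` and all numbers are hypothetical.

```python
def run_period(cycles, high_demand, low_demand):
    """Allocate each cycle of one period to at most one of two tasks.

    In each cycle the more critical task runs if it still needs CPU time;
    otherwise the less critical task uses the cycle.
    Returns (timeline, high_finished, low_finished), where the timeline
    uses 'H', 'L', and '-' for high, low, and idle cycles.
    """
    high_left, low_left = high_demand, low_demand
    timeline = []
    for _ in range(cycles):
        if high_left > 0:
            high_left -= 1
            timeline.append("H")
        elif low_left > 0:
            low_left -= 1
            timeline.append("L")
        else:
            timeline.append("-")
    return "".join(timeline), high_left == 0, low_left == 0


# Hypothetical booking: the critical task may need up to 6 cycles in an
# overload but usually needs 3; the less critical task needs 5.
# 6 + 5 = 11 cycles are booked on a 10-cycle period (overbooking).
print(run_period(cycles=10, high_demand=3, low_demand=5))
# ('HHHLLLLL--', True, True)
print(run_period(cycles=10, high_demand=6, low_demand=5))
# ('HHHHHHLLLL', True, False)
```

In the common case both tasks complete even though the period is overbooked. Only when the critical task actually consumes its overload budget does the less critical task lose cycles.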

We developed several experiments to determine the effects of this scheduler on a drone mission. First, we demonstrated that an inappropriate scheduling policy can actually crash a drone. The following video shows how the increasing demands of lower-criticality tasks decrease flight safety.



Second, we showed that in a full mission, zero-slack Q-RAM not only preserves the safety of the flight but also maximizes the utility of the mission. The demo in the following video shows a surveillance mission in which a video-streaming task and an object-recognition task are dynamically adjusted according to their utility to the mission.