Portend: Drift Planning in ML Systems

While machine learning (ML) models can be very powerful, they are known to be brittle when presented with data outside their training distribution. The transition from inputs where a model performs well to inputs where it performs poorly can be abrupt and dramatic, a “digital cliff.” Causes for falling off the cliff include data drift, concept drift, or simply the deployment of an operational system in a new environment. Complicating the problem, models often do not know when they are wrong, which can lead to disaster as systems blindly continue “over the cliff.”

Jeffery Hansen

Senior Machine Learning Research Scientist

To address this issue, we developed Portend, a set of tools that enable data scientists to build monitoring and alerting agents that detect and respond to data drift or out-of-distribution data in ML systems. Specifically, in Portend we are

creating a library of drift induction functions that simulate drifted data in common image recognition contexts relevant to intelligence, surveillance, and reconnaissance (ISR) scenarios
extending the drift analysis and detection capabilities of the toolset developed in the FY21 Line-funded Augur project to include alerting and automated system responses in production (e.g., warn, retrain, or take the system offline), demonstrating timely drift monitoring in the context of image recognition
deploying the toolset in a simulated environment, representative of commonly used aerial surveillance platforms, to provide and validate a range of monitoring capabilities for both anticipated drifts and drifts that are low probability but high impact
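
The three activities above can be illustrated with a minimal Python sketch. It is not Portend's actual API: every name here (induce_haze, drift_score, choose_response, the RESPONSES thresholds) is hypothetical, images are stood in for by flat lists of pixel intensities in [0, 1], and the detector is a deliberately simple brightness-shift score rather than any metric the project uses. The sketch only shows how a drift induction function, a drift metric, and a tiered response (warn, retrain, take offline) might fit together.

```python
import random
import statistics


def induce_haze(pixels, severity, haze_level=0.9):
    """Hypothetical drift induction: blend pixels toward a bright haze value.

    severity is in [0, 1]; 0 leaves the image unchanged, 1 is pure haze.
    """
    if not 0.0 <= severity <= 1.0:
        raise ValueError("severity must be in [0, 1]")
    return [(1.0 - severity) * p + severity * haze_level for p in pixels]


def drift_score(reference_images, batch_images):
    """Shift in mean image brightness, in units of the reference spread."""
    ref = [statistics.fmean(img) for img in reference_images]
    cur = [statistics.fmean(img) for img in batch_images]
    spread = statistics.stdev(ref) or 1e-9  # guard against zero spread
    return abs(statistics.fmean(cur) - statistics.fmean(ref)) / spread


# Assumed thresholds, checked from most to least severe response.
RESPONSES = [(6.0, "offline"), (3.0, "retrain"), (1.5, "warn")]


def choose_response(score):
    """Map a drift score to the most severe response whose threshold it meets."""
    for threshold, action in RESPONSES:
        if score >= threshold:
            return action
    return "ok"


def make_images(rng, count, size=64):
    """Synthetic stand-in for image data: uniform random pixel intensities."""
    return [[rng.random() for _ in range(size)] for _ in range(count)]


rng = random.Random(0)
reference = make_images(rng, 200)
clean_batch = make_images(rng, 200)
hazy_batch = [induce_haze(img, severity=0.8) for img in make_images(rng, 200)]

for name, batch in [("clean", clean_batch), ("hazy", hazy_batch)]:
    score = drift_score(reference, batch)
    print(f"{name}: score={score:.2f} response={choose_response(score)}")
```

Inducing drift at several severities and checking which response fires is one way to validate thresholds for anticipated drifts before deployment. In practice, the thresholds and the metric would need to be chosen for the specific model and data.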

At the end of this project, we will have advanced the state of the art by showing that drift planning reduces model failures. We will have advanced the state of the practice by improving image-based drone localization.

In Context: This FY2023-24 project

builds on the CMU SEI’s expertise in machine learning, software engineering, metric development, and operational relevance, and extends the toolset, drift induction methods, and drift detection metrics implemented in the FY21 Augur project
aligns with the CMU SEI technical objective to be trustworthy in construction and implementation and resilient in the face of operational uncertainties, including known and yet unseen adversary capabilities
aligns with the OUSD(R&E) critical technology priority of enabling operational effectiveness through trusted artificial intelligence (AI) and autonomy

Software Engineering Institute

Research Review 2023