2023 Research Review
Portend: Drift Planning in ML Systems
While machine learning (ML) models can be very powerful, they are known to be brittle when presented with data outside their training distribution. The transition from inputs where a model performs well to inputs where it performs poorly can be abrupt and dramatic, a “digital cliff.” Potential causes for falling off the cliff include data drift, concept drift, or simply the deployment of an operational system in a new environment. Complicating this problem, models often do not know when they are wrong, leading to potential disaster as systems blindly continue “over the cliff.”
To address this issue, we developed Portend, a set of tools that enable data scientists to build monitoring and alerting agents that detect and respond to data drift or out-of-distribution data in ML systems. Specifically, in Portend we are
- creating a library of drift induction functions that simulate drifted data in common image recognition contexts relevant to intelligence, surveillance, and reconnaissance (ISR) scenarios
- extending the drift analysis and detection capabilities of the toolset developed in the FY21 Line-funded Augur project to include alerting and automated system responses in production (e.g., warn, retrain, or take the system offline), demonstrating timely drift monitoring in the context of image recognition
- deploying the toolset in a simulated environment, representative of commonly used aerial surveillance platforms, to provide and validate a range of monitoring capabilities for both anticipated drifts and drifts that are low probability but high impact
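The three activities above can be illustrated with a small, self-contained sketch. It is not Portend's API: the function names (`induce_haze`, `ks_statistic`, `choose_response`), the choice of mean pixel intensity as the monitored feature, and the alert thresholds are all assumptions made for illustration. The sketch applies a simple haze-like drift to synthetic images, scores the drift with a two-sample Kolmogorov-Smirnov statistic, and maps the score to one of the responses named above.

```python
import random
from bisect import bisect_right


def induce_haze(image, strength):
    """Hypothetical drift induction: blend every pixel toward white (1.0)."""
    return [[(1.0 - strength) * p + strength for p in row] for row in image]


def mean_intensity(image):
    """Summary feature for one image: its average pixel value."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def ks_statistic(reference, production):
    """Two-sample Kolmogorov-Smirnov statistic (largest gap between empirical CDFs)."""
    ref, prod = sorted(reference), sorted(production)
    gap = 0.0
    for x in ref + prod:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_prod = bisect_right(prod, x) / len(prod)
        gap = max(gap, abs(cdf_ref - cdf_prod))
    return gap


def choose_response(drift_score, warn_at=0.3, retrain_at=0.6, offline_at=0.9):
    """Map a drift score to a response. Thresholds are illustrative assumptions."""
    if drift_score >= offline_at:
        return "offline"
    if drift_score >= retrain_at:
        return "retrain"
    if drift_score >= warn_at:
        return "warn"
    return "ok"


def random_image(rng, size=8):
    """Synthetic stand-in for a grayscale image with pixel values in [0, 1)."""
    return [[rng.random() for _ in range(size)] for _ in range(size)]


rng = random.Random(0)  # fixed seed so runs are repeatable
reference_batch = [random_image(rng) for _ in range(50)]
production_batch = [random_image(rng) for _ in range(50)]
reference_features = [mean_intensity(img) for img in reference_batch]

responses = {}
for strength in (0.0, 0.05, 0.5):
    drifted = [induce_haze(img, strength) for img in production_batch]
    score = ks_statistic(reference_features, [mean_intensity(img) for img in drifted])
    responses[strength] = choose_response(score)
    print(f"haze={strength:.2f} ks={score:.2f} response={responses[strength]}")
```

A real monitor would likely track richer features than mean intensity (for example, model confidence or embedding distances) and would calibrate thresholds against the cost of each response. The structure stays the same: induce drift offline to learn what it looks like, then score production data and act on the score.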
At the end of this project, we will have advanced the state of the art by showing that drift planning reduces model failures. We will have advanced the state of the practice by improving image-based drone localization.
In Context: This FY2023-24 project
- builds on the CMU SEI’s expertise in machine learning, software engineering, metric development, and operational relevance and extends the toolset, drift induction methods, and drift detection metrics implemented in the FY21 Augur project
- aligns with the CMU SEI technical objective to be trustworthy in construction and implementation and resilient in the face of operational uncertainties, including known and yet unseen adversary capabilities
- aligns with the OUSD(R&E) critical technology priority of enabling operational effectiveness through trusted artificial intelligence (AI) and autonomy
Principal Investigator
Dr. Jeffery Hansen
Senior Machine Learning Research Scientist
CMU Collaborators
Lena Pons
Software Architecture and AI Researcher
Sebastian Echeverria
Senior Software Engineer
David Walbeck
Senior Software Engineer
Dr. Eric Heim
Senior Research Scientist - Machine Learning
Have a Question?
Reach out to us at info@sei.cmu.edu.