2021 Research Review / DAY 3

Train but Verify: Towards Practical AI Robustness

The current challenges in training and verifying secure machine learning (ML) systems stem from two factors:

  1. the difficulty of enforcing quality attributes in a system that is trained on data instead of directly constructed from requirements
  2. the fundamental advantage held by an attacker, who needs to violate only a single security policy, while the defender must enforce all of them

The DoD has not been exempt from these challenges. The current state of the art in secure ML either trains systems to enforce a single security policy or trains auxiliary systems to detect violations of a single security policy; very little existing work addresses multiple security policies at once. For example, some DoD systems make high-stakes decisions yet were trained on sensitive data. Such systems should enforce at least two security policies simultaneously (i.e., the ML system should neither do the wrong thing when presented with adversarial input nor reveal sensitive information about the training data during its operation).
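
To make this concrete, the sketch below shows what a single training step might look like when it tries to respect two policies at once: robustness to adversarial examples, via FGSM adversarial training, and training-data privacy, via DP-SGD-style per-example gradient clipping and noise. This is a minimal PyTorch sketch under assumptions of our own; the model, data, and hyperparameters are placeholders, not the project's actual training code.

```python
# Minimal sketch (not the project's actual code): one training step that
# aims to enforce two security policies at once:
#   (1) robustness to adversarial examples, via FGSM adversarial training, and
#   (2) training-data privacy, via DP-SGD-style per-example gradient clipping
#       and Gaussian noise.
# Model, data, and hyperparameters are illustrative placeholders.
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.03):
    """Craft FGSM adversarial examples for the input batch x."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

def private_adversarial_step(model, optimizer, x, y,
                             epsilon=0.03, clip_norm=1.0, noise_mult=1.1):
    """One training step that combines both policies (sketch only)."""
    # Policy 1 (robustness): train on adversarially perturbed inputs.
    x_adv = fgsm_perturb(model, x, y, epsilon)

    # Policy 2 (privacy): clip each example's gradient, then add Gaussian
    # noise before the update, in the style of DP-SGD.
    optimizer.zero_grad()
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]
    for xi, yi in zip(x_adv, y):
        loss = F.cross_entropy(model(xi.unsqueeze(0)), yi.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (total_norm + 1e-6), max=1.0)
        for s, g in zip(summed, grads):
            s.add_(g * scale)

    # Average the clipped, noised gradients and apply the update.
    for p, s in zip(params, summed):
        noise = torch.randn_like(s) * noise_mult * clip_norm
        p.grad = (s + noise) / x.shape[0]
    optimizer.step()
```

Per-example clipping is what makes the privacy half of the sketch meaningful: clipping only the averaged batch gradient would not bound any single record's influence on the update.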

In the “Train, but Verify” project, we aim to address this gap in the state of the art in the secure training of ML systems through two objectives:

  1. Train secure AI systems by training ML models to enforce at least two security policies.
  2. Verify the security of AI systems by testing against declarative, realistic threat models.
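
Objective 2 refers to declarative threat models, i.e., statements of what an attacker can see and do, made independently of any particular attack implementation. The sketch below is a hypothetical illustration of what such a declaration might contain; the field names and values are our own assumptions, not the project's actual schema.

```python
# Hypothetical sketch of a declarative threat model: the fields and values
# are illustrative assumptions, not the project's actual schema. The idea is
# to state what the attacker can see and do, then test a trained model
# against every attack implementation that fits the declaration.
from dataclasses import dataclass

@dataclass
class ThreatModel:
    goal: str                          # e.g., "evasion", "membership-inference", "poisoning"
    knowledge: str                     # "white-box" or "black-box"
    perturbation_norm: str = "linf"    # constraint on allowed input changes, if any
    perturbation_budget: float = 0.03  # maximum allowed perturbation size
    query_budget: int = 10_000         # how many times the attacker may query the model
    controls_training_data: bool = False

# Two example declarations covering different policies.
evasion = ThreatModel(goal="evasion", knowledge="white-box",
                      perturbation_norm="linf", perturbation_budget=8 / 255)
privacy = ThreatModel(goal="membership-inference", knowledge="black-box",
                      query_budget=1_000)
```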

Juneberry, an architecture we created to support our research on Train, but Verify, improves the experience of machine learning experimentation. It provides a framework for automating the training, evaluation, and comparison of multiple models against multiple datasets, reducing errors and improving reproducibility. We’ve made Juneberry available to other researchers on GitHub: https://github.com/cmu-sei/Juneberry
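
To illustrate the kind of workflow Juneberry automates, the generic sketch below trains and evaluates several models against several datasets and tabulates comparable scores. It uses plain scikit-learn for brevity; it is not Juneberry's actual configuration format or API.

```python
# Generic illustration of the experiment pattern Juneberry automates:
# evaluate every model against every dataset and collect comparable scores.
# This uses plain scikit-learn and is not Juneberry's configuration format or API.
from sklearn.datasets import load_iris, load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

datasets = {"iris": load_iris(return_X_y=True),
            "wine": load_wine(return_X_y=True)}
models = {"logreg": LogisticRegression(max_iter=1000),
          "random_forest": RandomForestClassifier(n_estimators=100)}

results = {}
for data_name, (X, y) in datasets.items():
    for model_name, model in models.items():
        # Cross-validated accuracy makes runs comparable and reproducible.
        scores = cross_val_score(model, X, y, cv=5)
        results[(data_name, model_name)] = scores.mean()

for (data_name, model_name), acc in sorted(results.items()):
    print(f"{data_name:>6} | {model_name:<14} | accuracy = {acc:.3f}")
```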

We consider security policies from the Beieler taxonomy: ensure that an ML system does not learn the wrong thing during training (e.g., through data poisoning), do the wrong thing during operation (e.g., when presented with adversarial examples), or reveal the wrong thing during operation (e.g., through model inversion or membership inference).
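
As one example from the “reveal the wrong thing” category, a common membership inference baseline guesses that a record was in the training set whenever the model’s loss on it is unusually low. The sketch below illustrates that baseline on public data; the model, split, and threshold are placeholders rather than one of the project’s evaluations.

```python
# Minimal loss-threshold membership-inference baseline ("reveal the wrong thing"):
# guess that a record was in the training set when the model's loss on it is low.
# Illustrative only: the model, data split, and threshold are placeholders.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

def per_example_loss(model, X, y):
    """Negative log-likelihood of the true label for each record."""
    probs = model.predict_proba(X)[np.arange(len(y)), y]
    return -np.log(np.clip(probs, 1e-12, 1.0))

train_loss = per_example_loss(model, X_train, y_train)
test_loss = per_example_loss(model, X_test, y_test)

# Guess "member" when the loss falls below a threshold chosen from the loss scale.
threshold = np.median(np.concatenate([train_loss, test_loss]))
guesses = np.concatenate([train_loss, test_loss]) < threshold
truth = np.concatenate([np.ones_like(train_loss), np.zeros_like(test_loss)]).astype(bool)

attack_accuracy = (guesses == truth).mean()
print(f"membership-inference attack accuracy: {attack_accuracy:.2f}  (0.5 = chance)")
```

Attack accuracy well above 0.5 indicates that the model leaks information about which records it was trained on.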

In Context

This FY2020-22 project

  • aligns with the CMU SEI technical objective to be trustworthy in construction and implementation and resilient in the face of operational uncertainties, including known and yet unseen adversary capabilities

More from Day 3 of Research Review 2021

AI Engineering in an Uncertain World

Principal Investigator Eric Heim

Predicting Inference Degradation in Production ML Systems

Principal Investigator Grace Lewis