
CaTE: Increasing Warfighter Trust in Autonomous Systems to Accelerate Adoption of Cutting-Edge Capabilities

Created October 2025

For the Department of Defense (DoD) to safely and quickly adopt new software capabilities, especially autonomous systems, we must improve every step of the development and engineering process to accelerate their availability while engendering trust in those new capabilities. To bridge the gap between cutting-edge research and practical implementation, the Under Secretary of Defense for Research and Engineering (USD(R&E)) asked the Software Engineering Institute (SEI) to pilot the Center for Calibrated Trust Measurement and Evaluation (CaTE), a program that concluded in 2025. CaTE’s findings help organizations develop trustworthy and trust-promoting systems so that the DoD can deploy those systems for our warfighters more quickly and safely.

Can Our Warfighters Trust AI Systems?

Artificial intelligence (AI) is powering the proliferation of autonomous systems, many of which provide unprecedented advantages to warfighters on the battlefield. However, these systems introduce new engineering challenges during development that can complicate the delivery of safe and reliable capabilities. Among these challenges is the lack of established standards for testing and measuring calibrated trust in these systems. Because AI is capable of making autonomous decisions, it is imperative that warfighters can trust the AI-enabled systems they use. Finding a way to test trustworthiness before deployment is therefore key to ensuring that we can deliver reliable systems quickly.

At the request of USD(R&E), CaTE launched in 2023 as a pilot program to establish standards and guidance for making AI systems more understandable to the humans who use them and for developing and using those systems responsibly. The SEI completed its work on the CaTE pilot in the spring of 2025. The results of that work are collected in a series of deliverables that demonstrate how to transition state-of-the-art, trust-promoting designs and trustworthy practices from research laboratories into operational use. Chief among those deliverables, the SEI produced the Reference Architecture for Assuring Ethical Conduct in LAWS and the CaTE Guidebook for the Development and TEVV of LAWS to Promote Trustworthiness, which together organize all of CaTE’s resources, including those developed by our collaborators.


Our Collaborators

During the CaTE pilot, the SEI partnered with the National Robotics Engineering Center (NREC) and Johns Hopkins University Applied Physics Laboratory (JHU/APL) to assemble comprehensive expertise in mission systems, robotics, autonomy, and AI and machine learning (ML). This collaboration provided the diverse skills and perspectives necessary to address the complex and demanding challenges of developing trustworthy autonomous systems.

The SEI and its partner organizations produced several key publications that outline our findings and best practices for implementing trustworthy AI systems in defense contexts. These publications, listed below, provide actionable guidance for transitioning research innovations into operational capabilities.


CaTE Advances Trustworthiness for Autonomous Systems

The CaTE program pilot explored each of the following four key components:

  • developing foundational trust measures and integrating them into prototypes for real-world testing while maintaining protections for human subjects
  • investigating how these measures could calibrate assurance with users through technology-based solutions, tooling, and training programs
  • establishing a testing program accessible to the DoD, industry, and academic organizations to validate their technologies against these standards
  • experimenting with methods for machines to interpret human trust levels across different autonomy modes through behavioral observations

This multifaceted approach aimed to create practical frameworks for measuring and calibrating the bidirectional roles and responsibilities in human-machine interactions, ultimately providing the foundation for more effective and trustworthy autonomous systems in defense applications.

The CaTE pilot investigated emerging practices in AI, ML, and autonomy, as well as technologies from across industry and academia. These innovations spanned critical areas, including user-interface designs and human-machine interaction approaches that enhance operator confidence. The pilot also studied advances in data curation and autonomous system design that improve system reliability, and it highlighted improvements in ML model development alongside enhanced approaches to developmental testing, operational testing, evaluation, verification, and validation.

Through this comprehensive approach, CaTE demonstrated practical pathways for accelerating the adoption of trustworthy AI and autonomous systems in defense applications. Our findings resulted in two main deliverables. The Reference Architecture for Assuring Ethical Conduct in LAWS provides a framework for creating systems that can embody and govern ethical principles through requirements and system design. The CaTE Guidebook for the Development and TEVV of LAWS to Promote Trustworthiness provides operational test and evaluation (OT&E) and developmental test and evaluation (DT&E) personnel with observations and recommendations for effectively developing, testing, evaluating, verifying, and validating (TEVV) lethal autonomous weapons systems (LAWS).


Software and Tools

MatchnMetric

A CaTE metric and algorithm that boosts the robustness of automatic target recognition (ATR) systems through simulated perfect tracking.


Mutation-Based Safety Testing for Autonomy: MOBSTA

Robustness testing for your ROS robots.


Center for Calibrated Trust Measurement and Evaluation (CaTE)—Guidebook for the Development and TEVV of LAWS to Promote Trustworthiness

White Paper

This guidebook supports personnel in the development and testing of autonomous weapon systems that employ ML, focusing on system reliability and operator trust.


Reference Architecture for Assuring Ethical Conduct in LAWS

White Paper

This reference architecture provides guidance to reason about designing and developing ML-enabled autonomous systems that have the capability to use lethal force.
