Automated Code Repair
Created September 2017
Finding security flaws in source code is daunting; fixing them is an even greater challenge. We are creating automated tools that can repair bugs automatically or that prompt developers for more information to make effective repairs.
Vast Amounts of Code Have Many Security Vulnerabilities
CERT Division Source Code Analysis Laboratory (SCALe) reviews of software from the U.S. Department of Defense (DoD) and other sources show that most software contains many vulnerabilities. Most security flaws are caused by simple coding errors. Static analysis tools, typically used late in the development process, produce a huge number of diagnostics. Even after excluding false positives, the volume of true positives can overwhelm the abilities of development teams to fix the code. Consequently, the team eliminates only a small percentage of the vulnerabilities. Meanwhile, the existing installed codebases in the DoD now consist of billions of lines of C code that contain an unknown number of security vulnerabilities.
Most analyzers provide basic diagnostics but do not provide automated fixes or code modifications. Integrated development environments (IDEs), such as Eclipse, offer some automated code modification. Some IDEs can fix specific compilation errors, as Quick Fix does in Eclipse. IDEs also provide some refactoring options, but refactorings are not intended to change the behavior of the code; instead, they improve some aspect of its design.
Existing techniques for addressing security problems in code often require programmers to add more information, such as annotations and attributes, that can then be post-processed. These techniques are effective when developing new code, but applying them to existing programs has the same practical limitation as manually addressing thousands of diagnostics. We need a better way to fix existing code.
Collaborators
Our CERT Secure Coding team members are engaging DoD Software Assurance Community of Practice members. We have engaged with CERDEC to provide feedback and technology transition. Specifically, CERDEC will evaluate the integer-overflow repair tool on DoD codebases.
Our Solution: Automated Tools Look for Vulnerabilities and Fix Them
Our experience examining code shows that many security-relevant bugs follow common patterns that tools can automatically detect. There are corresponding patterns for repairing these bugs that tools can perform using automatic program transformation. We are developing automated source-code transformation tools to remediate vulnerabilities in code that are caused by violations of rules in the CERT Secure Coding Standards.
These tools convert noncompliant code into code that complies with the CERT standards. They reduce vulnerabilities without the need for developers to manually review thousands of diagnostics produced by static analysis tools. Sometimes our tools repair a bug completely automatically. In other cases, they prompt developers for more information when a little manual intervention can result in an effective repair.
We based our automated repair work on three premises:
- Many security bugs follow common patterns.
- By recognizing a pattern, a tool can make a reasonable guess about the developer's intention. We call this the inferred specification.
- A tool can repair the code to satisfy the inferred specification.
For example, malloc is a function that allocates a chunk of memory and returns a pointer to it. One common pattern of security bugs is a memory allocation such as “p = malloc(n * sizeof(T))”, where n is attacker-controlled. If n is too large, the multiplication n * sizeof(T) overflows and wraps around to a small value, so too little memory gets allocated, setting the stage for a buffer overflow. The inferred specification in the malloc case would be “Try to allocate enough memory to hold n objects of type T.” The tool inserts code that checks whether overflow occurs and, if it does, simulates malloc returning NULL due to insufficient memory.
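The kind of check described above can be sketched in C as follows. This is a minimal illustration of the pattern, not the output of the actual repair tool; the helper name checked_array_alloc is hypothetical and was chosen only for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (name is an assumption, not from the tool):
 * try to allocate room for n objects of elem_size bytes each.
 * If n * elem_size would not fit in size_t, behave as if malloc
 * had failed for lack of memory and return NULL.
 */
static void *checked_array_alloc(size_t n, size_t elem_size)
{
    /* n * elem_size overflows exactly when n > SIZE_MAX / elem_size. */
    if (elem_size != 0 && n > SIZE_MAX / elem_size) {
        return NULL;
    }
    return malloc(n * elem_size);
}
```

Under this sketch, the original call “p = malloc(n * sizeof(T))” would become “p = checked_array_alloc(n, sizeof(T))”. A caller that already handles a NULL return from malloc then handles the overflow case the same way, which is why simulating an out-of-memory failure matches the inferred specification.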
To develop our automated code repair tool, we extended Rose, a framework for source code transformation. Our goal is to reduce the number of rule violations that require manual inspection by two orders of magnitude—from thousands to tens. At this scope, a development team can mitigate all unhandled violations. Automated code repair reduces a system’s attack surface and improves its ability to withstand cyber attacks while sustaining critical functions.
Software and Tools
SCALe Collection
August 2018
The CERT Division's Source Code Analysis Laboratory (SCALe) offers conformance testing of C and Java language software systems against the CERT C Secure Coding Standard and the CERT Oracle Secure Coding Standard for...
Source Code Analysis Laboratory (SCALe)
March 2012
In this report, the authors describe the CERT Program's Source Code Analysis Laboratory (SCALe), a conformance test against secure coding...
Learn More
Combined Analysis for Source Code and Binary Code for Software Assurance
November 07, 2021 Presentation
William Klieber
This research highlights how to increase software assurance of binary components by analyzing and repairing...
Combined Analysis for Source Code and Binary Code for Software Assurance
November 04, 2021 Video
William Klieber
This short video provides an introduction to a research topic presented at the SEI Research Review...
Automated Code Repair to Ensure Spatial Memory Safety
June 01, 2021 Presentation
William Klieber, Ruben Martins, Ryan Steele, Matt Churilla, Mike McCall, David Svoboda
In this presentation, the authors discuss a technique for repairing C code to protect against potential violations of spatial memory...
Automated Code Repair for Memory Safety
December 15, 2020 Video
William Klieber, Lori Flynn, Robert Schiela
This work aims to develop techniques to eliminate security vulnerabilities at a lower cost than manual...
Automated Code Repair to Ensure Memory Safety (video)
November 11, 2019 Video
William Klieber
Watch SEI principal investigator Dr. Will Klieber discuss research to design and implement a technique to automatically repair all potential violations of memory safety in the source code so that the program is provably...
Automated Code Repair to Ensure Memory Safety in C Source Code (2019)
October 28, 2019 Poster
William Klieber
This is a poster reflecting research to automatically repair C source code to eliminate memory-safety...
Prioritizing Alerts from Static Analysis to Find and Fix Code Flaws
June 05, 2016 Blog Post
Lori Flynn
This SEI Blog post explores the importance of prioritizing alerts from static analysis tools to effectively identify and fix code flaws in software...