
Using Automation to Prioritize Alerts from Static Analysis Tools

Created September 2017

Validating and repairing defects discovered by static analysis tools can require more effort from auditors and coders than organizations have available. CERT researchers are developing a method to automatically classify and prioritize alerts to help auditors and coders address large volumes of alerts with less effort.

The Static Analysis Challenge: Sorting the Problems You Have from the Problems You Don’t Have

Federal agencies and other organizations face an overwhelming number of security challenges in their software. Static analysis (SA) tools attempt to automatically identify defects in software products, including those that could lead to security vulnerabilities. These tools define a set of conditions for a well-behaved program.

SA tools analyze the program to find violations of those conditions by examining possible data flows and control flows, without executing the program. Then they produce diagnostic messages, or alerts, about purported flaws in the source code. According to organizational priorities, human auditors then evaluate the validity of these alerts and repair confirmed flaws.

SA tool producers improve the methods their tools use to check for code flaws, devising new algorithms that analyze faster, use less memory or disk space, work more precisely, or find more true positives. Tool producers also work to increase their coverage of code flaw taxonomies such as CWE, SEI CERT Coding Standards, and MISRA C.

As software assurance tools identify more kinds of code flaws, more true flaws and false positives are reported as alerts. SA tools also exhibit false negatives, meaning they sometimes do not produce a warning when a true code flaw exists. Development organizations attempt to address this problem by running multiple SA tools on each codebase to increase the types of code flaws they can find. However, this approach compounds the problem of having too many alerts—both true and false positives.

For most large codebases, using just one or two general SA tools generates too many alerts for a team to address within a project’s budget and schedule. Auditors and coders urgently need an automated method to classify true- and false-positive alerts. They also need automated support to organize and prioritize SA tool alerts so that they can evaluate and address them effectively and efficiently.

Our Collaborators

Three DoD organizations have agreed to provide sanitized alert audit data to support our research. We also work with collaborators at MITRE on mappings between SEI CERT Coding Standards and Common Weakness Enumerations (CWEs). Dr. Claire Le Goues, from the Carnegie Mellon University Computer Science Department, serves as an advisor with expertise on assuring high-quality software systems.


Our Automated Solution: Classifying Alerts and Prioritizing Them for Action

We began with our audit archives from previous static analyses of 20 codebases by the CERT Source Code Analysis Laboratory (SCALe) code conformance service. The SCALe tool uses multiple commercial, open-source, and experimental SA tools to analyze codebases for potential flaws. By using multiple tools, SCALe detects more code defects than any single SA tool would. To expand our data set, three DoD collaborators provided sanitized audit data from their own codebases, analyzed by an enhanced research prototype version of SCALe.

Our solution fuses alerts from different tools that report the same code flaw at the same location. Fusion requires mapping alerts from different tools to a code flaw taxonomy; in 2016, we used the SEI CERT Coding Standards. The script we created performs fusion and additional analysis, counting alerts per file, alerts per function, and the depth of affected files within the codebase.
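
To illustrate the fusion step (a simplified sketch, not the actual SCALe implementation), alerts that different tools report for the same rule at the same location can be merged by keying on the file, line, and mapped coding rule. The field names below are hypothetical.

    # Hypothetical alert fusion: merge alerts from different tools that map to
    # the same CERT rule at the same file and line. Field names are illustrative
    # and do not reflect SCALe's actual data schema.
    from collections import defaultdict

    def fuse_alerts(alerts):
        """Group raw tool alerts by (file, line, mapped CERT rule)."""
        fused = defaultdict(list)
        for alert in alerts:
            key = (alert["file"], alert["line"], alert["cert_rule"])
            fused[key].append(alert["tool"])
        # Each fused alert records which tools reported it.
        return [
            {"file": f, "line": ln, "cert_rule": rule, "tools": sorted(set(tools))}
            for (f, ln, rule), tools in fused.items()
        ]

    alerts = [
        {"tool": "toolA", "file": "src/parser.c", "line": 42, "cert_rule": "EXP33-C"},
        {"tool": "toolB", "file": "src/parser.c", "line": 42, "cert_rule": "EXP33-C"},
        {"tool": "toolA", "file": "src/util.c",   "line": 10, "cert_rule": "INT31-C"},
    ]
    print(fuse_alerts(alerts))   # the two parser.c alerts fuse into one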

CERT researchers use the results of this analysis as “features” for the classifiers. Features are the types of data that the classifiers’ mathematical algorithms analyze. They include data gathered by code metrics tools and general SA tools about the program, file, function, and other categories relevant to each alert. These features help us develop more accurate classifiers.
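
As a simplified example of what such features might look like, the sketch below derives a few of the counts mentioned above for a single fused alert. The feature names and the depth calculation are illustrative assumptions; the real feature set is much richer.

    import os

    def extract_features(alert, all_alerts):
        """Build an illustrative feature record for one fused alert."""
        same_file = [a for a in all_alerts if a["file"] == alert["file"]]
        same_func = [a for a in same_file
                     if a.get("function") == alert.get("function")]
        return {
            "alerts_in_file": len(same_file),
            "alerts_in_function": len(same_func),
            # Depth of the affected file within the codebase directory tree.
            "file_depth": alert["file"].count(os.sep),
            "num_tools_reporting": len(alert.get("tools", [])),
        }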

We classify alerts into one of three categories:

  • expected true positive (e-TP)
  • expected false positive (e-FP)
  • indeterminate (I)

We assign each alert to one of these classes using the probabilities the classifiers produce and user-specified thresholds. Using these assigned classifications, auditors could then put e-TPs into a set of code flaws to be fixed, ignore e-FPs, and prioritize I alerts for manual auditing according to level of confidence, cost to repair, and estimated risk if not repaired.
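
A minimal sketch of this thresholding step follows, assuming a classifier that outputs a probability that each alert is a true positive. The threshold values shown are placeholders for the user-specified ones.

    def triage(alerts_with_probs, tp_threshold=0.9, fp_threshold=0.1):
        """Partition alerts into e-TP, e-FP, and indeterminate sets."""
        e_tp, e_fp, indeterminate = [], [], []
        for alert, p_true in alerts_with_probs:
            if p_true >= tp_threshold:
                e_tp.append(alert)          # expected true positive: queue for repair
            elif p_true <= fp_threshold:
                e_fp.append(alert)          # expected false positive: can be set aside
            else:
                indeterminate.append((alert, p_true))
        # Indeterminate alerts could be further ordered by confidence, cost to
        # repair, and estimated risk before manual auditing.
        indeterminate.sort(key=lambda pair: pair[1], reverse=True)
        return e_tp, e_fp, indeterminate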

In 2016, we created two types of classifiers:

  1. all data, with the rule names used as a feature
  2. per rule, which uses only data with alerts mapped to that particular coding rule

(See the SEI blog post Prioritizing Security Alerts: A DoD Case Study for more details.)

Using the largest data set, the all-data classifiers achieved between 88% and 91% precision. For the single-rule classifiers, only three had sufficient data to give us confidence in the classifier predictions.
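
For illustration, the two classifier types could be set up roughly as follows. This sketch assumes a tabular data set of audited alerts with numeric feature columns, a "rule" column holding the mapped coding rule, and a "verdict" column holding the audit determination; scikit-learn's RandomForestClassifier stands in for the models actually evaluated in the research.

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier

    def train_all_data_classifier(df: pd.DataFrame):
        """One classifier over all audited alerts, with the rule name as a feature."""
        X = pd.get_dummies(df.drop(columns=["verdict"]), columns=["rule"])
        return RandomForestClassifier(random_state=0).fit(X, df["verdict"])

    def train_per_rule_classifiers(df: pd.DataFrame, min_samples=50):
        """One classifier per coding rule, using only alerts mapped to that rule."""
        models = {}
        for rule, group in df.groupby("rule"):
            if len(group) < min_samples:   # skip rules with too little audit data
                continue
            X = group.drop(columns=["verdict", "rule"])
            models[rule] = RandomForestClassifier(random_state=0).fit(X, group["verdict"])
        return models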

Now our work focuses on rapidly increasing the number of per-condition classifiers using conditions from two taxonomies: CWE and SEI CERT Coding Standards.

Software and Tools

SCALe Collection

August 2018

The CERT Division's Source Code Analysis Laboratory (SCALe) offers conformance testing of C and Java language software systems against the CERT C Secure Coding Standard and the CERT Oracle Secure Coding Standard for...

Source Code Analysis Laboratory (SCALe)

March 2012

In this report, the authors describe the CERT Program's Source Code Analysis Laboratory (SCALe), a conformance test against secure coding...

Looking Ahead

We want our research to result in tools that automate the process of classifying alerts, routing alerts to appropriate work groups, and prioritizing indeterminate alerts. We plan to integrate automated classifiers with SCALe, so it filters and prioritizes alerts using the classification and prioritization scheme from this research. Our goal is more secure code and lower costs, enabled by efficient direction of human efforts for manual alert auditing and code repair.

Learn More

Rapid Adjudication of Static Analysis Alerts During Continuous Integration

November 07, 2021 Presentation
Lori Flynn

This project developed algorithms and a static analysis classification system for use with continuous integration, enabling more secure software with less...

Rapid Adjudication of Static Analysis Alerts During Continuous Integration

November 05, 2021 Video
Lori Flynn

This short video provides an introduction to a research topic presented at the SEI Research Review...

SCAIFE and ACR: Static Analysis Classification and Automated Code Repair

September 15, 2021 Presentation
Lori Flynn, William Klieber

Flynn and Klieber describe their research and concept for a combined system for static analysis classification and automated code repair....

Test Suites as a Source of Training Data for Static Analysis Classifiers

August 30, 2021 Video
Lori Flynn

This video by Lori Flynn was recorded as part of the ACM/IEEE International Conference on Automation of Software Test AST 2021 (co-located with ICSE)....

Static Code Analysis Classification

December 15, 2020 Video
Lori Flynn, William Klieber, Robert Schiela

Progress in research toward the rapid adjudication of static analysis alerts during continuous...

Rapid Adjudication of Static Analysis Alerts During Continuous Integration

December 15, 2020 Video
Lori Flynn, Robert Nord, Hasan Yasar

Progress in research toward the rapid adjudication of static analysis alerts during continuous...

Automating Static Analysis Alert Handling with Machine Learning: 2016-2018

September 21, 2018 Presentation
Lori Flynn

Author Lori Flynn gave this presentation to Raytheon's Systems and Software Assurance Technology Interest...

Prioritizing Alerts from Multiple Static Analysis Tools, Using Classification Models

August 14, 2018 Conference Paper
Lori Flynn, William Snavely, David Svoboda, Nathan M. VanHoudnos, Richard Qin, Jennifer Burns, David Zubrow, Robert W. Stoddard, Guillermo Marce-Santurio

This paper was accepted by the SQUADE workshop at ICSE 2018. It describes the development of several classification models for the prioritization of alerts produced by static analysis tools and how those models were tested for accuracy....

Test Suites as a Source of Training Data for Static Analysis Alert Classifiers

April 29, 2018 Blog Post
Lori Flynn, Zachary Kurtz

Numerous tools exist to help detect flaws in code. Some of these are called flaw-finding static analysis (FFSA) tools because they identify flaws by analyzing code without running...

Challenges and Progress: Automating Static Analysis Alert Handling with Machine Learning

April 23, 2018 Presentation
Lori Flynn

Lori Flynn describes some of the accomplishments and challenges of the FY16-17-18 classifier research she led....
