
Using Automation to Prioritize Alerts from Static Analysis Tools

Validating and repairing defects discovered by static analysis tools can require more effort from auditors and coders than organizations have available. CERT researchers are developing a method to automatically classify and prioritize alerts, helping auditors and coders address large volumes of alerts with less effort.

The Static Analysis Challenge: Sorting the Problems You Have from the Problems You Don’t Have

Federal agencies and other organizations face an overwhelming number of security challenges in their software. Static analysis (SA) tools attempt to automatically identify defects in software products, including those that could lead to security vulnerabilities. These tools define a set of conditions for a well-behaved program.

SA tools analyze the program to find violations of those conditions by examining possible data flows and control flows, without executing the program. Then they produce diagnostic messages, or alerts, about purported flaws in the source code. According to organizational priorities, human auditors then evaluate the validity of these alerts and repair confirmed flaws.
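To make that workflow concrete, the sketch below shows the kind of record an SA alert typically carries when it reaches an auditor. The field names are illustrative assumptions only and do not correspond to any particular tool's output format.

```python
# Illustrative sketch of an SA alert record; field names are hypothetical,
# not taken from any specific tool's output format.
from dataclasses import dataclass

@dataclass
class Alert:
    tool: str      # SA tool that produced the alert
    checker: str   # tool-specific rule or checker identifier
    path: str      # source file the alert points to
    line: int      # line number within that file
    message: str   # diagnostic text describing the purported flaw

example = Alert(
    tool="toolA",
    checker="NULL_DEREF",
    path="src/parser.c",
    line=142,
    message="Possible null pointer dereference of 'buf'",
)
```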

SA tool producers improve the methods their tools use to check for code flaws, devising new algorithms that analyze faster, use less memory or disk space, work more precisely, or find more true positives. Tool producers also work to increase their coverage of code flaw taxonomies such as CWE, SEI CERT Coding Standards, and MISRA C.

As software assurance tools identify more kinds of code flaws, more true flaws and false positives are reported as alerts. SA tools also exhibit false negatives, meaning they sometimes do not produce a warning when a true code flaw exists. Development organizations attempt to address this problem by running multiple SA tools on each codebase to increase the types of code flaws they can find. However, this approach compounds the problem of having too many alerts—both true and false positives.

For most large codebases, using just one or two general SA tools generates too many alerts for a team to address within a project’s budget and schedule. Auditors and coders urgently need an automated method to classify true- and false-positive alerts, along with automated support to organize and prioritize alerts so they can evaluate and address them effectively and efficiently.

Our Collaborators

Three DoD organizations have agreed to provide sanitized alert audit data to support our research. We also work with collaborators at MITRE on mappings between SEI CERT Coding Standards and Common Weakness Enumerations (CWEs). Dr. Claire Le Goues, from the Carnegie Mellon University Computer Science Department, serves as an advisor with expertise on assuring high-quality software systems.


Our Automated Solution: Classifying Alerts and Prioritizing Them for Action

We began with our audit archives from previous static analyses of 20 codebases by the CERT Source Code Analysis Laboratory (SCALe) code conformance service. The SCALe tool uses multiple commercial, open-source, and experimental SA tools to analyze codebases for potential flaws. By using multiple tools, SCALe detects more code defects than any single SA tool would find. To expand our data set, three DoD collaborators provided sanitized audit data from their own codebases, analyzed by an enhanced research prototype version of SCALe.

Our solution fuses alerts from different tools that report the same code flaw at the same location. Fusion requires mapping alerts from different tools to a code flaw taxonomy; in 2016, we used the SEI CERT Coding Standards. The script we created performs this fusion and additional analysis, counting alerts per file, alerts per function, and the depth of each affected file within the codebase.
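The sketch below illustrates the fusion step under simplifying assumptions: alerts are represented as dictionaries whose tool-specific checkers have already been mapped to a shared taxonomy identifier (a hypothetical cert_rule field), and alerts are grouped by file, line, and mapped rule. The field names and grouping key are illustrative, not SCALe's actual schema.

```python
# A simplified sketch of alert fusion, assuming each alert has already been
# mapped to a taxonomy condition (the hypothetical "cert_rule" field).
from collections import defaultdict

def fuse_alerts(alerts):
    """Group alerts that report the same condition at the same location."""
    groups = defaultdict(list)
    for alert in alerts:
        key = (alert["path"], alert["line"], alert["cert_rule"])
        groups[key].append(alert)
    # Each fused alert records the location, the mapped rule, and which tools agreed.
    return [
        {
            "path": path,
            "line": line,
            "cert_rule": rule,
            "tools": sorted({a["tool"] for a in group}),
        }
        for (path, line, rule), group in groups.items()
    ]
```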

CERT researchers use the results of this analysis as “features” for the classifiers. Features are the data fields that the classifiers’ mathematical algorithms analyze. They include data gathered by code metrics tools and general SA tools about the program, file, function, and other categories relevant to each alert. These features help us develop more accurate classifiers.
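As a rough illustration, the sketch below derives a few features of the kind described above (alerts per file, alerts per function, and the directory depth of the affected file) from fused alerts. These are not the exact features or feature names used in the CERT classifiers.

```python
# Hypothetical feature extraction over fused alerts (dictionaries with
# "path", optional "function", and "tools" keys); illustrative only.
from collections import Counter

def extract_features(fused_alerts):
    per_file = Counter(a["path"] for a in fused_alerts)
    per_function = Counter(a["function"] for a in fused_alerts if a.get("function"))
    features = []
    for a in fused_alerts:
        features.append({
            "alerts_in_same_file": per_file[a["path"]],
            "alerts_in_same_function": per_function.get(a.get("function"), 0),
            "file_depth": a["path"].count("/"),   # depth of the file in the source tree
            "num_tools_agreeing": len(a.get("tools", [])),
        })
    return features
```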

We classify alerts into one of three categories:

  • expected true positive (e-TP)
  • expected false positive (e-FP)
  • indeterminate (I)

We assign membership to one of these classes by using probabilities the classifiers produce and user-specified thresholds. Using these assigned classifications, auditors could then put e-TPs into a set of code flaws to be fixed, ignore e-FPs, and prioritize I alerts for manual auditing according to level of confidence, cost to repair, and estimated risk if not repaired.
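A minimal sketch of that thresholding step appears below: the classifier's predicted probability that an alert is a true positive is compared against user-specified cutoffs. The threshold values shown are placeholders, not the ones used in the research.

```python
# Placeholder thresholds; in practice these are user-specified.
def categorize(p_true_positive, e_tp_threshold=0.95, e_fp_threshold=0.05):
    if p_true_positive >= e_tp_threshold:
        return "e-TP"  # expected true positive: queue for repair
    if p_true_positive <= e_fp_threshold:
        return "e-FP"  # expected false positive: can be set aside
    return "I"         # indeterminate: prioritize for manual audit
```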

In 2016, we created two types of classifiers:

  1. all data, with the rule names used as a feature
  2. per rule, which uses only data with alerts mapped to that particular coding rule

(See the SEI blog post Prioritizing Security Alerts: A DoD Case Study for more details.)
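The sketch below contrasts the two classifier types under stated assumptions: audited alerts are available as (feature dictionary, true/false label) pairs, each feature dictionary includes a hypothetical cert_rule entry, and scikit-learn's logistic regression stands in for the classification algorithms actually evaluated in the research.

```python
# Hedged sketch of the two classifier types; logistic regression is a stand-in,
# and the feature/label format is an assumption, not the research's data schema.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_all_data(records):
    # All-data classifier: every audited alert is used, and the rule name
    # stays in the feature dictionary as just another feature.
    X = [features for features, _ in records]
    y = [label for _, label in records]
    model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
    return model.fit(X, y)

def train_per_rule(records, rule):
    # Per-rule classifier: only alerts mapped to one coding rule are used,
    # so the rule name carries no information and is dropped from the features.
    subset = [(f, label) for f, label in records if f.get("cert_rule") == rule]
    X = [{k: v for k, v in f.items() if k != "cert_rule"} for f, _ in subset]
    y = [label for _, label in subset]
    model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
    return model.fit(X, y)
```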

Using the largest data set, the all-data classifiers achieved between 88% and 91% precision. Among the single-rule classifiers, only three had sufficient data for us to have confidence in their predictions.

Now our work focuses on rapidly increasing the number of per-condition classifiers using conditions from two taxonomies: CWE and SEI CERT Coding Standards.
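Per-condition classification depends on mapping each tool-specific checker to conditions in both taxonomies. The sketch below shows one simple way such a mapping might be represented; the checker names are invented examples, though CWE-476 and EXP34-C are the real identifiers for null-pointer dereference in CWE and the SEI CERT C Coding Standard.

```python
# Hypothetical checker-to-condition mapping; checker names are invented examples.
CHECKER_TO_CONDITIONS = {
    ("toolA", "NULL_DEREF"): {"cwe": "CWE-476", "cert": "EXP34-C"},
    ("toolB", "null-dereference"): {"cwe": "CWE-476", "cert": "EXP34-C"},
}

def conditions_for(alert):
    """Return the CWE and CERT conditions mapped to an alert's checker, if known."""
    return CHECKER_TO_CONDITIONS.get((alert["tool"], alert["checker"]), {})
```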

Looking Ahead

We want our research to result in tools that automate the process of classifying alerts, routing alerts to appropriate work groups, and prioritizing indeterminate alerts. We plan to integrate automated classifiers with SCALe, so it filters and prioritizes alerts using the classification and prioritization scheme from this research. Our goal is more secure code and lower costs, enabled by efficient direction of human efforts for manual alert auditing and code repair.

Learn More

Challenges and Progress: Automating Static Analysis Alert Handling with Machine Learning
Presentation, April 23, 2018, by Lori Flynn
Lori Flynn describes some of the accomplishments and challenges of the FY16-17-18 classifier research she led.

Prioritizing Security Alerts: A DoD Case Study
Blog Post, January 23, 2017, by Lori Flynn
Federal agencies and other organizations face an overwhelming security landscape. The arsenal available to these organizations for securing software includes static analysis tools, which search code for flaws, including those that could lead to software vulnerabilities. The sheer effort required...

Static Analysis Alert Audits: Lexicon & Rules
Conference Paper, November 03, 2016, by Lori Flynn, William Snavely, and David Svoboda
In this paper, the authors provide a suggested set of auditing rules and a lexicon for auditing static analysis alerts.

Prioritizing Alerts from Static Analysis with Classification Models
Presentation, November 01, 2016, by Lori Flynn
In this presentation, Lori Flynn describes work toward an automated and accurate statistical classifier, intended to efficiently use analyst effort and to remove code flaws.

Prioritizing Alerts from Static Analysis with Classification Models
Poster, October 18, 2016, by Lori Flynn
This poster describes CERT Division research on an automated and accurate statistical classifier.

Prioritizing Alerts from Static Analysis to Find and Fix Code Flaws
Blog Post, June 06, 2016, by Lori Flynn
In 2015, the National Vulnerability Database (NVD) recorded 6,488 new software vulnerabilities, and the NVD documents a total of 74,885 software vulnerabilities discovered between 1988-2016. Static analysis tools examine code for flaws, including those that could lead to software security...