2021 Research Review / DAY 2
Combined Analysis for Source Code and Binary Code for Software Assurance
Many DoD entities need software assurance for both source code and binary code, as well as mixed systems (e.g., source code plus binary libraries). While many highly capable tools exist for static analysis of source code, tools for software assurance of binaries are fewer and much more limited. The objective of this line of work is to evaluate the feasibility of decompiling binaries for the purpose of (a) static analysis and (b) localized repairs to functions of the binary. More specifically, we aim to (1) develop a tool for determining whether individual functions have been correctly decompiled, (2) measure what percentage of functions are decompiled correctly on typical real-world binary code, and (3) measure how closely static analysis on decompiled code approximates static analysis on the original source code.
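As a rough illustration of the two measurements described above (the share of functions decompiled correctly, and how closely static-analysis results on decompiled code track those on the original source), the following Python sketch shows one way such figures could be computed. It is not the project's tool: the `Alert` record, the per-function verdict dictionary, and the choice of precision and recall as the agreement measure are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """A static-analysis finding keyed by function and checker (hypothetical schema)."""
    function: str
    checker: str


def fraction_correct(verdicts: dict[str, bool]) -> float:
    """Share of functions whose decompilation was judged correct.

    `verdicts` maps a function name to True (correct) or False (incorrect);
    how that verdict is obtained is outside the scope of this sketch.
    """
    if not verdicts:
        return 0.0
    return sum(verdicts.values()) / len(verdicts)


def alert_agreement(source_alerts: set[Alert],
                    decompiled_alerts: set[Alert]) -> dict[str, float]:
    """Compare alerts on decompiled code against alerts on the original source.

    Source-code alerts are treated as the reference set (an assumption):
    recall is the share of source alerts also raised on decompiled code,
    precision is the share of decompiled-code alerts also raised on source.
    """
    matched = source_alerts & decompiled_alerts
    recall = len(matched) / len(source_alerts) if source_alerts else 1.0
    precision = len(matched) / len(decompiled_alerts) if decompiled_alerts else 1.0
    return {"precision": precision, "recall": recall}
```

Matching alerts by function and checker name is a deliberate simplification; line numbers are unlikely to correspond between original source and decompiled code, so a real comparison would need a more careful notion of alert equivalence.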
We adapt an existing open-source decompiler (in particular, Ghidra) to produce decompiled code suitable for static analysis and/or repair, and we evaluate it with real-world (optimized) binary files. This project lays the groundwork for further work (including a follow-on FY22 project) to (1) enable the DoD to more accurately perform software assurance for projects that include binary components and (2) develop a framework for making localized repairs (either manual or automated) to functions of a binary library or executable.
This line of work, if successful, will enable the DoD to find and fix potential vulnerabilities in binary code that might otherwise be cost-prohibitive to investigate or repair, thereby increasing the trustworthiness of fielded software. Our collaborators and interested transition partners at the DoD have binaries for which they want software assurance; they will help us evaluate and improve our tool, and they will be able to use it in practice once it is ready.
This FY2021 Project
- builds on DoD line-funded research on automated repair of code for integer overflow, inference of memory bounds, and automated code repair to ensure memory safety
- aligns with the CMU SEI technical objective to make software trustworthy in construction, correct in implementation, and resilient in the face of operational uncertainties, including known and yet-unseen adversary capabilities