Abstract: “Good tests are important in software development, but it can be hard to tell whether tests will reveal future faults that are themselves unknown. Mutation analysis, which checks whether tests reveal inserted changes in a program, is a strong measure of test suite adequacy, but common source- or compiler- level approaches to mutation testing are not applicable to software available only in binary form. We explore mutation analysis as an application of the reassembleable disassembly approach to binary rewriting, building a tool for x86 binaries on top of the previously-developed Uroboros system, and apply it to the C benchmarks from SPEC CPU 2006 and to five examples of embedded control software. The results demonstrate that our approach works effectively across these software domains: as expected, tests designed for performance benchmarking reveal fewer mutants than tests generated to achieve high code coverage, but mutation scores indicate differences in test origins and features such as code size and fault-tolerance. Our binary-level tool also achieves comparable results to source-level mutation analysis despite supporting a more limited set of mutation operators. More generally we also argue that our experience shows how reassembleable disassembly is a valuable approach for constructing novel binary rewriting tools.”
More like this (3)
A mutation in the novel coronavirus has led to a new strain viewed as more contagious...A mutation in the novel coronavirus has led to a new strain viewed as more contagious than the virus that emerged from China, according to a study led by Los Alamos researchers.
Abstract: “We present JQF, a platform for performing coverage-guided fuzz testing in Java. JQF is designed...Abstract: “We present JQF, a platform for performing coverage-guided fuzz testing in Java. JQF is designed both for practitioners, who wish to find bugs in Java programs, as well as for researchers, who wish to implement new fuzzing algorithms. Practitioners write QuickCheck-style test methods that take inputs as formal parameters. JQF instruments the test program’s bytecode and continuously executes tests using inputs that are generated in a coverage-guided fuzzing loop. JQF’s input-generation mechanism is extensible. Researchers can implement custom fuzzing algorithms by extending JQF’s Guidance interface. A Guidance instance responds to code coverage events generated during the execution of a test case, such as function calls and conditional jumps, and provides the next input. We describe several guidances that currently ship with JQF, such as: semantic fuzzing with Zest, binary fuzzing with AFL, and complexity fuzzing with PerfFuzz. JQF is a mature tool that is open-source and publicly available. At the time of writing, JQF has been successful in discovering 42 previously unknown bugs in widely used open-source software such as OpenJDK, Apache Commons, and the Google Closure Compiler.” Comments
Abstract: “Static program analyses are routinely applied as the basis of code optimizations and to detect...Abstract: “Static program analyses are routinely applied as the basis of code optimizations and to detect safety and security issues in software systems. For their results to be reliable, static analyses should be sound (i.e., should not produce false negatives) and precise (i.e., should report a low number of false positives). Even though it is possible to prove properties of the design of a static analysis, ensuring soundness and precision for its implementation is challenging. Complex algorithms and sophisticated optimizations make static analyzers difficult to implement and test. In this paper, we present an automatic technique to test, among other properties, the soundness and precision of abstract domains, the core of all static analyzers based on abstract interpretation. In order to cover a wide range of test data and input states, we construct inputs by applying sequences of abstract-domain operations to representative domain elements, and vary the operations through gray-box fuzzing. We use mathematical properties of abstract do- mains as test oracles. Our experimental evaluation demonstrates the effectiveness of our approach. We detected several previously unknown soundness and precision errors in widely-used abstract domains. Our experiments also show that our approach is more effective than dynamic symbolic execution and than fuzzing the test inputs directly.” Comments