research-article

Understanding and Improving Coverage Tracking with AFL++ (Registered Report)

Published: 13 September 2024

Abstract

Coverage-based fuzzers track which parts of a program they visit when executing a specific input, and use this as a proxy measure to (1) guide the fuzzing process and (2) explore the state space of the program under test (PUT). One way to record coverage progress is to enumerate basic block pairs (i.e., edges in the control-flow graph) and use them to index into a hash table that holds counters. A counter is incremented every time a fuzzer's input exercises the corresponding edge. Traditionally, the coverage map has been a compact bitmap that fits into the L2 CPU cache to reduce runtime overhead and boost fuzzing throughput. In such a design, where space is traded for speed, two sources of imprecision can arise: (1) collisions and (2) arithmetic inaccuracies.
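As a rough illustration of the edge-indexed counter map described above, the following sketch replays a sequence of basic-block IDs and updates a map of 8-bit counters. It assumes the XOR-and-shift indexing described in the AFL technical whitepaper [20]; the block IDs, the function name `trace_edges`, and the 64 KiB map size are illustrative choices, and the instrumentation actually used by AFL++ may differ.

```python
MAP_SIZE = 1 << 16  # assumed map size: 65536 one-byte counters


def trace_edges(block_ids, map_size=MAP_SIZE):
    """Replay a sequence of basic-block IDs and return the counter map."""
    coverage = bytearray(map_size)
    prev = 0
    for cur in block_ids:
        # The edge (prev -> cur) selects one slot in the map.
        index = (cur ^ prev) % map_size
        # 8-bit counter: wraps around to 0 after 255.
        coverage[index] = (coverage[index] + 1) & 0xFF
        # Shifting keeps A->B and B->A from landing in the same slot.
        prev = cur >> 1
    return coverage


# Hypothetical block IDs; real instrumentation assigns them at compile time.
A, B, C = 0x1A2B, 0x3C4D, 0x5E6F
cov = trace_edges([A, B, C, B, C])
slot_b_to_c = (C ^ (B >> 1)) % MAP_SIZE
print(cov[slot_b_to_c])  # hit count recorded for the edge B -> C
```

In this trace the edge B -> C is exercised twice, so its slot holds a count of 2, while every other visited edge holds a count of 1.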
Collisions occur when two different basic block pairs hash to the same entry. Imprecision arises because both pairs are now counted together and the fuzzer cannot tell one from the other.
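To make the collision problem concrete, the sketch below first shows two distinct edges that share a slot under an assumed XOR-and-shift index (as in the AFL technical whitepaper [20]), and then estimates how many edges become indistinguishable for a given map size. The estimate assumes edges hash uniformly and independently, which real programs only approximate; the function names are hypothetical.

```python
def edge_index(prev, cur, map_size):
    """Assumed AFL-style slot index for the edge prev -> cur."""
    return (cur ^ (prev >> 1)) % map_size


def expected_occupied_slots(num_edges, map_size):
    """Expected number of distinct slots hit by uniformly hashed edges."""
    return map_size * (1.0 - (1.0 - 1.0 / map_size) ** num_edges)


def expected_edges_lost(num_edges, map_size):
    """Expected number of edges that share a slot with an earlier edge."""
    return num_edges - expected_occupied_slots(num_edges, map_size)


# Two different edges that land in the same slot.
print(edge_index(2, 5, 1 << 16), edge_index(0, 4, 1 << 16))

# Collision pressure grows quickly as the edge count approaches the map size.
for n in (1_000, 10_000, 65_536):
    print(n, round(expected_edges_lost(n, 1 << 16)))
```

Under the uniform-hashing assumption, doubling the map size roughly halves the expected loss when the number of edges is small relative to the map, which is the trade-off against cache footprint that the abstract refers to.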
Arithmetic inaccuracies refer to errors in the counting strategy. For example, a monotonically incrementing counter inside the hash table can overflow. This happens when the frequency of a control-flow transition exceeds the maximum value the counter can hold (e.g., in loops). Because execution frequencies follow power-law distributions, such overflows affect a small number of hash table entries. Another arithmetic inaccuracy results from range-based counters that capture only predefined frequency intervals (e.g., logarithmic counters).
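Both kinds of arithmetic inaccuracy can be reproduced in a few lines. The sketch below assumes 8-bit counters and the hit-count buckets used by classic AFL (1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+); the saturating counter is shown only as one possible alternative for comparison and is not taken from the report.

```python
def bump_wrapping(counter):
    """8-bit counter that silently wraps from 255 back to 0."""
    return (counter + 1) & 0xFF


def bump_saturating(counter):
    """Hypothetical alternative: stay at 255 instead of wrapping."""
    return min(counter + 1, 0xFF)


# (lowest count, highest count, bucket label), following classic AFL.
BUCKETS = [
    (1, 1, 1), (2, 2, 2), (3, 3, 4), (4, 7, 8),
    (8, 15, 16), (16, 31, 32), (32, 127, 64), (128, 255, 128),
]


def classify(count):
    """Map a raw 8-bit hit count to its range-based bucket label."""
    for low, high, label in BUCKETS:
        if low <= count <= high:
            return label
    return 0  # the edge was not observed


wrapped = saturated = 0
for _ in range(256):  # a loop edge taken 256 times
    wrapped = bump_wrapping(wrapped)
    saturated = bump_saturating(saturated)

print(wrapped, saturated)           # overflow hides the edge entirely
print(classify(40), classify(100))  # different counts, same bucket
```

After 256 hits the wrapping counter reads 0, so the edge looks as if it was never executed, and hit counts of 40 and 100 fall into the same bucket and are indistinguishable to the fuzzer.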
In 2018, CollAFL examined how collisions impact precision and presented a new hashing scheme to reduce the number of collisions. CollAFL did not address the problem of arithmetic inaccuracies. Furthermore, CollAFL considered only a single-core virtual machine and a limited set of benchmark programs, and did not explore hardware-specific effects (e.g., cache utilization for concurrent fuzzing processes).
This registered report aims to provide new insights into how collisions and arithmetic inaccuracies affect coverage tracking for fuzzing. We propose experiments on multiple hardware architectures with different cache topologies and on a more diverse set of benchmark programs. Leveraging the evaluation data, our aim is to determine precise architecture-aware settings for AFL++. Furthermore, we plan to demonstrate an adaptive optimization strategy that tunes the coverage map with respect to collisions and counting strategies for a specific combination of CPU architecture and PUT.

References

[1]
Alif Ahmed, Jason D. Hiser, Anh Nguyen-Tuong, Jack W. Davidson, and Kevin Skadron. 2021. BigMap: Future-proofing Fuzzers with Efficient Large Maps. In 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). 531–542. issn:2158-3927 https://doi.org/10.1109/DSN48987.2021.00062
[2]
Cornelius Aschermann, Sergej Schumilo, Tim Blazytko, Robert Gawlik, and Thorsten Holz. 2019. REDQUEEN: Fuzzing with Input-to-State Correspondence. In NDSS. 19, 1–15.
[3]
Christel Baier and Joost-Pieter Katoen. 2008. Principles of model checking. MIT press.
[4]
Pietro Borrello, Andrea Fioraldi, Daniele Cono D’Elia, Davide Balzarotti, Leonardo Querzoni, and Cristiano Giuffrida. 2024. Predictive Context-sensitive Fuzzing. In Proceedings 2024 Network and Distributed System Security Symposium. Internet Society, San Diego, CA, USA. isbn:978-1-891562-93-8 https://doi.org/10.14722/ndss.2024.24113
[5]
Peng Chen and Hao Chen. 2018. Angora: Efficient fuzzing by principled search. In 2018 IEEE Symposium on Security and Privacy (SP). 711–725.
[6]
Andrea Fioraldi, Dominik Maier, Heiko Eißfeldt, and Marc Heuse. 2020. AFL++: Combining incremental steps of fuzzing research. In 14th USENIX Workshop on Offensive Technologies (WOOT 20).
[7]
Shuitao Gan, Chao Zhang, Xiaojun Qin, Xuwen Tu, Kang Li, Zhongyu Pei, and Zuoning Chen. 2018. CollAFL: Path Sensitive Fuzzing. In 2018 IEEE Symposium on Security and Privacy (SP). 679–696. issn:2375-1207 https://doi.org/10.1109/SP.2018.00040
[8]
Taras Glek and Jan Hubicka. 2010. Optimizing real world applications with GCC link time optimization. arXiv preprint arXiv:1010.2196.
[9]
Adrian Herrera, Mathias Payer, and Antony L Hosking. 2023. DatAFLow: Toward a data-flow-guided fuzzer. ACM Transactions on Software Engineering and Methodology, 32, 5 (2023), 1–31.
[10]
Chin-Chia Hsu, Che-Yu Wu, Hsu-Chun Hsiao, and Shih-Kun Huang. 2018. Instrim: Lightweight instrumentation for coverage-guided fuzzing. In Symposium on Network and Distributed System Security (NDSS), Workshop on Binary Analysis Research. 40.
[11]
Chris Lattner and Vikram Adve. 2004. LLVM: A compilation framework for lifelong program analysis & transformation. In International symposium on code generation and optimization, 2004. CGO 2004. 75–86.
[12]
Stephan Lipp, Daniel Elsner, Thomas Hutzelmann, Sebastian Banescu, Alexander Pretschner, and Marcel Böhme. 2022. FuzzTastic: A Fine-grained, Fuzzer-agnostic Coverage Analyzer. In 2022 IEEE/ACM 44th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion). 75–79. issn:2574-1926 https://doi.org/10.1145/3510454.3516847
[13]
Valentin JM Manès, HyungSeok Han, Choongwoo Han, Sang Kil Cha, Manuel Egele, Edward J Schwartz, and Maverick Woo. 2019. The art, science, and engineering of fuzzing: A survey. IEEE Transactions on Software Engineering, 47, 11 (2019), 2312–2331.
[14]
Jonathan Metzman, László Szekeres, Laurent Simon, Read Sprabery, and Abhishek Arya. 2021. Fuzzbench: an open fuzzer benchmarking platform and service. In Proceedings of the 29th ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering. 1393–1403.
[15]
Stefan Nagy and Matthew Hicks. 2019. Full-Speed Fuzzing: Reducing Fuzzing Overhead through Coverage-Guided Tracing. In 2019 IEEE Symposium on Security and Privacy (SP). 787–802. issn:2375-1207 https://doi.org/10.1109/SP.2019.00069
[16]
Stefan Nagy, Anh Nguyen-Tuong, Jason D. Hiser, Jack W. Davidson, and Matthew Hicks. 2021. Same Coverage, Less Bloat: Accelerating Binary-only Fuzzing with Coverage-preserving Coverage-guided Tracing. In Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security (CCS ’21). Association for Computing Machinery, New York, NY, USA. 351–365. isbn:978-1-4503-8454-4 https://doi.org/10.1145/3460120.3484787
[17]
Jinghan Wang, Yue Duan, Wei Song, Heng Yin, and Chengyu Song. 2019. Be sensitive and collaborative: Analyzing impact of coverage metrics in greybox fuzzing. In 22nd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2019). 1–15.
[18]
Jinghan Wang, Yue Duan, Wei Song, Heng Yin, and Chengyu Song. 2019. Be Sensitive and Collaborative: Analyzing Impact of Coverage Metrics in Greybox Fuzzing. In 22nd International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2019). 1–15. isbn:978-1-939133-07-6 https://www.usenix.org/conference/raid2019/presentation/wang
[19]
Hang Xu, Zhi Yang, Xingyuan Chen, Bing Han, and Xuehui Du. 2023. BitAFL: Provide More Accurate Coverage Information for Coverage-guided Fuzzing. In 3rd International Conference on Management Science and Software Engineering (ICMSSE 2023). 521–530.
[20]
Michal Zalewski. 2014. Technical whitepaper for AFL-fuzz. https://raw.githubusercontent.com/google/AFL/master/docs/technical_details.txt

    Published In

    FUZZING 2024: Proceedings of the 3rd ACM International Fuzzing Workshop
    September 2024
    89 pages
    ISBN:9798400711121
    DOI:10.1145/3678722

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Collisions
    2. Coverage Tracking
    3. Coverage-Guided Fuzzing
    4. Empirical Study
    5. Fuzzing
    6. Hashmap
    7. Hitmap
    8. Imprecision
    9. Overflows
    10. Software Engineering
    11. Testing
