-
A Context-Driven Approach for Co-Auditing Smart Contracts with The Support of GPT-4 code interpreter
Authors:
Mohamed Salah Bouafif,
Chen Zheng,
Ilham Ahmed Qasse,
Ed Zulkoski,
Mohammad Hamdaqa,
Foutse Khomh
Abstract:
The surge in the adoption of smart contracts necessitates rigorous auditing to ensure their security and reliability. Manual auditing, although comprehensive, is time-consuming and heavily reliant on the auditor's expertise. With the rise of Large Language Models (LLMs), there is growing interest in leveraging them to assist auditors in the auditing process (co-auditing). However, the effectivenes…
▽ More
The surge in the adoption of smart contracts necessitates rigorous auditing to ensure their security and reliability. Manual auditing, although comprehensive, is time-consuming and heavily reliant on the auditor's expertise. With the rise of Large Language Models (LLMs), there is growing interest in leveraging them to assist auditors in the auditing process (co-auditing). However, the effectiveness of LLMs in smart contract co-auditing is contingent upon the design of the input prompts, especially in terms of context description and code length. This paper introduces a novel context-driven prompting technique for smart contract co-auditing. Our approach employs three techniques for context scoping and augmentation, encompassing code scoping to chunk long code into self-contained code segments based on code inter-dependencies, assessment scoping to enhance context description based on the target assessment goal, thereby limiting the search space, and reporting scoping to force a specific format for the generated response. Through empirical evaluations on publicly available vulnerable contracts, our method demonstrated a detection rate of 96\% for vulnerable functions, outperforming the native prompting approach, which detected only 53\%. To assess the reliability of our prompting approach, manual analysis of the results was conducted by expert auditors from our partner, Quantstamp, a world-leading smart contract auditing company. The experts' analysis indicates that, in unlabeled datasets, our proposed approach enhances the proficiency of the GPT-4 code interpreter in detecting vulnerabilities.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
Relating Complexity-theoretic Parameters with SAT Solver Performance
Authors:
Edward Zulkoski,
Ruben Martins,
Christoph Wintersteiger,
Robert Robere,
Jia Liang,
Krzysztof Czarnecki,
Vijay Ganesh
Abstract:
Over the years complexity theorists have proposed many structural parameters to explain the surprising efficiency of conflict-driven clause-learning (CDCL) SAT solvers on a wide variety of large industrial Boolean instances. While some of these parameters have been studied empirically, until now there has not been a unified comparative study of their explanatory power on a comprehensive benchmark.…
▽ More
Over the years complexity theorists have proposed many structural parameters to explain the surprising efficiency of conflict-driven clause-learning (CDCL) SAT solvers on a wide variety of large industrial Boolean instances. While some of these parameters have been studied empirically, until now there has not been a unified comparative study of their explanatory power on a comprehensive benchmark. We correct this state of affairs by conducting a large-scale empirical evaluation of CDCL SAT solver performance on nearly 7000 industrial and crafted formulas against several structural parameters such as backdoors, treewidth, backbones, and community structure.
Our study led us to several results. First, we show that while such parameters only weakly correlate with CDCL solving time, certain combinations of them yield much better regression models. Second, we show how some parameters can be used as a "lens" to better understand the efficiency of different solving heuristics. Finally, we propose a new complexity-theoretic parameter, which we call learning-sensitive with restarts (LSR) backdoors, that extends the notion of learning-sensitive (LS) backdoors to incorporate restarts and discuss algorithms to compute them. We mathematically prove that for certain class of instances minimal LSR-backdoors are exponentially smaller than minimal-LS backdoors.
△ Less
Submitted 26 June, 2017;
originally announced June 2017.
-
Understanding VSIDS Branching Heuristics in Conflict-Driven Clause-Learning SAT Solvers
Authors:
Jia Hui Liang,
Vijay Ganesh,
Ed Zulkoski,
Atulan Zaman,
Krzysztof Czarnecki
Abstract:
Conflict-Driven Clause-Learning SAT solvers crucially depend on the Variable State Independent Decaying Sum (VSIDS) branching heuristic for their performance. Although VSIDS was proposed nearly fifteen years ago, and many other branching heuristics for SAT solving have since been proposed, VSIDS remains one of the most effective branching heuristics.
In this paper, we advance our understanding o…
▽ More
Conflict-Driven Clause-Learning SAT solvers crucially depend on the Variable State Independent Decaying Sum (VSIDS) branching heuristic for their performance. Although VSIDS was proposed nearly fifteen years ago, and many other branching heuristics for SAT solving have since been proposed, VSIDS remains one of the most effective branching heuristics.
In this paper, we advance our understanding of VSIDS by answering the following key questions. The first question we pose is "what is special about the class of variables that VSIDS chooses to additively bump?" In answering this question we showed that VSIDS overwhelmingly picks, bumps, and learns bridge variables, defined as the variables that connect distinct communities in the community structure of SAT instances. This is surprising since VSIDS was invented more than a decade before the link between community structure and SAT solver performance was discovered. Additionally, we show that VSIDS viewed as a ranking function correlates strongly with temporal graph centrality measures. Putting these two findings together, we conclude that VSIDS picks high-centrality bridge variables. The second question we pose is "what role does multiplicative decay play in making VSIDS so effective?" We show that the multiplicative decay behaves like an exponential moving average (EMA) that favors variables that persistently occur in conflicts (the signal) over variables that occur intermittently (the noise). The third question we pose is "whether VSIDS is temporally and spatially focused." We show that VSIDS disproportionately picks variables from a few communities unlike, say, the random branching heuristic. We put these findings together to invent a new adaptive VSIDS branching heuristic that solves more instances than one of the best-known VSIDS variants over the SAT Competition 2013 benchmarks.
△ Less
Submitted 14 September, 2015; v1 submitted 29 June, 2015;
originally announced June 2015.