Computer Science > Software Engineering

arXiv:2402.11910 (cs)

[Submitted on 19 Feb 2024]

Title:Enhancing Large Language Models for Text-to-Testcase Generation

Authors:Saranya Alagarsamy, Chakkrit Tantithamthavorn, Chetan Arora, Aldeida Aleti

Abstract:Context: Test-driven development (TDD) is a widely employed software development practice that involves developing test cases based on requirements prior to writing the code. Although various methods for automated test case generation have been proposed, they are not specifically tailored for TDD, where requirements instead of code serve as input. Objective: In this paper, we introduce a text-to-testcase generation approach based on a large language model (GPT-3.5) that is fine-tuned on our curated dataset with an effective prompt design. Method: Our approach involves enhancing the capabilities of basic GPT-3.5 for text-to-testcase generation task that is fine-tuned on our curated dataset with an effective prompting design. We evaluated the effectiveness of our approach using a span of five large-scale open-source software projects. Results: Our approach generated 7k test cases for open source projects, achieving 78.5% syntactic correctness, 67.09% requirement alignment, and 61.7% code coverage, which substantially outperforms all other LLMs (basic GPT-3.5, Bloom, and CodeT5). In addition, our ablation study demonstrates the substantial performance improvement of the fine-tuning and prompting components of the GPT-3.5 model. Conclusions: These findings lead us to conclude that fine-tuning and prompting should be considered in the future when building a language model for the text-to-testcase generation task

Subjects:	Software Engineering (cs.SE)
Cite as:	arXiv:2402.11910 [cs.SE]
	(or arXiv:2402.11910v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2402.11910

Submission history

From: Saranya Alagarsamy [view email]
[v1] Mon, 19 Feb 2024 07:50:54 UTC (1,194 KB)

Computer Science > Software Engineering

Title:Enhancing Large Language Models for Text-to-Testcase Generation

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Software Engineering

Title:Enhancing Large Language Models for Text-to-Testcase Generation

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators