SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists.


Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists

Aug 30, 2024 · We propose SYNTHEVAL, a hybrid behavioral testing framework that leverages large language models (LLMs) to generate a wide range of test types for a ...

Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists

www.researchgate.net › publication › 383648534_SYNTHEVAL_Hybrid_...

Sep 7, 2024 · SYNTHEVAL first generates sentences via LLMs using controlled generation, and then identifies challenging examples by comparing the predictions ...

SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with ...

www.aimodels.fyi › papers › arxiv › syntheval-hybrid-behavioral-testing-n...

Sep 1, 2024 · Overview · Traditional NLP model evaluation relies on static test sets, which can overestimate performance and lack comprehensive assessment.

[PDF] arXiv:2408.17437v1 [cs.CL] 30 Aug 2024

arxiv.org › pdf

Aug 30, 2024 · SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists. Warning: This paper contains language that readers may find ...

Loreley99/SynthEval_CheckList - GitHub

github.com › Loreley99 › SynthEval_CheckList

Contribute to Loreley99/SynthEval_CheckList development by creating an account on GitHub.

Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists ...

jglobal.jst.go.jp › detail

Article "SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists": detailed information. J-GLOBAL is an information service ...

SYNTHEVAL: Hybrid Behavioral Testing of NLP Models ... - ChatPaper

chatpaper.com › chatpaper › zh-CN › paper

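Sep 1, 2024 · In this study, we propose SYNTHEVAL, a hybrid behavioral testing framework that uses large language models (LLMs) to generate a wide range of test types for comprehensively evaluating NLP models. SYNTHEVAL first, via LLMs, uses ...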

Beyond Accuracy: Behavioral Testing of NLP Models with CheckList

www.semanticscholar.org › paper › Beyond-Accuracy:-Behavioral-Testing...

May 8, 2020 · This work proposes SYNTHEVAL, a hybrid behavioral testing framework that leverages large language models (LLMs) to generate a wide range of ...

Anna Korhonen - CatalyzeX

www.catalyzex.com › author

SYNTHEVAL first generates sentences via LLMs using controlled generation, and then identifies challenging examples by comparing the predictions made by LLMs ...

Leonie Weissweiler | Papers With Code

paperswithcode.com › author › leonie-weissweiler

In this work, we propose SYNTHEVAL, a hybrid behavioral testing framework that leverages large language models (LLMs) to generate a wide range of test types for ...