Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs

XY Fu, MTR Laskar, C Chen, SB TN - arXiv preprint arXiv:2311.00681, 2023 - arxiv.org
In recent years, Large Language Models (LLMs) have gained immense attention due to their
notable emergent capabilities, surpassing those seen in earlier language models. A
particularly intriguing application of LLMs is their role as evaluators for texts produced by
various generative models. In this study, we delve into the potential of LLMs as reliable
assessors of factual consistency in summaries generated by text-generation models. Initially,
we introduce an innovative approach for factuality assessment using LLMs. This entails …

Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs

MTR Laskar, C Chen, SB Tn - … of the Third Workshop on Natural …, 2023 - aclanthology.org
In recent years, large language models (LLMs) have drawn significant attention due to their
impressive emergent capabilities that were not observed in earlier language models. One
emerging area where LLMs have been widely used in recent times is the utilization of LLMs
as the evaluator of the texts generated by various generative models. In this paper, we also
explore the possibility of whether LLMs are reliable in assessing the factual consistency of
summaries generated by text generation models. We first propose a new approach to …