Google Scholar

Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs

XY Fu, MTR Laskar, C Chen, SB TN - arXiv preprint arXiv:2311.00681, 2023 - arxiv.org

In recent years, Large Language Models (LLMs) have gained immense attention due to their
notable emergent capabilities, surpassing those seen in earlier language models. A
particularly intriguing application of LLMs is their role as evaluators for texts produced by
various generative models. In this study, we delve into the potential of LLMs as reliable
assessors of factual consistency in summaries generated by text-generation models. Initially,
we introduce an innovative approach for factuality assessment using LLMs. This entails …

Cited by 2

[PDF] aclanthology.org

Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs

MTR Laskar, C Chen, SB Tn - … of the Third Workshop on Natural …, 2023 - aclanthology.org

In recent years, large language models (LLMs) have drawn significant attention due to their
impressive emergent capabilities that were not observed in earlier language models. One
emerging area where LLMs have been widely used in recent times is the utilization of LLMs
as the evaluator of the texts generated by various generative models. In this paper, we also
explore the possibility of whether LLMs are reliable in assessing the factual consistency of
summaries generated by text generation models. We first propose a new approach to …

Cited by 1
