arxiv:2408.04114

Zero-shot Factual Consistency Evaluation Across Domains

Published on Aug 7, 2024

Authors:

Abstract

The work unifies several natural language tasks to train models that evaluate factual consistency across diverse domains, achieving state-of-the-art performance while ensuring efficiency and generalization.

AI-generated summary

This work addresses the challenge of factual consistency in text generation systems. We unify the tasks of Natural Language Inference, Summarization Evaluation, Factuality Verification and Factual Consistency Evaluation to train models capable of evaluating the factual consistency of source-target pairs across diverse domains. We rigorously evaluate these against eight baselines on a comprehensive benchmark suite comprising 22 datasets that span various tasks, domains, and document lengths. Results demonstrate that our method achieves state-of-the-art performance on this heterogeneous benchmark while addressing efficiency concerns and attaining cross-domain generalization.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 3

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2408.04114 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2408.04114 in a Space README.md to link it from this page.