arxiv:2110.01518

Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics

Published on Oct 4, 2021

Authors:

Aleksandr Drozd ,

Abstract

Several strategies, including adapter and debiasing techniques, were tested for their effectiveness in improving the generalization of BERT-based models across different datasets in natural language inference tasks.

AI-generated summary

Much of the recent progress in NLU has been shown to stem from models learning dataset-specific heuristics. We conduct a case study of generalization in NLI (from MNLI to the adversarially constructed HANS dataset) across a range of BERT-based architectures (adapters, Siamese Transformers, HEX debiasing), as well as with data subsampling and increased model size. We report 2 successful and 3 unsuccessful strategies, all of which provide insights into how Transformer-based models learn to generalize.
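The MNLI-to-HANS setup involves a label-space mismatch: MNLI is a three-way task (entailment, neutral, contradiction), while HANS uses two labels (entailment, non-entailment). The sketch below shows one common way to score a three-way model on a two-way set, by collapsing neutral and contradiction into non-entailment and reporting accuracy per class. It is an illustration only, not the paper's evaluation code; the function names `to_hans_label` and `hans_accuracy` and the string label names are assumptions made here, and the label ids of a real checkpoint would need to be mapped to these strings first.

```python
from typing import Dict, Sequence

# Assumed string label names; real checkpoints expose their own id-to-label mapping.
MNLI_LABELS = ("entailment", "neutral", "contradiction")
HANS_LABELS = ("entailment", "non-entailment")


def to_hans_label(mnli_label: str) -> str:
    """Collapse a 3-way MNLI prediction into the 2-way HANS label space."""
    if mnli_label not in MNLI_LABELS:
        raise ValueError(f"unknown MNLI label: {mnli_label!r}")
    # Neutral and contradiction both count as "non-entailment".
    return "entailment" if mnli_label == "entailment" else "non-entailment"


def hans_accuracy(
    mnli_predictions: Sequence[str], hans_gold: Sequence[str]
) -> Dict[str, float]:
    """Return per-class and overall accuracy of 3-way predictions on 2-way gold labels."""
    if len(mnli_predictions) != len(hans_gold):
        raise ValueError("predictions and gold labels differ in length")

    correct = {label: 0 for label in HANS_LABELS}
    total = {label: 0 for label in HANS_LABELS}
    for pred, gold in zip(mnli_predictions, hans_gold):
        if gold not in HANS_LABELS:
            raise ValueError(f"unknown HANS label: {gold!r}")
        total[gold] += 1
        if to_hans_label(pred) == gold:
            correct[gold] += 1

    # Skip classes with no gold examples to avoid dividing by zero.
    scores = {
        label: correct[label] / total[label]
        for label in HANS_LABELS
        if total[label] > 0
    }
    n_examples = sum(total.values())
    if n_examples > 0:
        scores["overall"] = sum(correct.values()) / n_examples
    return scores


if __name__ == "__main__":
    # Toy predictions, invented purely for illustration.
    preds = ["entailment", "neutral", "contradiction", "entailment"]
    gold = ["entailment", "non-entailment", "non-entailment", "non-entailment"]
    print(hans_accuracy(preds, gold))
```

Reporting the two classes separately matters because a model that predicts entailment for every example scores 50% overall on a class-balanced set while getting every non-entailment example wrong, which an overall figure alone would hide.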

