Abstract
This paper investigates the ability of artificial neural networks to judge the grammatical acceptability of a sentence, with the goal of testing their linguistic competence. We introduce the Corpus of Linguistic Acceptability (CoLA), a set of 10,657 English sentences drawn from published linguistics literature and labeled as grammatical or ungrammatical. As baselines, we train several recurrent neural network models on acceptability classification, and find that our models outperform the unsupervised models of Lau et al. (2016) on CoLA. Error analysis on specific grammatical phenomena reveals that both Lau et al.'s models and ours learn systematic generalizations like subject-verb-object order. However, all models we test perform far below human level on a wide range of grammatical constructions.
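For orientation, here is a minimal sketch (not from the paper) of how the CoLA data might be loaded and inspected with the Hugging Face `datasets` library, assuming the GLUE distribution of CoLA (`nyu-mll/glue`, config `cola`) is the one in use:

```python
# Minimal sketch: load CoLA via its GLUE distribution and inspect one example.
# Assumes the `datasets` library and the "nyu-mll/glue" dataset are available.
from datasets import load_dataset

cola = load_dataset("nyu-mll/glue", "cola")

# Each example pairs a sentence with a binary acceptability label
# (1 = acceptable/grammatical, 0 = unacceptable).
example = cola["train"][0]
print(example["sentence"], example["label"])
```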