Selection Induced Collider Bias: A Gender Pronoun Uncertainty Case Study
Abstract
The paper presents a method to measure and exploit spurious gender associations in language models to classify prediction task uncertainty with high accuracy.
In this paper, we cast the problem of task underspecification in causal terms, and develop a method for empirical measurement of spurious associations between gender and gender-neutral entities for unmodified large language models, detecting previously unreported spurious correlations. We then describe a lightweight method to exploit the resulting spurious associations for prediction task uncertainty classification, achieving over 90% accuracy on a Winogender Schemas challenge set. Finally, we generalize our approach to address a wider range of prediction tasks and provide open-source demos for each method described here.
Models citing this paper 1
Datasets citing this paper 0
No dataset linking this paper
Spaces citing this paper 2
Collections including this paper 0
No Collection including this paper