arxiv:1803.06535

Dear Sir or Madam, May I introduce the GYAFC Dataset: Corpus, Benchmarks and Metrics for Formality Style Transfer

Published on Mar 17, 2018

Abstract

AI-generated summary: The authors create a large corpus for formality style transfer, show that machine translation techniques serve as strong baselines, and discuss the challenges of automatic evaluation metrics.

Style transfer is the task of automatically transforming a piece of text in one particular style into another. A major barrier to progress in this field has been a lack of training and evaluation datasets, as well as benchmarks and automatic metrics. In this work, we create the largest corpus for a particular stylistic transfer (formality) and show that techniques from the machine translation community can serve as strong baselines for future work. We also discuss challenges of using automatic metrics.
