arxiv:2408.01346

Prompt Refinement or Fine-tuning? Best Practices for using LLMs in Computational Social Science Tasks

Published on Aug 2, 2024

Authors:

Abstract

Modern large language models perform effectively across various social knowledge tasks, with best practices including using larger vocabularies, AI-enhanced prompting, and fine-tuning on task-specific data.

AI-generated summary

Large Language Models are expressive tools that enable complex tasks of text understanding within Computational Social Science. Their versatility, while beneficial, poses a barrier for establishing standardized best practices within the field. To bring clarity on the values of different strategies, we present an overview of the performance of modern LLM-based classification methods on a benchmark of 23 social knowledge tasks. Our results point to three best practices: select models with larger vocabulary and pre-training corpora; avoid simple zero-shot in favor of AI-enhanced prompting; fine-tune on task-specific data, and consider more complex forms instruction-tuning on multiple datasets only when only training data is more abundant.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2408.01346 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2408.01346 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2408.01346 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.