arxiv:2411.14877

Astro-HEP-BERT: A bidirectional language model for studying the meanings of concepts in astrophysics and high energy physics

Published on Nov 22, 2024

Abstract

AI-generated summary

Astro-HEP-BERT, a transformer-based model further trained on the large Astro-HEP Corpus, provides contextualized word embeddings for astrophysics and high-energy physics research and performs comparably in semantic analyses to domain-adapted models trained from scratch.

I present Astro-HEP-BERT, a transformer-based language model specifically designed for generating contextualized word embeddings (CWEs) to study the meanings of concepts in astrophysics and high-energy physics. Built on a general pretrained BERT model, Astro-HEP-BERT underwent further training over three epochs using the Astro-HEP Corpus, a dataset I curated from 21.84 million paragraphs extracted from more than 600,000 scholarly articles on arXiv, all belonging to at least one of these two scientific domains. The project demonstrates both the effectiveness and feasibility of adapting a bidirectional transformer for applications in the history, philosophy, and sociology of science (HPSS). The entire training process was conducted using freely available code, pretrained weights, and text inputs, completed on a single MacBook Pro laptop (M2, 96 GB). Preliminary evaluations indicate that Astro-HEP-BERT's CWEs perform comparably to domain-adapted BERT models trained from scratch on larger datasets for domain-specific word sense disambiguation and induction, as well as for related semantic change analyses. This suggests that retraining general language models for specific scientific domains can be a cost-effective and efficient strategy for HPSS researchers, enabling high performance without the need for extensive training from scratch.
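The model's intended use, extracting CWEs for a term in context and comparing them across contexts, can be reproduced with standard Hugging Face tooling. Below is a minimal sketch, not the author's exact pipeline: it loads a BERT-style checkpoint, mean-pools the subword vectors of a target word from the last hidden layer, and compares the word's embeddings in two sentences. The repository id `arnosimons/astro-hep-bert` and the example sentences are assumptions; substitute the actual checkpoint and paragraphs from the corpus.

```python
# Minimal sketch: contextualized word embeddings (CWEs) from a BERT-style model.
# The Hub repo id below is an assumption; replace it with the actual checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "arnosimons/astro-hep-bert"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

def contextual_embedding(sentence: str, target: str) -> torch.Tensor:
    """Mean-pool the final hidden states of the subword tokens spanning `target`."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_dim)
    target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    # Find the first occurrence of the target's subword ids within the sentence.
    for i in range(len(ids) - len(target_ids) + 1):
        if ids[i : i + len(target_ids)] == target_ids:
            return hidden[i : i + len(target_ids)].mean(dim=0)
    raise ValueError(f"'{target}' not found in the tokenized sentence")

# Compare the same word in two domain contexts (illustrative sentences, not corpus text).
e1 = contextual_embedding("A cosmic string is a topological defect formed in the early universe.", "string")
e2 = contextual_embedding("The bosonic string propagates consistently only in 26 dimensions.", "string")
print(torch.cosine_similarity(e1, e2, dim=0).item())
```

Clustering such vectors over many occurrences of a term is one common route to the word sense induction and semantic change analyses mentioned in the abstract.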


