The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text Paper • 2506.05209 • Published 1 day ago • 16
Static Word Embeddings for Sentence Semantic Representation Paper • 2506.04624 • Published 1 day ago • 2
Common Corpus: The Largest Collection of Ethical Data for LLM Pre-Training Paper • 2506.01732 • Published 4 days ago • 1
XToM: Exploring the Multilingual Theory of Mind for Large Language Models Paper • 2506.02461 • Published 3 days ago • 1
Article No GPU left behind: Unlocking Efficiency with Co-located vLLM in TRL By toslali-ibm and 5 others • 4 days ago • 37
EmoBench-UA: A Benchmark Dataset for Emotion Detection in Ukrainian Paper • 2505.23297 • Published 8 days ago • 1
LLM in the Loop: Creating the PARADEHATE Dataset for Hate Speech Detoxification Paper • 2506.01484 • Published 4 days ago • 4
Novel Benchmark for NER in the Wastewater and Stormwater Domain Paper • 2506.01938 • Published 4 days ago • 1
Common Pile v0.1 Collection All resources related to Common Pile v0.1, an 8TB dataset of public domain and openly licensed text • 4 items • Updated about 11 hours ago • 10
ModernGBERT: German-only 1B Encoder Model Trained from Scratch Paper • 2505.13136 • Published 18 days ago • 21
Understanding Gated Neurons in Transformers from Their Input-Output Functionality Paper • 2505.17936 • Published 14 days ago • 1
Language Mixing in Reasoning Language Models: Patterns, Impact, and Internal Causes Paper • 2505.14815 • Published 17 days ago • 1
Mechanistic Understanding and Mitigation of Language Confusion in English-Centric Large Language Models Paper • 2505.16538 • Published 15 days ago • 2
Tracing Multilingual Factual Knowledge Acquisition in Pretraining Paper • 2505.14824 • Published 17 days ago • 4
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 168