Article From Files to Chunks: Improving Hugging Face Storage Efficiency By jsulz and 1 other • Nov 20, 2024 • 61
Constructing and Expanding Low-Resource and Underrepresented Parallel Datasets for Indonesian Local Languages Paper • 2404.01009 • Published Apr 1, 2024 • 1
Exploring the Latent Capacity of LLMs for One-Step Text Generation Paper • 2505.21189 • Published 10 days ago • 59
LLaDA-V: Large Language Diffusion Models with Visual Instruction Tuning Paper • 2505.16933 • Published 15 days ago • 30
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning Paper • 2505.15966 • Published 16 days ago • 51
Large Language Models Implicitly Learn to See and Hear Just By Reading Paper • 2505.17091 • Published 17 days ago • 5
TabSTAR: A Foundation Tabular Model With Semantically Target-Aware Representations Paper • 2505.18125 • Published 14 days ago • 109
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published 14 days ago • 85
Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient Paper • 2502.05172 • Published Feb 7 • 2
Efficiently Editing Mixture-of-Experts Models with Compressed Experts Paper • 2503.00634 • Published Mar 1 • 2
Dense Training, Sparse Inference: Rethinking Training of Mixture-of-Experts Language Models Paper • 2404.05567 • Published Apr 8, 2024 • 10
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 168
OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning Paper • 2505.04601 • Published 30 days ago • 26