Potatochka's Collections

LLM

updated Mar 9

  • Attention Heads of Large Language Models: A Survey

    Paper • 2409.03752 • Published Sep 5, 2024 • 90

  • Transformer Explainer: Interactive Learning of Text-Generative Models

    Paper • 2408.04619 • Published Aug 8, 2024 • 162

  • Addition is All You Need for Energy-efficient Language Models

    Paper • 2410.00907 • Published Oct 1, 2024 • 151

  • DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining

    Paper • 2305.10429 • Published May 17, 2023 • 3

  • Phi-4 Technical Report

    Paper • 2412.08905 • Published Dec 12, 2024 • 119

  • Qwen2.5 Technical Report

    Paper • 2412.15115 • Published Dec 19, 2024 • 368

  • DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

    Paper • 2501.12948 • Published Jan 22 • 400

  • Qwen2.5-1M Technical Report

    Paper • 2501.15383 • Published Jan 26 • 71