LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection
Abstract
A theoretical framework and the LENSLLM model enable accurate, computationally efficient selection of Large Language Models by modeling their fine-tuning dynamics.
The proliferation of open-source Large Language Models (LLMs) and diverse downstream tasks necessitates efficient model selection, since fine-tuning every candidate is computationally impractical. Despite recent advances in LLM selection, a fundamental research question remains largely open: how can we model the dynamic behaviors of LLMs during fine-tuning, and thereby improve our understanding of their generalization performance across diverse downstream tasks? In this work, we propose a novel theoretical framework that provides a proper lens for assessing the generalization capabilities of LLMs, thereby enabling accurate and efficient LLM selection for downstream applications. In particular, we first derive a Hessian-based PAC-Bayes generalization bound that unveils the fine-tuning dynamics of LLMs, and then introduce LENSLLM, a Neural Tangent Kernel (NTK)-based Rectified Scaling Model that enables accurate performance predictions across diverse tasks while maintaining computational efficiency. Extensive empirical results on 3 large-scale benchmarks demonstrate that our model achieves up to 91.1% accuracy and reduces computational cost by up to 88.5% in LLM selection, outperforming 5 state-of-the-art methods. We open-source the LENSLLM model and corresponding results on GitHub: https://github.com/Susan571/LENSLLM.git.
Community
Big News! Our research, LENSLLM, has been accepted at ICML 2025! We're excited to share a new way to select the best Large Language Models (LLMs) more intelligently, backed by a novel theoretical framework and open-source code.
The challenge we tackle in our paper, "LENSLLM: Unveiling Fine-Tuning Dynamics for LLM Selection," is a common one: How can we predict an LLM's performance after fine-tuning, without the time and cost of actually fine-tuning every option?
Explore the full research and try LENSLLM yourself:
- GitHub (Code & Results): https://github.com/Susan571/LENSLLM.git
- Read the Paper (arXiv): https://arxiv.org/abs/2505.03793
What's New? Our Core Theoretical Insight:
We've developed a Hessian-based PAC-Bayes generalization bound. Think of this as a new mathematical "lens" that lets us:
- See and understand the complex changes (the "fine-tuning dynamics") LLMs go through during training.
- Gain fundamental insights into learning phenomena, like "phase transitions" (for background, the classical bound our result refines is sketched after this list).
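For background, a classical McAllester-style PAC-Bayes bound is the template that Hessian-based bounds like ours refine. With true risk L, empirical risk L̂ over n samples, a fixed prior P, and a posterior Q over parameters, it reads as follows; the exact curvature-dependent variant is derived in the paper:

```latex
% Classical McAllester-style PAC-Bayes template (background only; the
% Hessian-based refinement is derived in the paper). With probability
% at least 1 - \delta over n i.i.d. samples:
\mathbb{E}_{\theta \sim Q}\big[L(\theta)\big]
  \;\le\;
\mathbb{E}_{\theta \sim Q}\big[\hat{L}(\theta)\big]
  + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln(n/\delta)}{2(n-1)}}
```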
This deeper understanding powers our LENSLLM model (a Neural Tangent Kernel-based approach), allowing it to accurately estimate an LLM's potential before you commit to extensive fine-tuning.
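To make this concrete, below is a minimal, hypothetical sketch of the kind of extrapolation a rectified scaling model enables: fit a scaling curve to test losses from a few cheap, small-scale fine-tuning runs, then extrapolate to the full data size. The functional form, parameter names, and numbers are illustrative assumptions, not LENSLLM's exact NTK-based estimator.

```python
# Hypothetical sketch: predict full-scale fine-tuning loss from a few
# small-scale runs by fitting a rectified scaling curve. Illustrative
# only -- not LENSLLM's exact NTK-based estimator.
import numpy as np
from scipy.optimize import curve_fit

def rectified_scaling_law(D, B, D_l, beta, E):
    """Test loss vs. fine-tuning data size D (assumed functional form).

    B, beta : amplitude and decay rate of the power-law phase
    D_l     : shift term capturing pre-learned knowledge (rectification)
    E       : irreducible loss floor
    """
    return B / (D_l + D) ** beta + E

# Losses observed on small fine-tuning subsets (made-up numbers).
D_obs = np.array([200.0, 500.0, 1000.0, 2000.0, 4000.0])
L_obs = np.array([2.31, 2.05, 1.86, 1.71, 1.60])

# Fit the four parameters on the cheap observations.
popt, _ = curve_fit(rectified_scaling_law, D_obs, L_obs,
                    p0=[20.0, 100.0, 0.5, 1.0], maxfev=10_000)

# Extrapolate to the full dataset size; repeating this per candidate
# model yields a ranking without fully fine-tuning any of them.
D_full = 50_000
print(f"Predicted loss at D={D_full}: "
      f"{rectified_scaling_law(D_full, *popt):.3f}")
```

Repeating this fit for each candidate model turns a handful of cheap checkpoints into a full ranking, which is the intuition behind the reported cost savings.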
Clear Benefits & Proven Results:
This strong theoretical foundation translates directly to practical advantages:
✅ More Accurate Selection: Up to 91.1% accuracy in picking the best LLM for a task.
✅ Massive Cost Savings: Reduces computational needs by up to 88.5%.
✅ State-of-the-Art Performance: Outperforms 5 leading methods on 3 major benchmarks.

We believe LENSLLM is a significant step towards making LLM development more efficient and scientifically grounded. We're honored by the ICML acceptance and thrilled to contribute these insights and tools to the AI community.
We encourage you to check out LENSLLM and look forward to discussing it further at ICML 2025!