|
<!DOCTYPE html> |
|
<html lang="en"> |
|
<head> |
|
<meta charset="utf-8" /> |
|
<meta name="viewport" content="width=device-width, initial-scale=1" /> |
|
<meta name="description" content="Beyond ‘Aha!’ — Systematic Meta‑Ability Alignment in Large Reasoning Models presents a three‑stage recipe that explicitly teaches deduction, induction, and abduction, achieving state‑of‑the‑art reasoning performance." /> |
|
<meta name="keywords" content="Meta‑Abilities, Deduction, Induction, Abduction, Reinforcement Learning, Large Reasoning Models" /> |
|
<title>Beyond “Aha!” — Meta‑Ability Alignment for Reasoning Models</title> |
|
|
|
<link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro" rel="stylesheet" /> |
|
<link rel="stylesheet" href="./static/css/bulma.min.css" /> |
|
<link rel="stylesheet" href="./static/css/fontawesome.all.min.css" /> |
|
<link rel="stylesheet" href="./static/css/index.css" /> |
|
<link rel="icon" href="./static/images/favicon.svg" /> |
|
|
|
<script defer src="./static/js/fontawesome.all.min.js"></script> |
|
</head> |
|
<body> |
|
|
|
<section class="hero"> |
|
<div class="hero-body"> |
|
<div class="container is-max-desktop"> |
|
<div class="columns is-centered"> |
|
<div class="column has-text-centered"> |
|
<h1 class="title is-1 publication-title">Beyond “Aha!”: Systematic Meta‑Ability Alignment in Large Reasoning Models</h1> |
|
<div class="is-size-5 publication-authors"> |
|
<span class="author-block"><a href="https://zhiyuanhubj.github.io/" target="_blank">Zhiyuan Hu</a><sup>1</sup>,</span> |
|
<span class="author-block"><a href="#" target="_blank">Yibo Wang</a><sup>2</sup>,</span> |
|
<span class="author-block"><a href="https://hendrydong.github.io/" target="_blank">Hanze Dong</a><sup>3</sup>,</span> |
|
<span class="author-block"><a href="#" target="_blank">Yuhui Xu</a><sup>3</sup>,</span> |
|
<span class="author-block"><a href="#" target="_blank"><strong>Amrita Saha</strong></a><sup>3</sup>,</span> |
|
<span class="author-block"><a href="http://cmxiong.com/" target="_blank"><strong>Caiming Xiong</strong></a><sup>3</sup>,</span> |
|
<span class="author-block"><a href="https://bhooi.github.io/" target="_blank"><strong>Bryan Hooi</strong></a><sup>1</sup>,</span> |
|
<span class="author-block"><a href="https://scholar.google.com/citations?user=MuUhwi0AAAAJ&hl=en" target="_blank"><strong>Junnan Li</strong></a><sup>3</sup></span> |
|
</div> |
|
|
|
<div class="is-size-5 publication-authors"> |
|
<span class="author-block"><sup>1</sup>National University of Singapore,</span> |
|
<span class="author-block"><sup>2</sup>Tsinghua University,</span> |
|
<span class="author-block"><sup>3</sup>Salesforce AI Research</span> |
|
</div> |
|
|
|
|
|
<div class="column has-text-centered"> |
|
<div class="publication-links"> |
|
<span class="link-block"> |
|
<a href="https://github.com/zhiyuanhubj/Meta-Ability-Alignment/blob/main/Paper.pdf" target="_blank" class="external-link button is-normal is-rounded is-dark"> |
|
<span class="icon"><i class="fas fa-file-pdf"></i></span> |
|
<span>Paper</span> |
|
</a> |
|
</span> |
|
<span class="link-block"> |
|
<a href="https://github.com/zhiyuanhubj/Meta-Ability-Alignment/blob/main/Paper.pdf" target="_blank" class="external-link button is-normal is-rounded is-dark"> |
|
<span class="icon"><i class="ai ai-arxiv"></i></span> |
|
<span>arXiv</span> |
|
</a> |
|
</span> |
|
<span class="link-block"> |
|
<a href="https://github.com/zhiyuanhubj/Meta-Ability-Alignment" target="_blank" class="external-link button is-normal is-rounded is-dark"> |
|
<span class="icon"><i class="fab fa-github"></i></span> |
|
<span>Code</span> |
|
</a> |
|
</span> |
|
<span class="link-block"> |
|
<a href="https://x.com/ZhiyuanCS/status/1922734609634296004" target="_blank" class="external-link button is-normal is-rounded is-dark"> |
|
<span class="icon"><i class="far fa-images"></i></span> |
|
<span>Twitter (X)</span> |
|
</a> |
|
</span> |
|
</div> |
|
</div> |
|
</div> |
|
</div> |
|
</div> |
|
</div> |
|
</section> |
|
|
|
|
|
<section class="section"> |
|
<div class="container is-max-desktop"> |
|
<div class="columns is-centered has-text-centered"> |
|
<div class="column is-four-fifths"> |
|
<h2 class="title is-3">Abstract</h2> |
|
<div class="content has-text-justified"> |
|
<p>Large reasoning models (LRMs) possess a latent capacity for long chain‑of‑thought reasoning, but the timing and consistency of emergent “aha” behaviors remain unpredictable. We explicitly align LRMs with three meta‑abilities, <strong>deduction, induction, and abduction</strong>, using automatically generated, self‑verifiable tasks. Our three‑stage pipeline (individual alignment, parameter‑space merging, and domain‑specific reinforcement learning) raises performance by up to 10% over instruction‑tuned baselines and delivers state‑of‑the‑art accuracy across math, coding, and science benchmarks.</p>
|
</div> |
|
</div> |
|
</div> |
|
</div> |
|
</section> |
|
|
|
|
|
<section class="section is-light"> |
|
<div class="container is-max-desktop"> |
|
<h2 class="title is-3 has-text-centered">Three‑Stage Training Framework</h2> |
|
<figure class="image"> |
|
<img src="./static/images/framework.png" alt="Three‑stage meta‑ability alignment framework diagram." /> |
|
<figcaption class="has-text-centered">Stage A: Meta‑ability alignment ⟶ Stage B: Parameter‑space merging ⟶ Stage C: Domain‑specific RL.</figcaption> |
|
</figure> |
|
<br /> |
|
<h2 class="title is-3 has-text-centered">Key Results</h2> |
|
<figure class="image"> |
|
<img src="./static/images/results.png" alt="Performance tables showing consistent gains from meta‑ability alignment." /> |
|
<figcaption class="has-text-centered">Tables 1 and 2: Meta‑ability alignment boosts reasoning performance at both 7B and 32B scales.</figcaption>
|
</figure> |
|
</div> |
|
</section> |
|
|
|
|
|
|
|
<section class="section" id="BibTeX"> |
|
<div class="container is-max-desktop content"> |
|
<h2 class="title">BibTeX</h2> |
|
<pre><code>@article{hu2025metaability,
  author  = {Hu, Zhiyuan and Wang, Yibo and Dong, Hanze and Xu, Yuhui and Saha, Amrita and Xiong, Caiming and Hooi, Bryan and Li, Junnan},
  title   = {Beyond “Aha!”: Systematic Meta‑Ability Alignment in Large Reasoning Models},
  journal = {arXiv},
  year    = {2025}
}</code></pre>
|
</div> |
|
</section> |
|
|
|
<footer class="footer"> |
|
<div class="container"> |
|
<div class="content has-text-centered"> |
|
<a class="icon-link" target="_blank" href="https://github.com/zhiyuanhubj/Meta-Ability-Alignment/blob/main/Paper.pdf"><i class="fas fa-file-pdf"></i></a> |
|
<a class="icon-link" target="_blank" href="https://github.com/zhiyuanhubj/Meta-Ability-Alignment"><i class="fab fa-github"></i></a>
|
</div> |
|
<div class="columns is-centered"> |
|
<div class="column is-8"> |
|
<div class="content"> |
|
<p>This website is licensed under a <a rel="license" target="_blank" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution‑ShareAlike 4.0 International License</a>.</p> |
|
<p>You are free to reuse the <a target="_blank" href="https://github.com/nerfies/nerfies.github.io">source code</a>; please include a link back in the footer.</p> |
|
</div> |
|
</div> |
|
</div> |
|
</div> |
|
</footer> |
|
</body> |
|
</html> |
|
|