We just implemented Andrej Karpathy's "third paradigm" for LLM learning!
System Prompt Learning (SPL) enables LLMs to automatically learn problem-solving strategies from experience, rather than relying on static prompts.
How it works: Your LLM builds a database of effective strategies, selects the best ones for each problem, and refines them over time based on success rates.
The best part? All strategies are human-readable and the system gets progressively better at problem types you use frequently.
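To make that loop concrete, here is a minimal sketch of what such a strategy store could look like. The class, fields, and selection policy are hypothetical illustrations of the idea, not the actual optillm plugin code.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Strategy:
    """A human-readable problem-solving strategy with usage statistics."""
    problem_type: str   # e.g. "word_problem", "code_debugging"
    text: str           # the strategy itself, in plain language
    attempts: int = 0
    successes: int = 0

    @property
    def success_rate(self) -> float:
        return self.successes / self.attempts if self.attempts else 0.0

class StrategyStore:
    """Toy strategy database: select the best strategies for a problem type
    and refine them based on whether the final answer was judged correct."""

    def __init__(self, path: str = "strategies.json"):
        self.path = path
        self.strategies: list[Strategy] = []

    def select(self, problem_type: str, k: int = 3) -> list[Strategy]:
        # Pick the top-k strategies for this problem type by success rate.
        matches = [s for s in self.strategies if s.problem_type == problem_type]
        return sorted(matches, key=lambda s: s.success_rate, reverse=True)[:k]

    def record_outcome(self, strategy: Strategy, success: bool) -> None:
        # Refinement signal: success rates drive future selection.
        strategy.attempts += 1
        strategy.successes += int(success)

    def save(self) -> None:
        # Strategies stay human-readable on disk and can be inspected or edited.
        with open(self.path, "w") as f:
            json.dump([asdict(s) for s in self.strategies], f, indent=2)
```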
Key benefits:
- Cumulative learning over time
- Transparent, inspectable strategies
- Works with any OpenAI-compatible API
- Simple integration: just add the "spl-" prefix to your model name (example below)
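For example, assuming optillm is running as a local OpenAI-compatible proxy (the base URL, API key, and base model below are placeholders for your own setup), integration is just a normal chat completion call:

```python
from openai import OpenAI

# Point the client at your optillm proxy (adjust the URL/port for your setup).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="optillm")

# The "spl-" prefix routes the request through the SPL plugin.
response = client.chat.completions.create(
    model="spl-gpt-4o-mini",  # placeholder base model; use whichever model you serve
    messages=[{"role": "user", "content": "A train leaves at 3pm traveling 60 mph..."}],
)
print(response.choices[0].message.content)
```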
Built as an open-source plugin in optillm. After 500 queries, our system developed 129 strategies and refined 97 of them!
This feels like a genuine step toward AI that learns from experience while staying completely interpretable.
Introducing AutoThink: Adaptive reasoning for LLMs that improves performance by 43% on reasoning benchmarks!
Instead of using fixed thinking budgets, AutoThink:
- Classifies query complexity (HIGH/LOW) using adaptive classification
- Dynamically allocates thinking tokens based on complexity (see the sketch after this list)
- Uses steering vectors derived from Pivotal Token Search to guide reasoning patterns
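As a rough illustration of the budget-allocation step (not the actual AutoThink code; the classifier callable and the budget numbers are stand-ins), the control flow looks roughly like this:

```python
def allocate_thinking_budget(query: str, classify) -> int:
    """Map a query's estimated complexity to a thinking-token budget.

    `classify` is any callable returning ("HIGH" | "LOW", confidence),
    e.g. a small adaptive classifier; the budgets here are illustrative.
    """
    label, confidence = classify(query)
    if label == "HIGH" and confidence > 0.6:
        return 4096   # hard problem: spend more tokens on the thinking phase
    return 1024       # easy problem: keep the reasoning phase short

# Example with a trivial keyword-based stand-in classifier:
def toy_classifier(query: str):
    hard_markers = ("prove", "derive", "optimize", "why")
    is_hard = any(w in query.lower() for w in hard_markers)
    return ("HIGH" if is_hard else "LOW", 0.9)

print(allocate_thinking_budget("Prove that the sum of two odd numbers is even", toy_classifier))
```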
Results on DeepSeek-R1-Distill-Qwen-1.5B:
- GPQA-Diamond: 31.06% vs 21.72% baseline (+9.34 points)
- MMLU-Pro: 26.38% vs 25.58% baseline (+0.8 points)
- Uses fewer tokens than baseline approaches
Works with any local reasoning model, including DeepSeek, Qwen, Llama, and custom models. The technique combines our Pivotal Token Search (PTS) implementation with our adaptive classification framework.
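To give a flavor of how a steering vector can guide generation, here is a minimal activation-steering sketch using a forward hook on a Hugging Face model. The layer index, scale, and random vector are assumptions for illustration; in AutoThink the vectors come from Pivotal Token Search rather than random noise.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# Placeholder direction: a real steering vector would be mined via PTS.
steer = torch.randn(model.config.hidden_size, dtype=model.dtype)
steer = steer / steer.norm()

def add_steering(module, inputs, output):
    # Decoder layers return a tuple; element 0 is the hidden states.
    hidden = output[0] + 4.0 * steer.to(output[0].device)  # scale is illustrative
    return (hidden,) + output[1:]

# Hook a mid-depth decoder layer (layer choice is an assumption).
handle = model.model.layers[10].register_forward_hook(add_steering)

inputs = tok("Why is the sky blue?", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
handle.remove()
```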