SPIN-Bench: How Well Do LLMs Plan Strategically and Reason Socially? Paper • 2503.12349 • Published Mar 16 • 43
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens Paper • 2508.01191 • Published 18 days ago • 215
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs Paper • 2307.16789 • Published Jul 31, 2023 • 101
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning Paper • 2506.24119 • Published Jun 30 • 48
Building the Hugging Face MCP Server Article • By evalstate and 3 others • Jul 10 • 60
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 149
Finally, a Replacement for BERT: Introducing ModernBERT Article • By bclavie and 14 others • Dec 19, 2024 • 679
Generating Physically Stable and Buildable LEGO Designs from Text Paper • 2505.05469 • Published May 8 • 28
Open-source DeepResearch – Freeing our search agents Article • By m-ric and 4 others • Feb 4 • 1.28k