Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8 • 112
MedBrowseComp: Benchmarking Medical Deep Research and Computer Use Paper • 2505.14963 • Published May 20 • 2
Measuring the Faithfulness of Thinking Drafts in Large Reasoning Models Paper • 2505.13774 • Published May 19 • 1
When Models Reason in Your Language: Controlling Thinking Trace Language Comes at the Cost of Accuracy Paper • 2505.22888 • Published May 28 • 6