Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness Paper • 2505.22960 • Published May 29 • 16