SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents Paper • 2505.20411 • Published 11 days ago • 84
view article Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare By aaditya and 2 others • Apr 19, 2024 • 163
HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors Paper • 2408.06019 • Published Aug 12, 2024 • 15
ControlNeXt: Powerful and Efficient Control for Image and Video Generation Paper • 2408.06070 • Published Aug 12, 2024 • 54
Design Proteins Using Large Language Models: Enhancements and Comparative Analyses Paper • 2408.06396 • Published Aug 12, 2024 • 8
OpenResearcher: Unleashing AI for Accelerated Scientific Research Paper • 2408.06941 • Published Aug 13, 2024 • 33