-
MLLM-as-a-Judge for Image Safety without Human Labeling
Paper • 2501.00192 • Published • 31 -
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Paper • 2501.00958 • Published • 107 -
Xmodel-2 Technical Report
Paper • 2412.19638 • Published • 27 -
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
Paper • 2412.18925 • Published • 105
Collections
Collections including paper arxiv:2505.19443
-
s3: You Don't Need That Much Data to Train a Search Agent via RL
Paper • 2505.14146 • Published • 17 -
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI
Paper • 2505.19443 • Published • 15 -
ARM: Adaptive Reasoning Model
Paper • 2505.20258 • Published • 43 -
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
Paper • 2505.19914 • Published • 40
-
CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging
Paper • 2502.05664 • Published • 23 -
AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation
Paper • 2312.13010 • Published • 5 -
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale
Paper • 2409.16299 • Published • 12 -
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI
Paper • 2505.19443 • Published • 15
-
Scaling Laws for Native Multimodal Models
Paper • 2504.07951 • Published • 28 -
Have we unified image generation and understanding yet? An empirical study of GPT-4o's image generation ability
Paper • 2504.08003 • Published • 49 -
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
Paper • 2504.11468 • Published • 28 -
Towards Learning to Complete Anything in Lidar
Paper • 2504.12264 • Published • 10
-
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU
Paper • 2502.08910 • Published • 149 -
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens
Paper • 2502.18890 • Published • 30 -
MPO: Boosting LLM Agents with Meta Plan Optimization
Paper • 2503.02682 • Published • 27 -
SWE-rebench: An Automated Pipeline for Task Collection and Decontaminated Evaluation of Software Engineering Agents
Paper • 2505.20411 • Published • 85
-
LLM Pruning and Distillation in Practice: The Minitron Approach
Paper • 2408.11796 • Published • 59 -
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering
Paper • 2408.09174 • Published • 53 -
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 43 -
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications
Paper • 2408.11878 • Published • 61