Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2502.05664

CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Paper • 2502.05664 • Published Feb 8 • 24
AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation

Paper • 2312.13010 • Published Dec 20, 2023 • 6
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale

Paper • 2409.16299 • Published Sep 9, 2024 • 12
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI

Paper • 2505.19443 • Published May 26 • 15

Benchmarks and challenges

PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models

Paper • 2502.01584 • Published Feb 3 • 10
CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Paper • 2502.05664 • Published Feb 8 • 24
Craw4LLM: Efficient Web Crawling for LLM Pretraining

Paper • 2502.13347 • Published Feb 19 • 29
Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark

Paper • 2504.16427 • Published Apr 23 • 18

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Paper • 2411.02337 • Published Nov 4, 2024 • 38
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Paper • 2411.04996 • Published Nov 7, 2024 • 52
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 69
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Paper • 2410.08815 • Published Oct 11, 2024 • 49

Code Generation

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models

Paper • 2404.03543 • Published Apr 4, 2024 • 18
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17, 2024 • 66
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents

Paper • 2407.18901 • Published Jul 26, 2024 • 35
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents

Paper • 2408.07060 • Published Aug 13, 2024 • 43

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

Paper • 2402.14658 • Published Feb 22, 2024 • 84
meta-llama/CodeLlama-7b-Instruct-hf

Text Generation • 7B • Updated Mar 14, 2024 • 3.7k • 52
Qwen/Qwen2.5-Coder-32B-Instruct

Text Generation • 33B • Updated Jan 12 • 73.6k • • 1.92k
huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated

Text Generation • 33B • Updated Nov 25, 2024 • 920 • 32

CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Paper • 2502.05664 • Published Feb 8 • 24

Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise

Paper • 2501.08331 • Published Jan 14 • 20
MangaNinja: Line Art Colorization with Precise Reference Following

Paper • 2501.08332 • Published Jan 14 • 61
GameFactory: Creating New Games with Generative Interactive Videos

Paper • 2501.08325 • Published Jan 14 • 68
DiffuEraser: A Diffusion Model for Video Inpainting

Paper • 2501.10018 • Published Jan 17 • 15

about 10 hours ago

FLAME: Factuality-Aware Alignment for Large Language Models

Paper • 2405.01525 • Published May 2, 2024 • 29
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23, 2024 • 42
Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27, 2024 • 55
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture

Paper • 2405.18991 • Published May 29, 2024 • 12

about 1 month ago

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published Dec 18, 2024 • 52
Training Software Engineering Agents and Verifiers with SWE-Gym

Paper • 2412.21139 • Published Dec 30, 2024 • 24
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published Dec 27, 2024 • 88
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation

Paper • 2408.00764 • Published Aug 1, 2024 • 1

Large Language Model (LLM) and NLP related papers.

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.12354 • Published Feb 19, 2024 • 6
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20, 2024 • 22
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20, 2024 • 13
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 70

CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Paper • 2502.05664 • Published Feb 8 • 24
AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation

Paper • 2312.13010 • Published Dec 20, 2023 • 6
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale

Paper • 2409.16299 • Published Sep 9, 2024 • 12
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI

Paper • 2505.19443 • Published May 26 • 15

CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Paper • 2502.05664 • Published Feb 8 • 24

Benchmarks and challenges

PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models

Paper • 2502.01584 • Published Feb 3 • 10
CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Paper • 2502.05664 • Published Feb 8 • 24
Craw4LLM: Efficient Web Crawling for LLM Pretraining

Paper • 2502.13347 • Published Feb 19 • 29
Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark

Paper • 2504.16427 • Published Apr 23 • 18

Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise

Paper • 2501.08331 • Published Jan 14 • 20
MangaNinja: Line Art Colorization with Precise Reference Following

Paper • 2501.08332 • Published Jan 14 • 61
GameFactory: Creating New Games with Generative Interactive Videos

Paper • 2501.08325 • Published Jan 14 • 68
DiffuEraser: A Diffusion Model for Video Inpainting

Paper • 2501.10018 • Published Jan 17 • 15

WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning

Paper • 2411.02337 • Published Nov 4, 2024 • 38
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models

Paper • 2411.04996 • Published Nov 7, 2024 • 52
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Paper • 2411.03562 • Published Nov 5, 2024 • 69
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Paper • 2410.08815 • Published Oct 11, 2024 • 49

about 10 hours ago

FLAME: Factuality-Aware Alignment for Large Language Models

Paper • 2405.01525 • Published May 2, 2024 • 29
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23, 2024 • 42
Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27, 2024 • 55
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture

Paper • 2405.18991 • Published May 29, 2024 • 12

Code Generation

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models

Paper • 2404.03543 • Published Apr 4, 2024 • 18
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

Paper • 2406.11931 • Published Jun 17, 2024 • 66
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents

Paper • 2407.18901 • Published Jul 26, 2024 • 35
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents

Paper • 2408.07060 • Published Aug 13, 2024 • 43

about 1 month ago

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published Dec 18, 2024 • 52
Training Software Engineering Agents and Verifiers with SWE-Gym

Paper • 2412.21139 • Published Dec 30, 2024 • 24
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published Dec 27, 2024 • 88
AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation

Paper • 2408.00764 • Published Aug 1, 2024 • 1

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

Paper • 2402.14658 • Published Feb 22, 2024 • 84
meta-llama/CodeLlama-7b-Instruct-hf

Text Generation • 7B • Updated Mar 14, 2024 • 3.7k • 52
Qwen/Qwen2.5-Coder-32B-Instruct

Text Generation • 33B • Updated Jan 12 • 73.6k • • 1.92k
huihui-ai/Qwen2.5-Coder-32B-Instruct-abliterated

Text Generation • 33B • Updated Nov 25, 2024 • 920 • 32

Large Language Model (LLM) and NLP related papers.

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.12354 • Published Feb 19, 2024 • 6
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20, 2024 • 22
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20, 2024 • 13
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 70

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs