Collections
Collections including paper arxiv:2406.09415

- Depth Anything V2
  Paper • 2406.09414 • Published • 104
- An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
  Paper • 2406.09415 • Published • 52
- Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion
  Paper • 2406.04338 • Published • 40
- SAM 2: Segment Anything in Images and Videos
  Paper • 2408.00714 • Published • 116

- Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
  Paper • 2402.14848 • Published • 20
- The Prompt Report: A Systematic Survey of Prompting Techniques
  Paper • 2406.06608 • Published • 66
- CRAG -- Comprehensive RAG Benchmark
  Paper • 2406.04744 • Published • 49
- Transformers meet Neural Algorithmic Reasoners
  Paper • 2406.09308 • Published • 45

- An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
  Paper • 2406.09415 • Published • 52
- 4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
  Paper • 2406.09406 • Published • 15
- VideoGUI: A Benchmark for GUI Automation from Instructional Videos
  Paper • 2406.10227 • Published • 9
- What If We Recaption Billions of Web Images with LLaMA-3?
  Paper • 2406.08478 • Published • 42

- EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
  Paper • 2402.04252 • Published • 29
- Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
  Paper • 2402.03749 • Published • 13
- ScreenAI: A Vision-Language Model for UI and Infographics Understanding
  Paper • 2402.04615 • Published • 44
- EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
  Paper • 2402.05008 • Published • 23