Collections

Discover the best community collections!

Collections including paper arxiv:2503.14456

about 17 hours ago

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 29
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 13
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 153

RuCCoD: Towards Automated ICD Coding in Russian

Paper • 2502.21263 • Published Feb 28 • 133
Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published Mar 7 • 124
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Paper • 2503.05179 • Published Mar 7 • 47
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

Paper • 2503.05592 • Published Mar 7 • 27

interesting papers

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published Feb 7 • 149
Agency Is Frame-Dependent

Paper • 2502.04403 • Published Feb 6 • 23
Distillation Scaling Laws

Paper • 2502.08606 • Published Feb 12 • 49
LLM Pretraining with Continuous Concepts

Paper • 2502.08524 • Published Feb 12 • 29

RWKV7 models 🪿

fla-hub/rwkv7-7.2B-g0

Text Generation • 7B • Updated 15 days ago • 124 • 2
fla-hub/rwkv7-2.9B-g1

Text Generation • 3B • Updated 15 days ago • 164 • 1
fla-hub/rwkv7-2.9B-world

Text Generation • 3B • Updated May 7 • 56 • 4
fla-hub/rwkv7-1.5B-g1

Text Generation • 2B • Updated 15 days ago • 45 • 1

interesting architecture

FAN: Fourier Analysis Networks

Paper • 2410.02675 • Published Oct 3, 2024 • 28
Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published Jan 11 • 89
Scalable-Softmax Is Superior for Attention

Paper • 2501.19399 • Published Jan 31 • 22
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling

Paper • 2502.09509 • Published Feb 13 • 8

Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders

Paper • 2503.03601 • Published Mar 5 • 233
Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 168
RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 153
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Paper • 2503.11647 • Published Mar 14 • 144

RWKV-7 Goose related resources.

Goose-World/RWKV-World-v3

Viewer • Updated Apr 28 • 1.1M • 248 • 2
BlinkDL/rwkv-7-world

Text Generation • Updated May 31 • 104
BlinkDL/rwkv-7-pile

Updated Dec 19, 2024 • 16
RWKV 7 🌏

Running • 2 • best foundation model for its size!

Research Papers/Reviews/Literature

Daily Research papers and review including older relevant content.

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Paper • 2501.18585 • Published Jan 30 • 61
RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 153
DeepMesh: Auto-Regressive Artist-mesh Creation with Reinforcement Learning

Paper • 2503.15265 • Published Mar 19 • 47
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Paper • 2503.15558 • Published Mar 18 • 51

Interesting shit 1

hexgrad/Kokoro-82M

Text-to-Speech • Updated Apr 10 • 1.8M • 4.89k
RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 153
