kelechic

tensorkelechi

https://kelechi-c.github.io/

AI & ML interests

vision

Recent Activity

liked a model 12 days ago

cherrvak/topkautoencoder_baseline

liked a model 16 days ago

gytdau/clip-sae-128

liked a model about 2 months ago

nateraw/musicgen-songstarter-v0.2

tensorkelechi's activity

upvoted a paper about 2 months ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 188

upvoted 2 papers 3 months ago

Neural Vocoder is All You Need for Speech Super-resolution

Paper • 2203.14941 • Published Mar 28, 2022 • 1

MusicInfuser: Making Video Diffusion Listen and Dance

Paper • 2503.14505 • Published Mar 18 • 11

upvoted 2 articles 3 months ago

Open-Source Handwritten Signature Detection Model

Article • Mar 14 • 113

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Article • and 3 others • Mar 12 • 426

upvoted a paper 3 months ago

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

Paper • 2503.03983 • Published Mar 6 • 24

upvoted an article 3 months ago

Using LoRA for Efficient Stable Diffusion Fine-Tuning

Article • and 1 other • Jan 26, 2023 • 66

upvoted a collection 4 months ago

Qwen2.5

Collection

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Apr 28 • 616

upvoted a paper 4 months ago

SoundStorm: Efficient Parallel Audio Generation

Paper • 2305.09636 • Published May 16, 2023 • 13

upvoted a collection 4 months ago

CLAP: Contrastive Language-Audio Pretraining

Collection

CLAP is to audio what CLIP is to image. • 5 items • Updated Oct 31, 2023 • 11

upvoted an article 4 months ago

Design choices for Vision Language Models in 2024

Article • Apr 16, 2024 • 28

upvoted a paper 4 months ago

Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities

Paper • 2402.01831 • Published Feb 2, 2024 • 15

upvoted 2 articles 4 months ago

SmolVLM - small yet mighty Vision Language Model

Article • and 4 others • Nov 26, 2024 • 306

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

Article • and 2 others • Jan 23 • 180

upvoted a paper 4 months ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4 • 232

upvoted an article 4 months ago

State of open video generation models in Diffusers

Article • and 2 others • Jan 27 • 53

upvoted an article 5 months ago

Upgrading Kokoro: natural TTS for short bursts

Article • Nov 22, 2024 • 28

upvoted a paper 5 months ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 115

upvoted a collection 5 months ago

Cosmos Tokenizer

Collection

A suite of image and video tokenizers • 13 items • Updated about 15 hours ago • 40

upvoted a paper 5 months ago

MobileVLM V2: Faster and Stronger Baseline for Vision Language Model

Paper • 2402.03766 • Published Feb 6, 2024 • 15