huggingface-projects (Huggingface Projects)

AdinaY

posted an update 13 minutes ago

Post

RedNote 小红书 just released their first LLM 🔥

dots.llm1.base 🪐 a 142B MoE model with only 14B active params.

rednote-hilab/dotsllm1-68246aaaaba3363374a8aa7c
✨ Base & Instruct - MIT license
✨ Trained on 11.2T non-synthetic high-quality data
✨ Competitive with Qwen2.5/3 on reasoning, code, alignment
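
For a rough sense of what "142B MoE with only 14B active params" means, here is a small back-of-the-envelope sketch in Python. The helper name `active_fraction` is made up for illustration and is not part of any library; the two figures are the ones quoted in the post.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Return the share of parameters used per token in an MoE model."""
    if total_params_b <= 0 or not 0 <= active_params_b <= total_params_b:
        raise ValueError("expected total > 0 and 0 <= active <= total")
    return active_params_b / total_params_b


# Figures quoted in the post: 142B total parameters, 14B active per token.
ratio = active_fraction(142, 14)
print(f"{ratio:.1%} of parameters are active per token")
```

So roughly one parameter in ten takes part in each forward pass, which is where the inference savings of a sparse model come from.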

AdinaY

posted an update 14 minutes ago

Post

MiniCPM4🔥 efficient LLMs built for end-side devices, by OpenBMB

openbmb/minicpm4-6841ab29d180257e940baa9b

✨ Apache 2.0
✨ 5–7× Faster Inference (Jetson Orin & RTX 4090)
✨ 8B trained on 8T clean, non-synthetic tokens
✨ 32K Native Context -> 128K+ with InfLLM v2 + LongRoPE
✨ Runs on 🤗 Transformers, CPM.cu, vLLM, and SGLang

ThomasSimonini

updated a dataset about 1 hour ago

huggingface-projects/Deep-RL-Course-Certification

Viewer • Updated about 1 hour ago • 1.45k • 207 • 13

jbilcke-hf

posted an update about 6 hours ago

Post

66

Did you know that there is a UI wrapper around https://github.com/a-r-r-o-w/finetrainers, a great library made by @a-r-r-o-w for finetuning AI video models?

The UI is called VideoModelStudio (or VMS in casual chat)

All you have to do is to duplicate this space:
jbilcke-hf/VideoModelStudio

jbilcke-hf

posted an update about 7 hours ago

Post

44

Hi everyone,

I've seen some unsuccessful attempts at running Wan2GP inside a Hugging Face Space, which is a shame as it is a great Gradio app!

So here is a fork that you can use, with some instructions on how to do this:

jbilcke-hf/Wan2GP_you_must_clone_this_space_to_use_it#1

Note: some things like persistent models/storage/custom LoRAs might not fully work out of the box. If you need those, you might have to dig into the Wan2GP codebase to see how to tweak the storage folder. Happy hacking!

AdinaY

posted an update 1 day ago

Post

1256

New models from Qwen 🔥

Qwen3-Embedding and Qwen3-Reranker Series were just released on the Hub by the Alibaba Qwen team.

✨ 0.6B / 4B / 8B with Apache 2.0
✨ Supports 119 languages 🤯
✨ Top-tier performance: leading the MTEB multilingual leaderboard!

Reranker:
Qwen/qwen3-reranker-6841b22d0192d7ade9cdefea
Embedding:
Qwen/qwen3-embedding-6841b2055b99c44d9a4c371f

merve

posted an update 1 day ago

Post

1502

Qwen2.5-Omni is soooo good that people build multimodal reasoning models off of it 🥹
> KE-Team/Ke-Omni-R-3B is an open-source audio reasoning model, SOTA on average across benchmarks, based on Qwen/Qwen2.5-Omni-3B 🗣️
> Haoz0206/Omni-R1 is a video reasoning model with pixel-level grounding (see below) and it's super competitive ⏯️ based on Qwen/Qwen2.5-Omni-7B

AdinaY

posted an update 2 days ago

Post

1319

OpenAudio S1-mini 🔊 a new OPEN multilingual TTS model trained on 2M+ hours of data, by FishAudio

fishaudio/openaudio-s1-mini

✨ Supports 14 languages
✨ 50+ emotions & tones
✨ RLHF-optimized
✨ Special effects: laughing, crying, shouting, etc.

Xenova

posted an update 2 days ago

Post

2087

NEW: Real-time conversational AI models can now run 100% locally in your browser! 🤯

🔐 Privacy by design (no data leaves your device)
💰 Completely free... forever
📦 Zero installation required, just visit a website
⚡️ Blazingly-fast WebGPU-accelerated inference

Try it out: webml-community/conversational-webgpu

For those interested, here's how it works:
- Silero VAD for voice activity detection
- Whisper for speech recognition
- SmolLM2-1.7B for text generation
- Kokoro for text to speech

Powered by Transformers.js and ONNX Runtime Web! 🤗 I hope you like it!
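
To make the four-stage cascade listed above concrete, here is a minimal Python sketch of one conversational turn. It is not the demo's actual code (the demo runs in the browser with Transformers.js); the `respond` function, the stage type aliases, and the `fake_*` stand-ins are all hypothetical and exist only to show how the stages chain together.

```python
from typing import Callable, Optional

# Hypothetical stage signatures. In a real system these would wrap
# Silero VAD, Whisper, SmolLM2-1.7B and Kokoro respectively.
Vad = Callable[[bytes], bool]
Asr = Callable[[bytes], str]
Llm = Callable[[str], str]
Tts = Callable[[str], bytes]


def respond(audio: bytes, vad: Vad, asr: Asr, llm: Llm, tts: Tts) -> Optional[bytes]:
    """Run one turn of the cascade; return None when no speech is detected."""
    if not vad(audio):
        return None              # silence: skip the expensive stages
    transcript = asr(audio)      # speech -> text
    reply = llm(transcript)      # text -> text
    return tts(reply)            # text -> speech


# Stand-in stages so the sketch runs without any models.
def fake_vad(audio: bytes) -> bool:
    return len(audio) > 0


def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


print(respond(b"hello", fake_vad, fake_asr, fake_llm, fake_tts))
```

Putting the voice activity check first means the heavier models only run when someone is actually speaking.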

merve

posted an update 3 days ago

Post

1402

Past week was insanely packed for open AI! 😱
Luckily we picked some highlights for you ❤️ lfg!

💬 LLMs/VLMs
> DeepSeek 🐳 released deepseek-ai/DeepSeek-R1-0528, a 671B MoE model with 37B active params, only 0.2 and 1.4 points behind o3 on AIME 24/25 🤯 They also released an 8B distilled version based on Qwen3 (OS): deepseek-ai/deepseek-r1-678e1e131c0169c0bc89728d
> Xiaomi released MiMo-7B-RL (LLM for code and math) and MiMo-VL-7B-RL (VLM for visual reasoning, GUI agentic task and general use) (OS) 😍 XiaomiMiMo/mimo-vl-68382ccacc7c2875500cd212
> NVIDIA released nvidia/Nemotron-Research-Reasoning-Qwen-1.5B, a new reasoning model
> DS: MiniMax released https://huggingface.co/MiniMaxAI/SynLogic, a new dataset of 49k logical reasoning examples across 35 tasks, including cipher solving, sudoku and more!

🖼️ Image/Video Generation
> Tencent released tencent/HunyuanPortrait, a new model for consistent portrait generation with SVD Research license. They also released tencent/HunyuanVideo-Avatar for audio-driven avatar generation (OS)
> showlab released showlab/OmniConsistency, consistent stylization model (OS)
> Rapidata/text-2-video-human-preferences-veo3 is a new T2V preference dataset based on videos from Veo3 with 46k examples (OS)

Audio🗣️
> https://huggingface.co/ResembleAI/Chatterbox is a new 500M text-to-speech model preferred over ElevenLabs (OS) 😍
> PlayHT/PlayDiffusion is a new speech editing model (OS)

Other
> https://huggingface.co/NX-AI/TiReX is a new time series foundation model
> Yandex released a huge (4.79B examples!) video recommendation dataset https://huggingface.co/yandex/yambda

OS ones have Apache 2.0 or MIT licenses. Find more models and datasets here: merve/releases-30-may-6840097345e0b1e915bff843

AdinaY

posted an update 3 days ago

Post

935

AReaL-boba² 🔥 A fully async RL system by Ant Research & Tsinghua.

Paper: AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning (2505.24298)
Model:
inclusionAI/areal-boba-2-683f0e819ccb7bb2e1b2f2d5

✨ 8B/14B/32B models, datasets & paper – all on the hub
✨ 2.77× faster training
✨ Native Agentic RL support

merve

posted an update 3 days ago

Post

1321

Yesterday was the day of vision language action models (VLAs)!

> SmolVLA: open-source small VLA for robotics by Hugging Face LeRobot team 🤖
Blog: https://huggingface.co/blog/smolvla
Model: lerobot/smolvla_base

> Holo-1: 3B & 7B web/computer use agentic VLAs by H Company 💻
Model family: Hcompany/holo1-683dd1eece7eb077b96d0cbd
Demo: https://huggingface.co/spaces/multimodalart/Holo1
Blog: https://huggingface.co/blog/Hcompany/holo1
super exciting times!!

merve

posted an update 4 days ago

Post

343

H Company released Holo-1: 3B and 7B GUI Action Vision Language Models for various web and computer agent tasks 🤗

Holo-1 has Apache 2.0 license and transformers support from day-0 🔥
> Read the blog: https://huggingface.co/blog/Hcompany/holo1
> Model repositories: Hcompany/holo1-683dd1eece7eb077b96d0cbd

AdinaY

posted an update 4 days ago

Post

832

SynLogic 🧠 logical reasoning model & dataset by MiniMax.

MiniMaxAI/synlogic-6836c3246fca0277657ff032

✨ 3 models: 7B/32B/ Mix-3-32B (MIT license)
✨ Dataset: 35 verifiable logic tasks (Sudoku, Cipher, Arrow Maze etc.)
✨ RL training with auto-verifiable rewards
✨ Generalizes to math without explicit math training
✨ +6 pts on BBEH, +9.5 on KOR-Bench vs baselines
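
As an illustration of what an "auto-verifiable reward" can look like for a task such as Sudoku, here is a minimal Python sketch. It is an assumption-laden toy, not MiniMax's verifier: the function names, the 0-for-blank puzzle encoding, and the binary 1.0/0.0 reward are all choices made for this example.

```python
from typing import List

Grid = List[List[int]]  # 9x9 grid; 0 marks a blank cell in a puzzle


def is_valid_solution(grid: Grid) -> bool:
    """Check that every row, column and 3x3 box contains 1..9 exactly once."""
    target = set(range(1, 10))
    if len(grid) != 9 or any(len(row) != 9 for row in grid):
        return False
    rows_ok = all(set(row) == target for row in grid)
    cols_ok = all({grid[r][c] for r in range(9)} == target for c in range(9))
    boxes_ok = all(
        {grid[br + r][bc + c] for r in range(3) for c in range(3)} == target
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    )
    return rows_ok and cols_ok and boxes_ok


def sudoku_reward(puzzle: Grid, answer: Grid) -> float:
    """Return 1.0 if the answer is valid and keeps the puzzle's givens, else 0.0."""
    if not is_valid_solution(answer):
        return 0.0
    keeps_givens = all(
        puzzle[r][c] in (0, answer[r][c]) for r in range(9) for c in range(9)
    )
    return 1.0 if keeps_givens else 0.0


# Build a known-valid grid from a shifting pattern, then blank two cells.
solution = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
puzzle = [row[:] for row in solution]
puzzle[0][0] = 0
puzzle[4][4] = 0

print(sudoku_reward(puzzle, solution))
```

Because the check is purely programmatic, the reward needs no human labels or learned reward model, which is the appeal of verifiable tasks for RL training.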

andito

authored a paper 4 days ago

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published 5 days ago • 74

AdinaY

posted an update 4 days ago

Post

1617

Video-XL-2 🔥 long video understanding model by BAAI & Shanghai Jiao Tong University

BAAI/Video-XL-2

✨ Apache 2.0
✨ Handles up to 10,000+ frames on a single GPU
✨ 2048-frame encoding in just 12s
✨ Efficient Chunk-based Prefilling & Bi-granularity KV decoding

merve

posted an update 5 days ago

Post

473

ColQwen2 just landed in transformers main 😍 vidore/colqwen2-v1.0-hf

Use the state-of-the-art visual document retrieval model ColQwen2 for your PDF retrieval or RAG pipelines 🎉

Here's a notebook to try right away: https://colab.research.google.com/drive/11_Vp6wB5RcQgK1MHt2M9On07EYXHH5E-?usp=sharing
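
For intuition about how ColBERT-style retrievers such as ColQwen2 score a page against a query, here is a small pure-Python sketch of late-interaction (MaxSim) scoring. It does not use the transformers API; the function names and the toy 2-D embeddings are invented for illustration, and real models produce many high-dimensional vectors per query and per page.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best match among the document's vectors."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank_pages(query_vecs: Sequence[Vector], pages: Sequence[Sequence[Vector]]) -> List[int]:
    """Return page indices sorted from highest to lowest MaxSim score."""
    scores = [maxsim_score(query_vecs, page) for page in pages]
    return sorted(range(len(pages)), key=lambda i: scores[i], reverse=True)


# Toy embeddings: two query tokens, two candidate pages.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
page_b = [[0.5, 0.5], [0.4, 0.6]]

print(rank_pages(query, [page_b, page_a]))
```

Each query token only has to find one good match somewhere on the page, which is why this style of scoring works well for documents where the relevant text sits in a small region.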

AdinaY

posted an update 5 days ago

Post

2114

May highlights from China’s open source ecosystem 🔥

zh-ai-community/may-2025-open-works-from-the-chinese-community-681a3494145f2914dc679b7c

✨ DeepSeek dropped R1 updates
- Both R1 & an 8B distilled smol model

✨ Bytedance goes big on open source:
- BAGEL, Dolphin, Seedcoder, DreamO...

✨ Multimodal is on fire!
- HunyuanCustom / HunyuanVideo-Avatar / HunyuanPortrait
- MiniMax: SynLogic / Orsta-7B
- Xiaomi: MiMo VL
- Alibaba Wan: Wan2.1-VACE
- OpenGVLab: ZeroGUI
- StepFun: ACE-Step-v1/Step1X-3D

✨ Specialized models/datasets excel
- Alibaba Qwen: World PM 72B
- BAAI: RoboBrain (MLLM for robotics)
- HiThink Research: BizFinBench (dataset)
- OpenBMB: Ultra FineWeb (dataset)
- Bilibili: Index-anisora (Anime/ACG)
- Skywork: Matrix-Game (game)

More awesome releases: Alibaba QwenLong-L1-32B, SkyWork OR1, OpenS2V-5M etc...

merve

posted an update 6 days ago

Post

1104

New GUI model by Salesforce AI & Uni HK: Jedi
tianbaoxiexxx/Jedi xlangai/Jedi-7B-1080p 🤗
Based on Qwen2.5-VL with Apache 2.0 license

prompt with below screenshot → select "find more"

merve

posted an update 8 days ago

Post

1944

HOT: MiMo-VL new 7B vision LMs by Xiaomi surpassing gpt-4o (Mar), competitive in GUI agentic + reasoning tasks ❤️‍🔥 XiaomiMiMo/mimo-vl-68382ccacc7c2875500cd212

not only that, but also MIT license & usable with transformers 🔥

Huggingface Projects

AI & ML interests

Recent Activity

huggingface-projects's activity

Team members 24
