
Victor Mustar (victor)

AI & ML interests

Building the UX of this website

Recent Activity

liked a model about 5 hours ago
rednote-hilab/dots.llm1.inst
liked a model about 5 hours ago
rednote-hilab/dots.llm1.base

Organizations

Hugging Face, Google, Competitions, Safetensors, 21 RNN, Spaces-explorers, Text Generation Inference, CVPR Demo Track, Spaces Examples, Hugging Chat, Webhooks Explorers (BETA), lora concepts library, Scanned Tokens, Huggingface Projects, hf admins, Hugging Face OSS Metrics, Stable Diffusion Dreambooth Concepts Library, Core ML Projects, temp-org, Blog-explorers, Mustarz, Open LLM Leaderboard, Enterprise Explorers, The Collectionists, ZeroGPU Explorers, Hugging Face Tools, TstOrg141, Stable Video benchmark, Social Post Explorers, Dev Mode Explorers, LLHF, SLLHF, Self-serve FTW, Inference Explorers, hf-inference, Transformers Community, Changelog, Tiny Agents, Moon AI, Providers Metrics

victor's activity

reacted to danielhanchen's post with 🔥 2 days ago
reacted to CultriX's post with 👍 2 days ago
Script for QA-style dataset generation from custom data:
Transform Your Personal Data into High-Quality Training Datasets with help from an LLM.

Inspired by a Reddit post (link below), I've created a script that converts custom documents into question-answer pairs for LLM fine-tuning.
What it does:
1. Chunking: splits the input data into chunks (this is important, more below!)
2. QA generation: creates contextually relevant question-answer pairs from each chunk
3. Quality assurance: validates outputs using both rule-based filters and LLM judges
4. Export: writes the dataset in both CSV and JSON formats

Key features:
- Separate model configurations for generation and evaluation
- Configurable chunk sizes and question/answer lengths
- Multi-language support (English and Dutch, but easy to add your own!)
- Local and cloud API compatibility

Quick start:
Place your documents (.txt for now) in an input folder and run:

python generate-rag-qav4.py \
  --input-dir ./rag-input/ \
  --output-dir ./rag-output/ \
  --output-filename finetuning_qa_dataset \
  --gen-model google/gemma-3-4b \
  --gen-api-base http://127.0.0.1:1234/v1 \
  --judge-model google/gemma-3-4b \
  --judge-api-base http://127.0.0.1:1234/v1 \
  --min-chunk-len 200 \
  --question-chars 20 \
  --answer-chars 5 \
  --lang en

Pro tip: The --min-chunk-len parameter is critical. Too short (< 150 chars) and questions lack context; too long (> 1000 chars) and the model struggles with focus. Start with 200-400 characters and adjust based on your content type!
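The chunking and rule-based filtering steps can be sketched roughly like this (a minimal illustration, not the actual script; the helper names and logic are hypothetical, mirroring the --min-chunk-len, --question-chars, and --answer-chars flags):

```python
# Hypothetical sketch of the chunking + rule-based QA filtering steps.
# The real generate-rag-qav4.py almost certainly differs in detail.

def chunk_text(text: str, min_chunk_len: int = 200) -> list[str]:
    """Split on blank lines, merging pieces until each chunk reaches
    min_chunk_len characters (mirrors --min-chunk-len)."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        current = (current + "\n\n" + para).strip() if current else para.strip()
        if len(current) >= min_chunk_len:
            chunks.append(current)
            current = ""
    if current:  # keep any trailing remainder as its own chunk
        chunks.append(current)
    return chunks

def passes_rule_filters(question: str, answer: str,
                        min_q_chars: int = 20, min_a_chars: int = 5) -> bool:
    """Rule-based QA filter (mirrors --question-chars / --answer-chars)."""
    return (len(question.strip()) >= min_q_chars
            and len(answer.strip()) >= min_a_chars
            and question.strip().endswith("?"))

text = "First paragraph about topic A.\n\nSecond paragraph about topic B."
print(chunk_text(text, min_chunk_len=30))
print(passes_rule_filters("What does the first paragraph cover?", "Topic A."))
```

LLM-judge validation would then run only on pairs that survive these cheap checks, which keeps the expensive judge calls to a minimum.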

Use cases:
- Personal knowledge base fine-tuning
- Domain-specific QA dataset creation
- RAG system training data preparation

Note: The script includes comprehensive error handling and progress tracking, and allows resuming progress should the process get interrupted.

Note2: Original Reddit post that gave me the idea:
https://www.reddit.com/r/LocalLLaMA/s/avkdzk8NSn

The script can be found here:
https://gist.github.com/CultriX-Github/9d53565214d56b12b9002a56230d1c00
reacted to Akhil-Theerthala's post with 👍 2 days ago
Kuvera v0.1.0 is now live!

A series of personal-finance advisor models that resolve queries by trying to understand the person's psychological state and relevant context.

These are still prototypes that have much room for improvement.

What’s included in this release:
- Akhil-Theerthala/Kuvera-8B-v0.1.0: Qwen3-8B, meticulously fine-tuned on approximately 20,000 personal-finance inquiries.
- Akhil-Theerthala/Kuvera-14B-v0.1.0: LoRA on DeepSeek-R1-Distill-Qwen-14B, honed through training on about 10,000 chain-of-thought queries.

For those interested, the models and datasets are accessible for free (links in the comments). If you are curious about the upcoming version's roadmap, let’s connect—there are many more developments I plan to make, and would definitely appreciate any help.
reacted to danaaubakirova's post with ❤️ 2 days ago
reacted to azettl's post with 🤗 2 days ago
Agents & MCP Hackathon Day 2

Again, a short night, but here are some updates from my Hackathon projects before starting night #3.

I managed to get the first version of both submissions (custom Gradio component and MCP server) online! 

You can check the roundtable MCP where multiple AIs discuss your question and try to reach consensus: https://huggingface.co/spaces/azettl/consilium_mcp.

The Gradio component is here: https://huggingface.co/spaces/azettl/gradio_consilium_roundtable.

I placed my API keys in the env variables so you can test without needing your own, but I will remove them soon, as I did not find a rate-limit setting in SambaNova. Even then, you can still test by adding your own keys in the config tab.

Looking forward to your feedback, there are still many days I can and will improve this.
reacted to abidlabs's post with 🔥 2 days ago
The Gradio x Agents x MCP hackathon keeps growing! We now have more than $1,000,000 in credits for participants and >$16,000 in cash prizes for winners.

We've kept registration open until the end of this week, so join and let's build cool stuff together as a community: ysharma/gradio-hackathon-registration-2025
reacted to m-ric's post with 🚀 2 days ago
If you didn't yet, you should read the technical report for SmolVLA, published yesterday by the Hugging Face robotics team!
➡️ Amongst other ideas, it introduces "Async inference" to boost their robot actions.

Robots have a problem: performing actions takes time (unlike software agents, where action execution is near-instant!).
Most often, robots wait until they've finished performing their current actions before starting to think about the next steps. This is a huge latency cost!

So the team decided to have the PolicyServer (aka the "thinking" part) restart early: instead of waiting for all n actions it just sent to complete, it gathers an observation after k < n steps and starts preparing the next action chunk while the remaining steps run, so the next chunk is ready to send immediately.

➡️ This boosted robot throughput by ~30%! (nearly 2× tasks per time window).
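The latency argument can be illustrated with back-of-the-envelope timing (a toy model with made-up numbers, not the actual SmolVLA implementation; only the overlap structure mirrors the idea): if planning takes t_plan and each of n action steps takes t_act, the synchronous loop pays t_plan on every cycle, while the async loop hides it behind the last n - k steps.

```python
# Toy timing model for synchronous vs. async policy inference.
# All constants below are illustrative, not from the SmolVLA report.

def sync_cycle_time(t_plan: float, t_act: float, n: int) -> float:
    """Plan, then execute n action steps, then plan again."""
    return t_plan + n * t_act

def async_cycle_time(t_plan: float, t_act: float, n: int, k: int) -> float:
    """Send an observation after k steps; planning overlaps the
    remaining n - k steps and only adds time if it overruns them."""
    overrun = max(0.0, t_plan - (n - k) * t_act)
    return n * t_act + overrun

t_plan, t_act, n, k = 0.30, 0.05, 10, 4
sync_t = sync_cycle_time(t_plan, t_act, n)       # planning blocks every cycle
async_t = async_cycle_time(t_plan, t_act, n, k)  # planning fully hidden here
print(f"throughput gain: {sync_t / async_t:.2f}x")
```

With these made-up numbers the async loop is 1.6x faster per cycle; the real gain depends on how t_plan compares to (n - k) * t_act.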

gg @cadene and team! 👏

Report here: SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics (2506.01844)
reacted to vincentg64's post with 🚀 2 days ago
A New Type of Non-Standard High Performance DNN with Remarkable Stability – https://mltblog.com/3SA3OJ1

I explore deep neural networks (DNNs) starting from the foundations, introducing a new type of architecture, as different from machine learning as it is from traditional AI. The original adaptive loss function, introduced here for the first time, leads to spectacular performance improvements via a mechanism called equalization.

To accurately approximate any response, rather than connecting neurons with linear combinations and activation between layers, I use non-linear functions without activation, reducing the number of parameters and leading to explainability, easier fine-tuning, and faster training. The adaptive equalizer, a dynamical subsystem of its own, eliminates the linear part of the model, focusing on higher-order interactions to accelerate convergence.

One example involves the Riemann zeta function. I exploit its well-known universality property to approximate any response. My system also handles singularities to deal with rare events or fraud detection. The loss function can be nowhere differentiable, such as Brownian motion. Many of the new discoveries are applicable to standard DNNs. Built from scratch, the Python code does not rely on any library other than Numpy. In particular, I do not use PyTorch, TensorFlow or Keras.

➡️ The PDF with many illustrations is available as paper 55, at https://mltblog.com/3EQd2cA. It also features the replicable Python code (with link to GitHub), the data generated by the code, the theory, and various options including for evaluation.

reacted to merve's post with 🚀 2 days ago
Yesterday was the day of vision language action models (VLAs)!

> SmolVLA: open-source small VLA for robotics by Hugging Face LeRobot team 🤖
Blog: https://huggingface.co/blog/smolvla
Model: lerobot/smolvla_base

> Holo-1: 3B & 7B web/computer use agentic VLAs by H Company 💻
Model family: Hcompany/holo1-683dd1eece7eb077b96d0cbd
Demo: https://huggingface.co/spaces/multimodalart/Holo1
Blog: https://huggingface.co/blog/Hcompany/holo1
super exciting times!!
reacted to ginipick's post with 🔥 3 days ago
🎨 FLUX VIDEO Generation - All-in-One AI Image/Video/Audio Generator

🚀 Introduction
FLUX VIDEO Generation is an all-in-one AI creative tool that generates images, videos, and audio from text prompts, powered by NVIDIA H100 GPU for lightning-fast processing!

ginigen/Flux-VIDEO

✨ Key Features
1️⃣ Text → Image → Video 🖼️➡️🎬

Generate high-quality images from Korean/English prompts
Transform still images into natural motion videos
Multiple size presets (Instagram, YouTube, Facebook, etc.)
Demo: 1-4 seconds / Full version: up to 60 seconds

2️⃣ Image Aspect Ratio Change 🎭

Freely adjust image aspect ratios
Expand images with outpainting technology
5 alignment options (Center, Left, Right, Top, Bottom)
Real-time preview functionality

3️⃣ Video + Audio Generation 🎵

Add AI-generated audio to videos
Korean prompt support (auto-translation)
Context-aware sound generation
Powered by MMAudio technology

🛠️ Tech Stack

Image Generation: FLUX, Stable Diffusion XL
Video Generation: TeaCache optimization
Audio Generation: MMAudio (44kHz high-quality)
Outpainting: ControlNet Union
Infrastructure: NVIDIA H100 GPU for ultra-fast generation

💡 How to Use

Select your desired tab
Enter your prompt (Korean/English supported!)
Adjust settings
Click generate button

🎯 Use Cases

📱 Social media content creation
🎥 YouTube Shorts/Reels
📊 Presentation materials
🎨 Creative artwork
🎵 Background sound generation
reacted to frascuchon's post with 👍 3 days ago
Hey! I built the RAG MCP Server Space, a simple Gradio MCP server for RAG systems that lets you retrieve relevant results without passing huge contexts to your LLM.

You can use this space to integrate with your agents and improve the efficiency of your search results. Feel free to try it out and let me know if you have any feedback or questions!

frascuchon/rag-mcp-server

Thanks for checking it out!
reacted to prithivMLmods's post with 👍 3 days ago
OpenAI, Google, Hugging Face, and Anthropic have released guides and courses on building agents, prompting techniques, scaling AI use cases, and more. Below are 10+ minimalistic guides and courses that may help you in your progress. 📖

⤷ Agents Companion : https://www.kaggle.com/whitepaper-agent-companion
⤷ Building Effective Agents : https://www.anthropic.com/engineering/building-effective-agents
⤷ Guide to building agents by OpenAI : https://cdn.openai.com/business-guides-and-resources/a-practical-guide-to-building-agents.pdf
⤷ Prompt engineering by Google : https://www.kaggle.com/whitepaper-prompt-engineering
⤷ Google: 601 real-world gen AI use cases : https://cloud.google.com/transform/101-real-world-generative-ai-use-cases-from-industry-leaders
⤷ Prompt engineering by IBM : https://www.ibm.com/think/topics/prompt-engineering-guide
⤷ Prompt Engineering by Anthropic : https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview
⤷ Scaling AI use cases : https://cdn.openai.com/business-guides-and-resources/identifying-and-scaling-ai-use-cases.pdf
⤷ Prompting Guide 101 : https://services.google.com/fh/files/misc/gemini-for-google-workspace-prompting-guide-101.pdf
⤷ AI in the Enterprise by OpenAI : https://cdn.openai.com/business-guides-and-resources/ai-in-the-enterprise.pdf

by HF🤗 :
⤷ AI Agents Course by Huggingface : https://huggingface.co/learn/agents-course/unit0/introduction
⤷ Smol-agents Docs : https://huggingface.co/docs/smolagents/en/tutorials/building_good_agents
⤷ MCP Course by Huggingface : https://huggingface.co/learn/mcp-course/unit0/introduction
⤷ Other Course (LLM, Computer Vision, Deep RL, Audio, Diffusion, Cookbooks, etc..) : https://huggingface.co/learn
reacted to codelion's post with 🚀 4 days ago
🧠 We just implemented Andrej Karpathy's "third paradigm" for LLM learning!

System Prompt Learning (SPL) enables LLMs to automatically learn problem-solving strategies from experience, rather than relying on static prompts.

🚀 How it works:
Your LLM builds a database of effective strategies, selects the best ones for each problem, and refines them over time based on success rates.

📊 Results across math benchmarks:
Arena Hard: 29% → 37.6% (+8.6%)
AIME24: 23.33% → 30% (+6.67%)
OptILLMBench: 61% → 65% (+4%)

The best part? All strategies are human-readable and the system gets progressively better at problem types you use frequently.

✨ Key benefits:
🔄 Cumulative learning over time
📖 Transparent, inspectable strategies
🔌 Works with any OpenAI-compatible API
⚡ Simple integration: just add "spl-" prefix to your model

Built as an open-source plugin in optillm. After 500 queries, our system developed 129 strategies and refined 97 of them!
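If I read the integration note right, using SPL through optillm's OpenAI-compatible endpoint is just a matter of prefixing the model name; a hedged sketch of the request payload, with the model id and proxy URL as placeholders:

```python
# Sketch of a chat-completions request routed through the SPL plugin.
# The "spl-" prefix is what activates System Prompt Learning in optillm;
# the base model id and proxy URL below are placeholders.
import json

payload = {
    "model": "spl-gpt-4o-mini",  # "spl-" prefix + your usual model id
    "messages": [
        {"role": "user", "content": "Solve: if 3x + 5 = 20, what is x?"}
    ],
}
body = json.dumps(payload)
# POST this body to the optillm proxy, e.g.
# http://localhost:8000/v1/chat/completions
print(payload["model"].startswith("spl-"))
```

Everything else about the request stays standard OpenAI chat-completions format, which is why it works with any compatible client.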

This feels like a genuine step toward AI that learns from experience while staying completely interpretable.

🔗 GitHub: https://github.com/codelion/optillm/tree/main/optillm/plugins/spl
📖 Full article: https://huggingface.co/blog/codelion/system-prompt-learning
🐦 Original Karpathy tweet: https://x.com/karpathy/status/1921368644069765486

Have you experimented with advanced system prompting? What strategies would you want your LLM to learn?
reacted to Kseniase's post with 🚀 4 days ago
13 Awesome MCP Servers

MCP changed how agents connect with tools.

After writing the most-read explanation of MCP on Hugging Face (https://huggingface.co/blog/Kseniase/mcp), we chose these 13 awesome MCP servers that you can work with:

1. Agentset MCP -> https://github.com/agentset-ai/mcp-server
For efficient and quick building of intelligent, doc-based apps using open-source Agentset platform for RAG

2. GitHub MCP Server -> https://github.com/github/github-mcp-server
Integrates GitHub APIs into your workflow, allowing you to build AI tools and apps that interact with GitHub's ecosystem

3. arXiv MCP -> https://github.com/andybrandt/mcp-simple-arxiv
Allows working with research papers on arXiv through effective search and access to their metadata, abstracts, and links

4. MCP Run Python -> https://github.com/pydantic/pydantic-ai/tree/main/mcp-run-python
Lets you run Python code in a sandbox via Pyodide in Deno, so it is isolated from the rest of the operating system

5. Safe Local Python Executor -> https://github.com/maxim-saplin/mcp_safe_local_python_executor
A lightweight tool for running LLM-generated Python code locally, using Hugging Face’s LocalPythonExecutor (from smolagents framework) and exposing it via MCP for AI assistant integration

6. Cursor MCP Installer -> https://github.com/matthewdcage/cursor-mcp-installer
Automatically adds MCP servers to Cursor for development convenience

7. Basic Memory -> https://memory.basicmachines.co/docs/introduction
This knowledge management system connects to LLMs and lets you build a persistent semantic graph from your conversations with AI agents
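Under the hood, all of these servers speak the same JSON-RPC 2.0 protocol; a minimal sketch of the message shapes involved (abbreviated and illustrative; method names follow the MCP specification, the tool name is made up, and a real client would also negotiate capabilities):

```python
# Minimal illustration of MCP's JSON-RPC 2.0 message shapes.
import json

# Client opens the session and declares its protocol version.
initialize = {
    "jsonrpc": "2.0", "id": 1, "method": "initialize",
    "params": {"protocolVersion": "2024-11-05",
               "clientInfo": {"name": "demo-client", "version": "0.1"},
               "capabilities": {}},
}
# Client asks the server which tools it exposes.
list_tools = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
# Client invokes one tool by name ("search_papers" is hypothetical).
call_tool = {
    "jsonrpc": "2.0", "id": 3, "method": "tools/call",
    "params": {"name": "search_papers",
               "arguments": {"query": "state space models"}},
}
for msg in (initialize, list_tools, call_tool):
    print(json.dumps(msg)[:60])
```

This shared wire format is why any MCP client (Cursor, Claude Desktop, custom agents) can talk to any of the servers above without per-server glue code.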

Read further in the comments 👇

If you like it, also subscribe to the Turing Post: https://www.turingpost.com/subscribe
reacted to clem's post with 🔥 4 days ago
Today, we're unveiling two new open-source AI robots! HopeJR for $3,000 & Reachy Mini for $300 🤖🤖🤖

Let's go open-source AI robotics!
reacted to ProCreations's post with 🚀 4 days ago
60 followers,
yay
reacted to jeffboudier's post with 🚀 9 days ago
reacted to openfree's post with 🔥 9 days ago
🧠 AI Brand Naming with 15 Specialized Theories

🎯 Core Features
15 Expert Theories: professional brand naming
Bilingual Support: Korean/English for global brands
Unified Evaluation System: creativity/memorability/relevance scores
Real-time Visualization: theory-specific custom designs

openfree/Naming

🔬 Applied Theories
Cognitive Theories (4)
🟦 Square Theory - Semantic square structure with 4-word relationships
🔊 Sound Symbolism - Psychological connections between phonemes and meaning
🧠 Cognitive Load - Minimized processing for instant recognition
👁️ Gestalt Theory - Perceptual principles where whole exceeds parts

Creative Theories (3)
🔀 Conceptual Blending - Merging concepts to create new meanings
🔧 SCAMPER Method - 7 creative transformation techniques
🌿 Biomimicry - Nature-inspired wisdom from 3.8 billion years of evolution

Strategic Theories (2)
✅ Jobs-to-be-Done - Customer-centric problem-solving focus
💭 Design Thinking - Human-centered innovation methodology

Cultural Theories (3)
🎭 Jung's Archetype - 12 universal archetypes for emotional connection
🌐 Linguistic Relativity - Cross-cultural thinking patterns consideration
🧬 Memetics - Cultural transmission and evolutionary potential

Differentiation Theories (3)
⚡ Von Restorff Effect - Uniqueness for 30x better recall
🎨 Color Psychology - Emotional associations and color meanings
🌍 Network Effects - Value maximization through network structures

💫 Special Features
Each theory provides unique visualizations and customized analysis:

Square Theory → 4-corner relationship diagram
Blending → Concept fusion flowchart
Color → Interactive color palette display
Theory-specific insights for each approach

🎨 Output Information
Core: Brand name, slogan, values, emotions, personality
Visual: Colors, concepts, typography styles
Linguistic: Pronunciation, etymology, global adaptability
Strategic: Differentiation, positioning, growth potential
Theory-specific...
reacted to fdaudens's post with 🤗 9 days ago
🎵 Dream come true for content creators! TIGER AI can extract voice, effects & music from ANY audio file 🤯
This lightweight model uses frequency band-split technology to separate speech like magic. Kudos to @fffiloni for the amazing demo! fffiloni/TIGER-audio-extraction
reacted to ProCreations's post with 🚀 10 days ago
Eyyyy 50 followers 🤯