---
title: SmartManuals-AI
emoji: 🧠
colorFrom: indigo
colorTo: blue
sdk: gradio
sdk_version: 5.30.0
app_file: app.py
pinned: false
license: apache-2.0
tags:
  - RAG
  - LLM
  - Chroma
  - Gradio
  - OCR
  - HuggingFace
  - PDF
  - Word
  - SemanticSearch
  - SmartManualsAI
---
# SmartManuals-AI for Hugging Face Spaces
**SmartManuals-AI** is a local-first document QA system that uses retrieval-augmented generation (RAG), OCR, and semantic embeddings to answer technical questions from equipment manuals, service guides, and parts catalogs.

This app is optimized for Hugging Face Spaces and requires no user upload: just preload your manuals in a `Manuals/` folder.
## Features

- Ask natural-language questions against your own manuals
- Supports both PDF and Word (`.docx`) documents
- Uses `sentence-transformers` for semantic search
- Indexes chunks in ChromaDB (stored locally)
- Generates answers via Hugging Face models (default: Meta LLaMA 3.1 8B Instruct)
- Clean Gradio interface for querying
## Folder Structure

```
SmartManuals-AI/
├── app.py            # Main Hugging Face app
├── Manuals/          # Place your PDF and DOCX manuals here
│   ├── OM_Treadmill.pdf
│   └── Parts_Bike.docx
├── chroma_store/     # Vector database (auto-generated)
├── requirements.txt  # Dependencies
└── README.md         # This file
```
## Usage on Hugging Face Spaces

### Environment Variable

Add this secret in your Space settings:

| Name | Value |
|---|---|
| `HF_TOKEN` | Your Hugging Face token |

> **Note:** You must accept model licenses on the Hugging Face Hub before using gated models like `Llama-3.1-8B-Instruct`.
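Inside the Space, the secret arrives as an ordinary environment variable. A minimal sketch of reading it (the helper name is illustrative, not taken from `app.py`; failing fast here gives a clearer error than a failed gated-model download later):

```python
import os

def get_hf_token() -> str:
    """Read the HF_TOKEN secret injected by the Space settings.

    Illustrative helper: only the variable name HF_TOKEN comes from this README.
    """
    token = os.environ.get("HF_TOKEN", "")
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; add it as a secret in your Space settings."
        )
    return token
```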
## Uploading Manuals

- Upload your PDF and Word documents directly to the `Manuals/` folder in your Space repository.
- No need for file uploads via the interface.
## How It Works

On app startup:

1. Text is extracted from PDFs (with OCR fallback) and `.docx` Word files
2. Sentences are cleaned, chunked, and embedded with `all-MiniLM-L6-v2`
3. Chunks are stored in a local ChromaDB vector database
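The cleaning-and-chunking step above can be sketched in plain Python. This is an illustrative sketch, not `app.py`'s implementation: the function name, chunk size, and overlap are assumptions; in the real app each chunk would then be embedded with `all-MiniLM-L6-v2` and written to ChromaDB.

```python
import re

def chunk_text(text: str, max_words: int = 150, overlap: int = 30) -> list[str]:
    """Clean raw manual text and split it into overlapping word chunks.

    Sketch only: sizes and overlap are assumed defaults, not app.py's values.
    """
    # Collapse whitespace runs and OCR line-break debris into single spaces.
    cleaned = re.sub(r"\s+", " ", text).strip()
    words = cleaned.split(" ")
    chunks = []
    step = max_words - overlap  # overlapping windows keep context across cuts
    for start in range(0, len(words), step):
        chunk = " ".join(words[start:start + max_words])
        if chunk:
            chunks.append(chunk)
        if start + max_words >= len(words):
            break
    return chunks
```

Overlapping windows are a common RAG choice: a sentence split across a chunk boundary still appears whole in the neighboring chunk.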
At query time:

1. Your question is embedded and semantically compared against the stored chunks
2. The most relevant chunks are passed to the LLM
3. The LLM (LLaMA 3.1) generates a focused answer from the retrieved context only
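The semantic comparison boils down to cosine similarity between the question embedding and each chunk embedding. A minimal pure-Python sketch (in the app the vectors come from `all-MiniLM-L6-v2` and the nearest-neighbor search is handled by ChromaDB; `top_k_chunks` is a hypothetical helper, not `app.py`'s API):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Return the k chunks whose embeddings are most similar to the query."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]
```

The selected chunks are then pasted into the LLM prompt as context, which is what keeps the answer grounded in the manuals rather than the model's general knowledge.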
## Default Model

- This app uses `meta-llama/Llama-3.1-8B-Instruct`
- More models are supported behind the scenes (e.g., Mistral, Gemma)
- No need to manually pick models, document types, or categories
## Supported File Types

- ✅ PDF (`.pdf`), with OCR fallback using Tesseract
- ✅ Word documents (`.docx`)
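Routing a file to the right extractor can be sketched by dispatching on its extension. The loader callables here are hypothetical placeholders: the real app would use a PDF library with a Tesseract OCR fallback for scanned pages, and a `.docx` reader for Word files.

```python
from pathlib import Path
from typing import Callable

def pick_loader(path: str,
                pdf_loader: Callable[[str], str],
                docx_loader: Callable[[str], str]) -> Callable[[str], str]:
    """Choose a text extractor for a manual based on its file extension.

    Sketch only: pdf_loader/docx_loader stand in for PDF-with-OCR-fallback
    and Word (.docx) extraction.
    """
    suffix = Path(path).suffix.lower()  # case-insensitive: .PDF == .pdf
    if suffix == ".pdf":
        return pdf_loader
    if suffix == ".docx":
        return docx_loader
    raise ValueError(f"Unsupported file type: {suffix or path}")
```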
## Local Development

Clone and run locally:

```bash
git clone https://github.com/damoojeje/SmartManuals-AI.git
cd SmartManuals-AI
pip install -r requirements.txt
python app.py
```
Place your manuals inside the `Manuals/` directory before running.
## Created By

**Damilare Eniolabi**
Email: damilareeniolabi@gmail.com
GitHub: [@damoojeje](https://github.com/damoojeje)
## Tags

RAG · LLM · Gradio · ChromaDB · OCR · SemanticSearch · PDF · Word · SmartManualsAI · EquipmentQA