damoojeje committed · Commit d0bba59 · verified · 1 Parent(s): fcbea64

Update README.md

  1. README.md +84 -11
README.md CHANGED
@@ -1,14 +1,87 @@
  ---
- title: SmartManuals AI
- emoji: 💬
- colorFrom: yellow
- colorTo: purple
- sdk: gradio
- sdk_version: 5.0.1
- app_file: app.py
- pinned: false
- license: mit
- short_description: About Local semantic Q&A system for equipment manuals and pa
  ---

- An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
+ # ✅ SmartManuals-AI for Hugging Face Spaces
+
+ SmartManuals-AI is a local-first document QA system that uses RAG (retrieval-augmented generation), OCR, and embedding search to answer technical questions from PDFs **and Word documents**.
+
+ ---
+
+ ## 🔧 Features
+
+ - 🔍 **Ask natural-language questions** to your manuals
+ - 📄 Handles both **PDFs** and **Word `.docx`** files
+ - 🧠 Uses **semantic search** with `sentence-transformers`
+ - 🗃️ ChromaDB for fast local vector indexing
+ - 💬 Answers generated by **Meta LLaMA 3.1 8B Instruct** (default)
+ - 📊 Gradio dashboard for interaction
+
+ ---
+
+ ## 📁 Folder Structure
+ ```
+ SmartManuals-AI/
+ ├── app.py               # Hugging Face Spaces main app
+ ├── Manuals/             # 📂 Upload your PDF and Word manuals here
+ │   ├── OM_Treadmill.pdf
+ │   └── Parts_Bike.docx
+ ├── chroma_store/        # ⛓️ ChromaDB vector DB (auto-generated)
+ ├── requirements.txt     # 📦 Dependencies
+ └── README.md            # 📖 This file
+ ```
+
+ ---
+
+ ## 🚀 Usage in Hugging Face Spaces
+
+ ### 🔐 Environment Variables
+ Add your Hugging Face token as a secret:
+
+ - `HF_TOKEN`: Your Hugging Face access token (required for gated models)
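A Space secret such as `HF_TOKEN` is exposed to the app as an environment variable. The sketch below shows one way the token might be read and validated at startup. The helper name `load_hf_token` and the error message are illustrative assumptions, not code taken from `app.py`.

```python
import os


def load_hf_token(env=os.environ):
    """Return the Hugging Face token from the environment, or fail early.

    `load_hf_token` is a hypothetical helper; the real app may read the
    token differently.
    """
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        # Failing at startup is clearer than a later authorization error
        # when the gated model is first requested.
        raise RuntimeError(
            "HF_TOKEN is not set; add it as a secret in the Space settings."
        )
    return token
```

Passing the mapping as a parameter keeps the helper easy to test with a plain dictionary instead of the real environment.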
+
+ ### 📤 Upload Your Files
+ Put all your manuals (PDF and Word `.docx`) into the `Manuals/` folder.
+
+ ### 🧠 App Behavior
+ - On startup:
+   - Extracts text (with OCR fallback) from PDFs
+   - Extracts clean text from Word documents
+   - Chunks and embeds content into ChromaDB
+ - During inference:
+   - Retrieves semantically relevant chunks
+   - Sends them to LLaMA 3.1 Instruct for answer generation
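The startup and inference steps above can be sketched with the standard library alone. This is a simplified illustration, not the code in `app.py`: the bag-of-words `embed` function is a toy stand-in for `sentence-transformers` embeddings, the in-memory ranking stands in for a ChromaDB query, and all function names, chunk sizes, and the prompt wording are assumptions.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into windows of `chunk_size` words that overlap by `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding (assumption): lowercase token counts instead of a dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=3):
    """Return the `top_k` chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """Assemble the retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the manual excerpts below.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

In the real pipeline the resulting prompt would be sent to the language model. Overlapping windows are a common choice because a sentence that straddles a chunk boundary still appears whole in at least one chunk.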
+
+ ### ❌ No User Upload
+ This app is **designed to work without file uploads**. All processing is done on preloaded files in the `Manuals/` directory.
+
+ ---
+
+ ## 🧠 Default Model
+ - Uses **`meta-llama/Llama-3.1-8B-Instruct`**
+ - All question answering is **fully automatic**
+ - The user is **not required to pick a model, doc type, or filter**; the system decides based on the question and content.
+
+ ---
+
+ ## 🧩 Supported File Types
+ - `.pdf` (with OCR for scanned pages)
+ - `.docx` (via `python-docx`)
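Since only these two file types are handled, the app presumably filters the `Manuals/` folder by extension before extracting text. A minimal sketch of such a filter follows; `find_manuals` and `SUPPORTED_SUFFIXES` are hypothetical names, and the recursive search is an assumption.

```python
from pathlib import Path

# Extensions listed in this README; anything else is skipped.
SUPPORTED_SUFFIXES = {".pdf", ".docx"}


def find_manuals(folder):
    """Return supported manual files under `folder`, sorted by path.

    Hypothetical helper: the comparison is case-insensitive so that
    files such as `Guide.PDF` are also picked up.
    """
    root = Path(folder)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )
```

Each returned path could then be routed to the PDF or Word extractor based on its suffix.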
+
  ---
+
+ ## 🧪 Local Development
+ Install dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+ Run locally:
+ ```bash
+ python app.py
+ ```
+
+ ---
+
+ ## 👨🏽‍💻 Project by: [Damilare Eniolabi](mailto:damilareeniolabi@gmail.com)
+ GitHub: [@damoojeje](https://github.com/damoojeje)
+
  ---

+ ## 📌 Tags
+ `RAG` `LLM` `Chroma` `OCR` `PDF` `Word` `Gradio` `HuggingFace` `SmartManualsAI`