---
title: Web Scraper & Q&A Chatbot with RAG
emoji: 🏃
colorFrom: blue
colorTo: yellow
sdk: streamlit
sdk_version: 1.43.1
app_file: app.py
pinned: false
short_description: 使用RAG的AI爬蟲對話機器人
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
---
## 繁體中文說明
這是一個結合「網頁爬蟲」與「RAG(檢索增強生成)」的 AI 對話機器人專案。
- 你可以輸入任意網址,系統會自動爬取該網頁(可設定多層遞迴與同網域限制),將內容分段並向量化存入本地資料庫。
- 之後可直接用中文或英文提問,系統會根據爬取內容檢索最相關段落,並用 Gemini LLM 生成回覆。
- 支援中文語意檢索,適合知識管理、網站摘要、FAQ 應用。
### 安裝與執行
1. 安裝依賴:`pip install -r requirements.txt`
2. 複製 `example.env` 為 `.env` 並填入你的 Gemini API 金鑰
3. 執行:`streamlit run app.py`
---
## English Description
This project combines a web scraper with a RAG (retrieval-augmented generation) AI chatbot.
- Enter any website URL, and the system will crawl the page (with configurable recursion depth and same-domain restriction), split and vectorize the content, and store it in a local database.
- You can then ask questions in Chinese or English. The system retrieves the passages most relevant to the question from the crawled content and generates an answer with the Gemini LLM.
- Supports Chinese semantic search, which makes it suitable for knowledge management, website summarization, and FAQ scenarios.
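
The crawl-and-split step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the names `crawl`, `chunk_text`, `max_depth`, `same_domain`, and the injected `fetch` callable are all assumptions made for the example, and the chunk sizes are arbitrary. It uses only the Python standard library and leaves out vectorization and storage.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkAndTextParser(HTMLParser):
    """Collects <a href> targets and text nodes from one HTML page.

    Simplification: text inside <script>/<style> is not filtered out.
    """

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


def crawl(start_url, fetch, max_depth=1, same_domain=True):
    """Breadth-first crawl from start_url; returns {url: page_text}.

    `fetch` is a hypothetical callable mapping a URL to an HTML string
    (or None on failure), so the sketch stays independent of any HTTP client.
    """
    root_host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    pages = {}
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        pages[url] = " ".join(parser.text_parts)
        if depth >= max_depth:
            continue  # recursion limit reached: do not follow links further
        for href in parser.links:
            link, _fragment = urldefrag(urljoin(url, href))
            parts = urlparse(link)
            if parts.scheme not in ("http", "https"):
                continue
            if same_domain and parts.netloc != root_host:
                continue  # same-domain restriction
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages


def chunk_text(text, size=200, overlap=50):
    """Split text into overlapping character windows before embedding."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]
```

In this sketch `max_depth=0` fetches only the start page, and each additional level follows links one hop further. Passing `fetch` in as a parameter keeps the traversal logic testable without network access; a real fetcher would wrap whatever HTTP client the project uses.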
### Installation & Usage
1. Install dependencies: `pip install -r requirements.txt`
2. Copy `example.env` to `.env` and fill in your Gemini API key
3. Run: `streamlit run app.py`
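
If the app fails at startup, a missing or empty API key in `.env` is a likely cause. The sketch below shows one way to read a simple `KEY=VALUE` file and fail early with a clear message. The variable name `GEMINI_API_KEY` and the helper names are assumptions for illustration; check `example.env` for the name the project actually expects.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_key(values, name="GEMINI_API_KEY"):
    """Return the key from the parsed file or the process environment.

    `name` is a hypothetical variable name, not confirmed by the project.
    """
    key = values.get(name) or os.environ.get(name, "")
    if not key:
        raise RuntimeError(f"{name} is not set; copy example.env to .env and fill it in")
    return key
```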