---
title: Web Scraper & Q&A Chatbot with RAG
emoji: 🏃
colorFrom: blue
colorTo: yellow
sdk: streamlit
sdk_version: 1.43.1
app_file: app.py
pinned: false
short_description: 使用RAG的AI爬蟲對話機器人
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

---

## 繁體中文說明

這是一個結合「網頁爬蟲」與「RAG(檢索增強生成)」的 AI 對話機器人專案。

- 你可以輸入任意網址,系統會自動爬取該網頁(可設定多層遞迴與同網域限制),將內容分段並向量化存入本地資料庫。
- 之後可直接用中文或英文提問,系統會根據爬取內容檢索最相關段落,並用 Gemini LLM 生成回覆。
- 支援中文語意檢索,適合知識管理、網站摘要、FAQ 應用。

### 安裝與執行

1. 安裝依賴:`pip install -r requirements.txt`
2. 複製 `example.env` 為 `.env` 並填入你的 Gemini API 金鑰
3. 執行:`streamlit run app.py`

---

## English Description

This project is an AI chatbot that combines a web scraper with RAG (retrieval-augmented generation).

- Enter any URL and the system crawls that page (with configurable recursion depth and an optional same-domain restriction), splits the content into chunks, vectorizes them, and stores them in a local database.
- You can then ask questions in Chinese or English. The system retrieves the passages most relevant to the question from the crawled content and generates an answer with the Gemini LLM.
- Supports Chinese semantic search, which makes it suitable for knowledge management, website summarization, and FAQ applications.
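
The crawling behaviour described above (recursion depth plus a same-domain restriction) can be illustrated with a small breadth-first sketch. This is not the project's actual crawler: the `crawl` function, the `LinkExtractor` class, and the `fetch` callable are hypothetical names chosen for illustration, and only the Python standard library is used.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_depth=1, same_domain=True):
    """Breadth-first crawl; returns {url: html} for every page visited.

    `fetch` is a hypothetical callable that maps a URL to its HTML text,
    for example a thin wrapper around an HTTP client.
    """
    start_url = urldefrag(start_url).url
    start_host = urlparse(start_url).netloc
    pages = {}
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        pages[url] = html
        if depth >= max_depth:
            continue  # depth limit reached: do not follow links from this page
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link = urldefrag(urljoin(url, href)).url
            parsed = urlparse(link)
            if parsed.scheme not in ("http", "https"):
                continue  # skip mailto:, javascript:, and similar links
            if same_domain and parsed.netloc != start_host:
                continue  # same-domain restriction
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages
```

With `max_depth=0` only the start page is fetched; each additional level follows the links found on the pages of the previous level. Passing `fetch` in as a parameter keeps the traversal logic independent of whichever HTTP client the app uses.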
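
The split, vectorize, and retrieve steps described above can be sketched as follows. This is a toy under stated assumptions: the character-bigram `embed` function stands in for a real embedding model, a plain Python list stands in for the local vector database, and all function names are hypothetical. The final call to Gemini is left out; the sketch stops at building the prompt.

```python
import math
from collections import Counter


def split_into_chunks(text, size=200, overlap=50):
    """Split text into fixed-size character windows that overlap."""
    if not text:
        return []
    step = size - overlap
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + size] for i in range(0, last_start, step)]


def embed(text):
    """Toy embedding: character-bigram counts (stand-in for a real model).

    Bigrams need no word segmentation, so the same code handles
    Chinese and English text.
    """
    text = text.lower()
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2):
    """Return the top_k chunks most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the LLM."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

In a real deployment the chunk vectors would be computed once at crawl time and stored, so that only the question needs to be embedded at query time.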

### Installation & Usage

1. Install dependencies: `pip install -r requirements.txt`
2. Copy `example.env` to `.env` and fill in your Gemini API key
3. Run: `streamlit run app.py`
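
As a rough illustration of step 2, the sketch below shows one way an app could read the key from `.env` at startup. It is an assumption, not the project's code: the variable name `GEMINI_API_KEY` and both helper functions are hypothetical (check `example.env` for the real variable name), and the actual app may use a library such as `python-dotenv` instead of this hand-written parser.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(name="GEMINI_API_KEY", path=".env"):
    """Prefer a real environment variable, then fall back to the .env file.

    GEMINI_API_KEY is an assumed name used only for illustration.
    """
    key = os.environ.get(name) or read_env_file(path).get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; add it to {path}")
    return key
```

Failing early with a clear error when the key is missing is usually easier to debug than an authentication failure on the first request.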