metadata
title: Web Scraper & Q&A Chatbot with RAG
emoji: 🏃
colorFrom: blue
colorTo: yellow
sdk: streamlit
sdk_version: 1.43.1
app_file: app.py
pinned: false
short_description: AI web-scraping chatbot using RAG
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
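Traditional Chinese Description (translated)
This project is an AI chatbot that combines a web scraper with RAG (Retrieval-Augmented Generation).
- Enter any URL and the system crawls that page automatically (multi-level recursion and a same-domain restriction are configurable), splits the content into chunks, vectorizes them, and stores them in a local database.
- You can then ask questions in Chinese or English. The system retrieves the passages most relevant to the crawled content and generates a reply with the Gemini LLM.
- Supports Chinese semantic search, which suits knowledge management, website summarization, and FAQ applications.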
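The crawl step described above (recursion depth plus a same-domain restriction) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: `crawl_order` and the `get_links` callable are hypothetical names, and `get_links` stands in for whatever fetches a page and extracts its links.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse


def crawl_order(start_url, get_links, max_depth=1, same_domain=True):
    """Return URLs in breadth-first visiting order, up to max_depth levels.

    get_links(url) is a hypothetical callable that returns the href strings
    found on a page; a real crawler would fetch and parse the HTML there.
    """
    start_host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the configured depth
        for href in get_links(url):
            # Resolve relative links and drop "#fragment" parts.
            link, _fragment = urldefrag(urljoin(url, href))
            parsed = urlparse(link)
            if parsed.scheme not in ("http", "https"):
                continue  # skip mailto:, javascript:, and similar links
            if same_domain and parsed.netloc != start_host:
                continue  # stay on the starting host
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

Passing the link extractor in as a function keeps the traversal logic testable without network access.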
Installation & Usage
- Install dependencies:
pip install -r requirements.txt
- Copy example.env to .env and fill in your Gemini API key.
- Run:
streamlit run app.py
English Description
This project is a Web Scraper & RAG-based AI Chatbot.
- Enter any website URL, and the system will crawl the page (with configurable recursion depth and same-domain restriction), split and vectorize the content, and store it in a local database.
- You can then ask questions in Chinese or English. The system retrieves the most relevant passages from the crawled content and generates answers using the Gemini LLM.
- Optimized for Chinese semantic search, suitable for knowledge management, website summarization, and FAQ scenarios.
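The split, vectorize, and retrieve steps listed above can be illustrated with a small standard-library sketch. It is only a toy under stated assumptions: character-bigram counts stand in for a real embedding model, a plain list stands in for the local vector database, and all function names (`split_text`, `embed`, `retrieve`, `build_prompt`) are hypothetical rather than taken from app.py. Character bigrams are used because they work for Chinese text without word segmentation.

```python
import math
from collections import Counter


def split_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks that overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text, n=2):
    """Toy 'embedding': counts of character n-grams (assumption, not a real model)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=3):
    """Return the top_k chunks most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """Assemble retrieved context and the question into one prompt string."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In a real deployment the string returned by `build_prompt` would be sent to the LLM; that call is omitted here because the source does not show how the app invokes Gemini.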
Installation & Usage
- Install dependencies:
pip install -r requirements.txt
- Copy example.env to .env and fill in your Gemini API key.
- Run:
streamlit run app.py
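As a rough illustration of how an app might pick up the key from .env, here is a standard-library sketch. The variable name `GEMINI_API_KEY` is an assumption (check example.env for the name the project actually uses), and `load_env_file` and `get_api_key` are hypothetical helpers, not functions from app.py.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(name="GEMINI_API_KEY", path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get(name) or load_env_file(path).get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; copy example.env to .env and fill it in")
    return key
```

Failing early with a clear message avoids a confusing error later, when the first question is sent to the model.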