ashutoshchoudhari committed on
Commit 0b756d7 · 1 Parent(s): fbf720f

Enhance README.md with detailed project overview, features, technical components, and setup instructions for Langchain Search Assistant

Files changed (1)
  1. README.md +129 -1
README.md CHANGED
@@ -11,4 +11,132 @@ license: mit
short_description: This app allows you to chat with an LLM that can search web.
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Langchain Search Assistant

## Overview
This application is a research assistant built with Langchain that can search multiple knowledge sources, including Wikipedia, arXiv, and the web via DuckDuckGo. It uses Groq's LLM API to provide context-aware, streaming responses to user queries.

## Features
- **Multi-source search**: Access information from Wikipedia, arXiv scientific papers, and web results
- **Conversational memory**: Retains context from previous interactions
- **Streaming responses**: See the AI's response generated in real time
- **User-friendly interface**: Clean Streamlit UI for easy interaction

## Technical Components
- **LLM**: Groq's Llama3-8b-8192 model (with fallback support for Ollama models)
- **Embeddings**: Hugging Face's all-MiniLM-L6-v2 (see the sketch after this list)
- **Search Tools**:
  - Wikipedia API
  - arXiv API
  - DuckDuckGo Search
- **Framework**: Langchain for agent orchestration
- **Frontend**: Streamlit

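The embedding setup is not excerpted in this README. A minimal sketch of how the all-MiniLM-L6-v2 model can be loaded through `langchain-huggingface` (variable names here are illustrative, not necessarily those used in app.py) looks like:

```python
# Hedged sketch: loading the all-MiniLM-L6-v2 embeddings via langchain-huggingface.
# The exact variable name and model id string in app.py may differ.
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"  # Hugging Face hub id of the model
)
```
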
## Project Structure
- **app.py**: Main application file containing the Streamlit UI and Langchain integration
- **requirements.txt**: Dependencies required to run the application
- **README.md**: Project metadata and description for Hugging Face Spaces
- **tools_agents.ipynb**: Jupyter notebook demonstrating how to use Langchain tools and agents
- **.github/workflows/main.yaml**: GitHub Actions workflow for deploying to Hugging Face Spaces
- **.gitattributes**: Git LFS configuration for handling large files
- **.gitignore**: Standard Python gitignore file
- **LICENSE**: MIT License file
- **app_documentation.md**: Comprehensive documentation for the application

## Implementation Details

### LLM Integration
The application uses Groq's API to access the Llama3-8b-8192 model with streaming enabled:

```python
from langchain_groq import ChatGroq

llm = ChatGroq(
    groq_api_key=st.session_state.api_key,  # Groq key held in Streamlit session state
    model_name="Llama3-8b-8192",
    streaming=True
)
```

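The snippet above reads the key from `st.session_state.api_key`. The widget that populates it lives in app.py and is not shown here; one plausible, purely illustrative way to collect it in the Streamlit sidebar is:

```python
# Hypothetical sketch only: one way to populate st.session_state.api_key.
# app.py may use different widget labels or session keys.
import streamlit as st

api_key = st.sidebar.text_input("Groq API key", type="password")
if api_key:
    st.session_state.api_key = api_key
```
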
Local Ollama models can also be configured as an alternative:
```python
from langchain_ollama import ChatOllama

# OLLAMA_WSL_IP is the base URL of a locally running Ollama server
# llm = ChatOllama(base_url=OLLAMA_WSL_IP, model="llama3.1", streaming=True)
```

### Search Tools Configuration
The app configures three primary search tools:

1. **Wikipedia Search**:
```python
from langchain.agents import Tool
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

api_wrapper_wiki = WikipediaAPIWrapper(top_k_results=3, doc_content_chars_max=10000)
wiki = WikipediaQueryRun(api_wrapper=api_wrapper_wiki)
wiki_tool = Tool(
    name="Wikipedia",
    func=wiki.run,
    description="This tool uses the Wikipedia API to search for a topic."
)
```

2. **arXiv Search**:
```python
from langchain_community.tools import ArxivQueryRun
from langchain_community.utilities import ArxivAPIWrapper

api_wrapper_arxiv = ArxivAPIWrapper(top_k_results=5, doc_content_chars_max=10000)
arxiv = ArxivQueryRun(api_wrapper=api_wrapper_arxiv)
arxiv_tool = Tool(
    name="arxiv",
    func=arxiv.run,
    description="Searches arXiv for papers matching the query.",
)
```

3. **DuckDuckGo Web Search**:
```python
from langchain_community.tools import DuckDuckGoSearchResults
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper

api_wrapper_ddg = DuckDuckGoSearchAPIWrapper(region="us-en", time="y", max_results=10)
ddg = DuckDuckGoSearchResults(
    api_wrapper=api_wrapper_ddg,
    output_format="string",
    handle_tool_error=True,
    handle_validation_error=True
)
ddg_tool = Tool(
    name="DuckDuckGo_Search",
    func=ddg.run,
    description="Searches the web for the query using the DuckDuckGo search engine."
)
```

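The agent configuration below expects these tools as a single list. The exact assembly in app.py is not excerpted here, but it presumably amounts to collecting the three `Tool` objects defined above:

```python
# Sketch of the tools list passed to the agent below (names as defined above).
tools = [wiki_tool, arxiv_tool, ddg_tool]
```
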
### Agent Configuration
The system uses the `CHAT_CONVERSATIONAL_REACT_DESCRIPTION` agent type with a conversational memory buffer:

```python
from langchain.agents import AgentType, initialize_agent
from langchain.memory import ConversationBufferWindowMemory

# Keep the last 5 exchanges as conversational context
memory = ConversationBufferWindowMemory(k=5, memory_key="chat_history", return_messages=True)

search_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    max_iterations=10,
    memory=memory,
    handle_parsing_errors=True
)
```

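The snippet above builds the agent but does not show how it is called from the chat UI. A common pattern with Streamlit and Langchain, offered here as an illustrative sketch rather than a verbatim excerpt of app.py, is to stream the agent's intermediate steps and tokens through a `StreamlitCallbackHandler`:

```python
# Hedged sketch: invoking the agent from a Streamlit chat loop.
# Widget layout and variable names are illustrative, not app.py's actual code.
import streamlit as st
from langchain_community.callbacks.streamlit import StreamlitCallbackHandler

if prompt := st.chat_input("Ask me anything..."):
    st.chat_message("user").write(prompt)
    with st.chat_message("assistant"):
        st_cb = StreamlitCallbackHandler(st.container())  # renders agent steps live
        response = search_agent.run(prompt, callbacks=[st_cb])
        st.write(response)
```

The `run` call returns the agent's final answer as a string once its tool calls complete.
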
## Setup Requirements
1. Groq API key
2. Hugging Face token (for embeddings)
3. Python environment with required dependencies

## Installation Instructions
Install the required packages using:

```bash
pip install -r requirements.txt
```

Required packages include:
- arxiv
- wikipedia
- langchain, langchain-community, langchain-huggingface, langchain-groq
- openai
- duckduckgo-search
- ollama, langchain-ollama (for local model support)

## Environment Variables
Create a `.env` file with the following variables: