
# Langchain Search Assistant

## Overview

This application is a powerful research assistant built with Langchain that can search across multiple knowledge sources including Wikipedia, arXiv, and the web via DuckDuckGo. It leverages Groq's LLM capabilities to provide intelligent, context-aware responses to user queries.

## Features

- **Multi-source search**: Access information from Wikipedia, arXiv scientific papers, and web results
- **Conversational memory**: Retains context from previous interactions
- **Streaming responses**: See the AI's response generated in real time
- **User-friendly interface**: Clean Streamlit UI for easy interaction

## Technical Components

- **LLM**: Groq's Llama3-8b-8192 model (with fallback support for Ollama models)
- **Embeddings**: Hugging Face's all-MiniLM-L6-v2
- **Search Tools**:
  - Wikipedia API
  - arXiv API
  - DuckDuckGo Search
- **Framework**: Langchain for agent orchestration
- **Frontend**: Streamlit

## Project Structure

- `app.py`: Main application file containing the Streamlit UI and Langchain integration
- `requirements.txt`: Dependencies required to run the application
- `README.md`: Project metadata and description for Hugging Face Spaces
- `tools_agents.ipynb`: Jupyter notebook demonstrating how to use Langchain tools and agents
- `.github/workflows/main.yaml`: GitHub Actions workflow for deploying to Hugging Face Spaces
- `.gitattributes`: Git LFS configuration for handling large files
- `.gitignore`: Standard Python gitignore file
- `LICENSE`: MIT License file
- `app_documentation.md`: This documentation file

## Implementation Details

### LLM Integration

The application uses Groq's API to access the Llama3-8b-8192 model with streaming capability:

```python
llm = ChatGroq(
    groq_api_key=st.session_state.api_key,
    model_name="Llama3-8b-8192",
    streaming=True,
)
```

Alternative local models can also be configured with Ollama:

```python
# llm = ChatOllama(base_url=OLLAMA_WSL_IP, model="llama3.1", streaming=True)
```

### Search Tools Configuration

The app configures three primary search tools:

1. **Wikipedia Search**:

   ```python
   api_wrapper_wiki = WikipediaAPIWrapper(top_k_results=3, doc_content_chars_max=10000)
   wiki = WikipediaQueryRun(api_wrapper=api_wrapper_wiki)
   wiki_tool = Tool(
       name="Wikipedia",
       func=wiki.run,
       description="This tool uses the Wikipedia API to search for a topic.",
   )
   ```

2. **arXiv Search**:

   ```python
   api_wrapper_arxiv = ArxivAPIWrapper(top_k_results=5, doc_content_chars_max=10000)
   arxiv = ArxivQueryRun(api_wrapper=api_wrapper_arxiv)
   arxiv_tool = Tool(
       name="arxiv",
       func=arxiv.run,
       description="Searches arXiv for papers matching the query.",
   )
   ```

3. **DuckDuckGo Web Search**:

   ```python
   api_wrapper_ddg = DuckDuckGoSearchAPIWrapper(region="us-en", time="y", max_results=10)
   ddg = DuckDuckGoSearchResults(
       api_wrapper=api_wrapper_ddg,
       output_format="string",
       handle_tool_error=True,
       handle_validation_error=True,
   )
   ddg_tool = Tool(
       name="DuckDuckGo_Search",
       func=ddg.run,
       description="Searches the web using the DuckDuckGo search engine.",
   )
   ```
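The three tool objects are then gathered into the list handed to the agent, presumably `tools = [wiki_tool, arxiv_tool, ddg_tool]` in `app.py`. Conceptually, each `Tool` just binds a name, a callable, and a description; a minimal stdlib sketch of that pattern (`MiniTool` and `fake_search` are illustrative stand-ins, not app code):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class MiniTool:
    # Mirrors the three fields each Tool(...) call above supplies.
    name: str
    func: Callable[[str], str]
    description: str

def fake_search(query: str) -> str:
    # Stand-in for wiki.run / arxiv.run / ddg.run.
    return f"results for {query!r}"

tools = [MiniTool(name="Demo", func=fake_search, description="Stand-in search tool.")]
print(tools[0].func("langchain"))
```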

### Agent Configuration

The system uses the `CHAT_CONVERSATIONAL_REACT_DESCRIPTION` agent type with a conversational memory buffer:

```python
memory = ConversationBufferWindowMemory(k=5, memory_key="chat_history", return_messages=True)

search_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    max_iterations=10,
    memory=memory,
    handle_parsing_errors=True,
)
```
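The `k=5` window means only the five most recent exchanges are replayed to the model; older turns fall out of context. A stdlib sketch of that windowing behaviour (this is not langchain code):

```python
from collections import deque

# A bounded deque drops the oldest entry once the window is full,
# which is the windowing behaviour ConversationBufferWindowMemory(k=5) provides.
chat_history = deque(maxlen=5)

for turn in range(1, 8):  # seven exchanges
    chat_history.append(f"exchange {turn}")

# Only the last five exchanges remain in the window.
print(list(chat_history))
```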

## Setup Requirements

  1. Groq API key
  2. Hugging Face token (for embeddings)
  3. Python environment with required dependencies

## Installation Instructions

Install the required packages using:

```bash
pip install -r requirements.txt
```

Required packages include:

- `arxiv`
- `wikipedia`
- `langchain`, `langchain-community`, `langchain-huggingface`, `langchain-groq`
- `openai`
- `duckduckgo-search`
- `ollama`, `langchain-ollama` (for local model support)

## Environment Variables

Create a `.env` file with the following variables:
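For example (the variable names below are assumptions based on the setup requirements above — a Groq API key and a Hugging Face token; check `app.py` for the exact names it reads):

```env
# Groq API key used by ChatGroq
GROQ_API_KEY=your_groq_api_key_here

# Hugging Face token used for the embedding model
HF_TOKEN=your_huggingface_token_here
```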