
# Langchain Search Assistant

## Overview

This application is a powerful research assistant built with Langchain that can search across multiple knowledge sources including Wikipedia, arXiv, and the web via DuckDuckGo. It leverages Groq's LLM capabilities to provide intelligent, context-aware responses to user queries.

## Features

- **Multi-source search**: Access information from Wikipedia, arXiv scientific papers, and web results
- **Conversational memory**: Retains context from previous interactions
- **Streaming responses**: See the AI's response generated in real time
- **User-friendly interface**: Clean Streamlit UI for easy interaction

## Technical Components

- **LLM**: Groq's Llama3-8b-8192 model (with fallback support for Ollama models)
- **Embeddings**: Hugging Face's all-MiniLM-L6-v2
- **Search Tools**:
  - Wikipedia API
  - arXiv API
  - DuckDuckGo Search
- **Framework**: Langchain for agent orchestration
- **Frontend**: Streamlit

## Project Structure

- `app.py`: Main application file containing the Streamlit UI and Langchain integration
- `requirements.txt`: Dependencies required to run the application
- `README.md`: Project metadata and description for Hugging Face Spaces
- `tools_agents.ipynb`: Jupyter notebook demonstrating how to use Langchain tools and agents
- `.github/workflows/main.yaml`: GitHub Actions workflow for deploying to Hugging Face Spaces
- `.gitattributes`: Git LFS configuration for handling large files
- `.gitignore`: Standard Python gitignore file
- `LICENSE`: MIT License file
- `app_documentation.md`: This documentation file

## Implementation Details

### LLM Integration

The application uses Groq's API to access the Llama3-8b-8192 model with streaming capability:

```python
llm = ChatGroq(
    groq_api_key=st.session_state.api_key,
    model_name="Llama3-8b-8192",
    streaming=True,
)
```

Alternative local models can also be configured with Ollama:

```python
# llm = ChatOllama(base_url=OLLAMA_WSL_IP, model="llama3.1", streaming=True)
```

### Search Tools Configuration

The app configures three primary search tools:

1. **Wikipedia Search**:

   ```python
   api_wrapper_wiki = WikipediaAPIWrapper(top_k_results=3, doc_content_chars_max=10000)
   wiki = WikipediaQueryRun(api_wrapper=api_wrapper_wiki)
   wiki_tool = Tool(
       name="Wikipedia",
       func=wiki.run,
       description="This tool uses the Wikipedia API to search for a topic.",
   )
   ```

2. **arXiv Search**:

   ```python
   api_wrapper_arxiv = ArxivAPIWrapper(top_k_results=5, doc_content_chars_max=10000)
   arxiv = ArxivQueryRun(api_wrapper=api_wrapper_arxiv)
   arxiv_tool = Tool(
       name="arxiv",
       func=arxiv.run,
       description="Searches arXiv for papers matching the query.",
   )
   ```

3. **DuckDuckGo Web Search**:

   ```python
   api_wrapper_ddg = DuckDuckGoSearchAPIWrapper(region="us-en", time="y", max_results=10)
   ddg = DuckDuckGoSearchResults(
       api_wrapper=api_wrapper_ddg,
       output_format="string",
       handle_tool_error=True,
       handle_validation_error=True,
   )
   ddg_tool = Tool(
       name="DuckDuckGo_Search",
       func=ddg.run,
       description="Searches the web using the DuckDuckGo search engine.",
   )
   ```
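The three tool objects are then gathered into the list handed to the agent, presumably `tools = [wiki_tool, arxiv_tool, ddg_tool]` in `app.py`. Conceptually, each `Tool` just binds a name, a callable, and a description; a minimal stdlib sketch of that pattern (`MiniTool` and `fake_search` are illustrative stand-ins, not app code):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class MiniTool:
    # Mirrors the three fields each Tool(...) call above supplies.
    name: str
    func: Callable[[str], str]
    description: str

def fake_search(query: str) -> str:
    # Stand-in for wiki.run / arxiv.run / ddg.run.
    return f"results for {query!r}"

tools = [MiniTool(name="Demo", func=fake_search, description="Stand-in search tool.")]
print(tools[0].func("langchain"))
```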

### Agent Configuration

The system uses the `CHAT_CONVERSATIONAL_REACT_DESCRIPTION` agent type with a conversational memory buffer:

```python
memory = ConversationBufferWindowMemory(k=5, memory_key="chat_history", return_messages=True)

search_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    max_iterations=10,
    memory=memory,
    handle_parsing_errors=True,
)
```
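The `k=5` window means only the five most recent exchanges are replayed to the model; older turns fall out of context. A stdlib sketch of that windowing behaviour (this is not langchain code):

```python
from collections import deque

# A bounded deque drops the oldest entry once the window is full,
# which is the windowing behaviour ConversationBufferWindowMemory(k=5) provides.
chat_history = deque(maxlen=5)

for turn in range(1, 8):  # seven exchanges
    chat_history.append(f"exchange {turn}")

# Only the last five exchanges remain in the window.
print(list(chat_history))
```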

## Setup Requirements

  1. Groq API key
  2. Hugging Face token (for embeddings)
  3. Python environment with required dependencies

## Installation Instructions

Install the required packages using:

```bash
pip install -r requirements.txt
```

Required packages include:

- `arxiv`
- `wikipedia`
- `langchain`, `langchain-community`, `langchain-huggingface`, `langchain-groq`
- `openai`
- `duckduckgo-search`
- `ollama`, `langchain-ollama` (for local model support)

## Environment Variables

Create a `.env` file with the following variables:
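For example (the variable names below are assumptions based on the setup requirements above — a Groq API key and a Hugging Face token; check `app.py` for the exact names it reads):

```env
# Groq API key used by ChatGroq
GROQ_API_KEY=your_groq_api_key_here

# Hugging Face token used for the embedding model
HF_TOKEN=your_huggingface_token_here
```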