# LangChain Search Assistant

## Overview
This application is a research assistant built with LangChain. It searches Wikipedia, arXiv, and the web (via DuckDuckGo), then uses an LLM served by Groq to answer user queries from the retrieved context.

## Features
- **Multi-source search**: Access information from Wikipedia, arXiv scientific papers, and web results
- **Conversational memory**: Retains context from previous interactions
- **Streaming responses**: See the AI's response as it is generated, in real time
- **User-friendly interface**: Clean Streamlit UI for easy interaction

## Technical Components
- **LLM**: Llama3-8b-8192 served through Groq (a local Ollama model can be swapped in as an alternative)
- **Embeddings**: Hugging Face's all-MiniLM-L6-v2
- **Search Tools**:
  - Wikipedia API
  - arXiv API
  - DuckDuckGo Search
- **Framework**: LangChain for agent orchestration
- **Frontend**: Streamlit

## Project Structure
- **app.py**: Main application file containing the Streamlit UI and LangChain integration
- **requirements.txt**: Dependencies required to run the application
- **README.md**: Project metadata and description for Hugging Face Spaces
- **tools_agents.ipynb**: Jupyter notebook demonstrating how to use LangChain tools and agents
- **.github/workflows/main.yaml**: GitHub Actions workflow for deploying to Hugging Face Spaces
- **.gitattributes**: Git LFS configuration for handling large files
- **.gitignore**: Standard Python gitignore file
- **LICENSE**: MIT License file
- **app_documentation.md**: This documentation file

## Implementation Details

### LLM Integration
The application calls the Llama3-8b-8192 model through Groq's API with streaming enabled:

```python
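import streamlit as st
from langchain_groq import ChatGroq

# The Groq API key is read from the Streamlit session state.
llm = ChatGroq(
    groq_api_key=st.session_state.api_key,
    model_name="Llama3-8b-8192",
    streaming=True,  # emit tokens as they are generated
)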
```

Alternative local models can also be configured with Ollama:
```python
#llm = ChatOllama(base_url=OLLAMA_WSL_IP, model="llama3.1", streaming=True)
```
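
To make the effect of `streaming = True` concrete, here is a minimal sketch of how a streamed response is consumed token by token. It does not call Groq or LangChain: `TokenCollector` and `fake_stream` are hypothetical stand-ins, and only the method name `on_llm_new_token` is borrowed from LangChain's callback hook.

```python
from typing import Iterable, List


class TokenCollector:
    """Hypothetical stand-in for a streaming callback handler."""

    def __init__(self) -> None:
        self.tokens: List[str] = []

    def on_llm_new_token(self, token: str) -> None:
        # Called once per generated token; a UI would redraw its output here.
        self.tokens.append(token)

    @property
    def text(self) -> str:
        # The full response is the concatenation of all tokens seen so far.
        return "".join(self.tokens)


def fake_stream(answer: str) -> Iterable[str]:
    """Hypothetical token source that yields the answer word by word."""
    words = answer.split(" ")
    for i, word in enumerate(words):
        # Re-attach the space removed by split(), except after the last word.
        yield word if i == len(words) - 1 else word + " "


collector = TokenCollector()
for token in fake_stream("Streaming shows partial output early"):
    collector.on_llm_new_token(token)

print(collector.text)
```

In the real app the token source is the model and the handler updates the Streamlit page, but the accumulate-and-redraw pattern is the same.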

### Search Tools Configuration
The app configures three primary search tools:

1. **Wikipedia Search**:
```python
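from langchain.agents import Tool
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

# Fetch up to 3 articles and cap the returned text at 10,000 characters.
api_wrapper_wiki = WikipediaAPIWrapper(top_k_results=3, doc_content_chars_max=10000)
wiki = WikipediaQueryRun(api_wrapper=api_wrapper_wiki)
wiki_tool = Tool(
    name="Wikipedia",
    func=wiki.run,
    description="This tool uses the Wikipedia API to search for a topic.",
)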
```

2. **arXiv Search**:
```python
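from langchain.agents import Tool
from langchain_community.tools import ArxivQueryRun
from langchain_community.utilities import ArxivAPIWrapper

# Fetch up to 5 papers and cap the returned text at 10,000 characters.
api_wrapper_arxiv = ArxivAPIWrapper(top_k_results=5, doc_content_chars_max=10000)
arxiv = ArxivQueryRun(api_wrapper=api_wrapper_arxiv)
arxiv_tool = Tool(
    name="arxiv",
    func=arxiv.run,
    description="Searches arXiv for papers matching the query.",
)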
```

3. **DuckDuckGo Web Search**:
```python
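from langchain.agents import Tool
from langchain_community.tools import DuckDuckGoSearchResults
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper

# Limit results to the us-en region and the past year, at most 10 hits.
api_wrapper_ddg = DuckDuckGoSearchAPIWrapper(region="us-en", time="y", max_results=10)
ddg = DuckDuckGoSearchResults(
    api_wrapper=api_wrapper_ddg,
    output_format="string",
    handle_tool_error=True,
    handle_validation_error=True,
)
ddg_tool = Tool(
    name="DuckDuckGo_Search",
    func=ddg.run,
    description="Searches the web for the query using the DuckDuckGo search engine.",
)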
```
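
Each tool above pairs a name with a callable and a description, and the agent picks a tool by name before passing it a query string. The sketch below illustrates that lookup with plain Python. It is not LangChain's implementation: `SimpleTool`, `stub_search`, and `call_tool` are hypothetical names, and the search functions return canned strings instead of calling any API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SimpleTool:
    """Hypothetical stand-in for a tool: a name, a callable, a description."""
    name: str
    func: Callable[[str], str]
    description: str


def stub_search(source: str) -> Callable[[str], str]:
    """Build a fake search function that returns a canned string."""
    def run(query: str) -> str:
        return f"[{source}] results for: {query}"
    return run


tools: List[SimpleTool] = [
    SimpleTool("Wikipedia", stub_search("wiki"), "Looks up a topic on Wikipedia."),
    SimpleTool("arxiv", stub_search("arxiv"), "Searches arXiv for papers."),
    SimpleTool("DuckDuckGo_Search", stub_search("ddg"), "Searches the web."),
]

# Index the tools by name so a tool chosen by the model can be looked up.
registry: Dict[str, SimpleTool] = {tool.name: tool for tool in tools}


def call_tool(name: str, query: str) -> str:
    """Run the named tool, or report that no such tool is registered."""
    tool = registry.get(name)
    if tool is None:
        return f"Unknown tool: {name}"
    return tool.func(query)


print(call_tool("arxiv", "attention mechanisms"))
```

Because the model chooses a tool from its name and description, distinct names and specific descriptions matter more than the wrapper code itself.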

### Agent Configuration
The agent uses the `CHAT_CONVERSATIONAL_REACT_DESCRIPTION` agent type together with a windowed conversation memory that keeps the last five exchanges (`k=5`):

```python
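from langchain.agents import AgentType, initialize_agent
from langchain.memory import ConversationBufferWindowMemory

# Keep only the 5 most recent exchanges as chat history.
memory = ConversationBufferWindowMemory(k=5, memory_key="chat_history", return_messages=True)

# `tools` is assumed to be the list of search tools defined above.
search_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    max_iterations=10,  # stop after 10 reasoning/tool steps
    memory=memory,
    handle_parsing_errors=True,
)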
```
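
The windowing behaviour of the memory can be illustrated without LangChain. The following is a minimal sketch, assuming a window counts human/AI exchange pairs: `WindowMemory` is a hypothetical class built on `collections.deque`, not the library's own code.

```python
from collections import deque
from typing import Deque, List, Tuple


class WindowMemory:
    """Hypothetical buffer that keeps only the last k human/AI exchanges."""

    def __init__(self, k: int) -> None:
        # A deque with maxlen drops the oldest exchange automatically.
        self.exchanges: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save(self, user: str, ai: str) -> None:
        self.exchanges.append((user, ai))

    def history(self) -> List[str]:
        # Flatten the retained exchanges into alternating messages.
        messages: List[str] = []
        for user, ai in self.exchanges:
            messages.append(f"Human: {user}")
            messages.append(f"AI: {ai}")
        return messages


memory_sketch = WindowMemory(k=5)
for i in range(1, 8):
    memory_sketch.save(f"question {i}", f"answer {i}")

# Exchanges 1 and 2 have been dropped; the window starts at exchange 3.
print(memory_sketch.history()[0])
```

With `k=5`, the prompt carries at most ten messages of history, which bounds the context size at the cost of forgetting older turns.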

## Setup Requirements
1. Groq API key
2. Hugging Face token (for embeddings)
3. Python environment with required dependencies

## Installation Instructions
Install the required packages using:

```bash
pip install -r requirements.txt
```

Required packages include:
- arxiv
- wikipedia
- langchain, langchain-community, langchain-huggingface, langchain-groq
- openai
- duckduckgo-search
- ollama, langchain-ollama (for local model support)
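
After installing, a quick check can confirm that the packages are present in the active environment. This is a minimal sketch using the standard library's `importlib.metadata`. The helper name `missing_packages` is hypothetical, and the `required` list simply copies the distribution names listed above (the optional Ollama packages are left out).

```python
from importlib import metadata
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the distribution names that are not installed."""
    missing: List[str] = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


required = [
    "arxiv", "wikipedia", "langchain", "langchain-community",
    "langchain-huggingface", "langchain-groq", "openai", "duckduckgo-search",
]

# An empty list means every listed package was found.
print(missing_packages(required))
```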

## Environment Variables
Create a `.env` file with the following variables: