Commit 0b756d7 · Parent(s): fbf720f
Enhance README.md with detailed project overview, features, technical components, and setup instructions for Langchain Search Assistant
README.md CHANGED
@@ -11,4 +11,132 @@ license: mit
short_description: This app allows you to chat with an LLM that can search web.
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Langchain Search Assistant

## Overview
This application is a powerful research assistant built with Langchain that can search across multiple knowledge sources, including Wikipedia, arXiv, and the web via DuckDuckGo. It leverages Groq's LLM capabilities to provide intelligent, context-aware responses to user queries.

## Features
- **Multi-source search**: Access information from Wikipedia, arXiv scientific papers, and web results
- **Conversational memory**: Retains context from previous interactions
- **Streaming responses**: See the AI's response generated in real-time
- **User-friendly interface**: Clean Streamlit UI for easy interaction

## Technical Components
- **LLM**: Groq's Llama3-8b-8192 model (with fallback support for Ollama models)
- **Embeddings**: Hugging Face's all-MiniLM-L6-v2 (see the sketch after this list)
- **Search Tools**:
  - Wikipedia API
  - arXiv API
  - DuckDuckGo Search
- **Framework**: Langchain for agent orchestration
- **Frontend**: Streamlit
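The components list names the embedding model but does not show how it is loaded. A minimal sketch of how all-MiniLM-L6-v2 might be initialized via the langchain-huggingface package from the requirements (the exact wiring in app.py may differ):

```python
from langchain_huggingface import HuggingFaceEmbeddings

# Sentence-transformers model named in the components list above
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
```
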
## Project Structure
- **app.py**: Main application file containing the Streamlit UI and Langchain integration
- **requirements.txt**: Dependencies required to run the application
- **README.md**: Project metadata and description for Hugging Face Spaces
- **tools_agents.ipynb**: Jupyter notebook demonstrating how to use Langchain tools and agents
- **.github/workflows/main.yaml**: GitHub Actions workflow for deploying to Hugging Face Spaces
- **.gitattributes**: Git LFS configuration for handling large files
- **.gitignore**: Standard Python gitignore file
- **LICENSE**: MIT License file
- **app_documentation.md**: Comprehensive documentation for the application

## Implementation Details

### LLM Integration
The application uses Groq's API to access the Llama3-8b-8192 model with streaming capability:

```python
from langchain_groq import ChatGroq  # provided by the langchain-groq package

# The API key is read from Streamlit session state
llm = ChatGroq(
    groq_api_key=st.session_state.api_key,
    model_name="Llama3-8b-8192",
    streaming=True
)
```

Alternative local models can also be configured with Ollama:

```python
# llm = ChatOllama(base_url=OLLAMA_WSL_IP, model="llama3.1", streaming=True)
```
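The commented-out line assumes a locally running Ollama server reachable at `OLLAMA_WSL_IP`. A minimal sketch of that setup, using the langchain-ollama package listed in the requirements (the URL is a placeholder):

```python
from langchain_ollama import ChatOllama

# Placeholder address; Ollama listens on port 11434 by default
OLLAMA_WSL_IP = "http://localhost:11434"
llm = ChatOllama(base_url=OLLAMA_WSL_IP, model="llama3.1")
```
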

### Search Tools Configuration
The app configures three primary search tools:
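The tool snippets below omit their imports; they would typically come from langchain and langchain-community (a sketch, and exact module paths can vary by version):

```python
from langchain.agents import Tool
from langchain_community.tools import ArxivQueryRun, DuckDuckGoSearchResults, WikipediaQueryRun
from langchain_community.utilities import ArxivAPIWrapper, DuckDuckGoSearchAPIWrapper, WikipediaAPIWrapper
```
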
1. **Wikipedia Search**:
```python
api_wrapper_wiki = WikipediaAPIWrapper(top_k_results=3, doc_content_chars_max=10000)
wiki = WikipediaQueryRun(api_wrapper=api_wrapper_wiki)
wiki_tool = Tool(
    name="Wikipedia",
    func=wiki.run,
    description="This tool uses the Wikipedia API to search for a topic."
)
```

2. **arXiv Search**:
```python
api_wrapper_arxiv = ArxivAPIWrapper(top_k_results=5, doc_content_chars_max=10000)
arxiv = ArxivQueryRun(api_wrapper=api_wrapper_arxiv)
arxiv_tool = Tool(
    name="arxiv",
    func=arxiv.run,
    description="Searches arXiv for papers matching the query."
)
```

3. **DuckDuckGo Web Search**:
```python
api_wrapper_ddg = DuckDuckGoSearchAPIWrapper(region="us-en", time="y", max_results=10)
ddg = DuckDuckGoSearchResults(
    api_wrapper=api_wrapper_ddg,
    output_format="string",
    handle_tool_error=True,
    handle_validation_error=True
)
ddg_tool = Tool(
    name="DuckDuckGo_Search",
    func=ddg.run,
    description="Searches for search queries using the DuckDuckGo Search engine."
)
```
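The agent setup below passes a `tools` list; presumably this gathers the three tools defined above (a sketch):

```python
# Collect the tools defined above for the agent
tools = [wiki_tool, arxiv_tool, ddg_tool]
```
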
### Agent Configuration
The system uses the CHAT_CONVERSATIONAL_REACT_DESCRIPTION agent type with a conversational memory buffer:

```python
from langchain.agents import AgentType, initialize_agent
from langchain.memory import ConversationBufferWindowMemory

# Keep the last 5 exchanges as conversational context
memory = ConversationBufferWindowMemory(k=5, memory_key="chat_history", return_messages=True)

search_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    max_iterations=10,
    memory=memory,
    handle_parsing_errors=True
)
```

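The README does not reproduce the UI wiring around the agent. A minimal sketch of how the agent might be driven from a Streamlit chat loop, using LangChain's Streamlit callback to stream intermediate steps (widget labels and layout are assumptions):

```python
import streamlit as st
from langchain_community.callbacks.streamlit import StreamlitCallbackHandler

# Wait for a new question from the user
if prompt := st.chat_input("Ask me anything"):
    st.chat_message("user").write(prompt)
    with st.chat_message("assistant"):
        st_cb = StreamlitCallbackHandler(st.container())  # streams the agent's tool calls and reasoning
        response = search_agent.run(prompt, callbacks=[st_cb])
        st.write(response)
```
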
## Setup Requirements
1. Groq API key
2. Hugging Face token (for embeddings)
3. Python environment with required dependencies

## Installation Instructions
Install the required packages using:

```bash
pip install -r requirements.txt
```

Required packages include:
- arxiv
- wikipedia
- langchain, langchain-community, langchain-huggingface, langchain-groq
- openai
- duckduckgo-search
- ollama, langchain-ollama (for local model support)

## Environment Variables
Create a `.env` file with the following variables:
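Based on the Setup Requirements above, the `.env` file presumably holds the Groq API key and a Hugging Face token. A sketch of loading such a file (the variable names are assumptions, and python-dotenv is assumed to be installed):

```python
import os

from dotenv import load_dotenv  # assumes python-dotenv is available

load_dotenv()  # reads key=value pairs from .env into the process environment
groq_api_key = os.getenv("GROQ_API_KEY")  # hypothetical variable name
hf_token = os.getenv("HF_TOKEN")          # hypothetical variable name
```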