ashutoshchoudhari committed on
Commit 0b756d7 · 1 Parent(s): fbf720f

Enhance README.md with detailed project overview, features, technical components, and setup instructions for Langchain Search Assistant

Files changed (1)
  1. README.md +129 -1
README.md CHANGED
@@ -11,4 +11,132 @@ license: mit
short_description: This app allows you to chat with an LLM that can search web.
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Langchain Search Assistant

## Overview
This application is a research assistant built with Langchain that can search multiple knowledge sources, including Wikipedia, arXiv, and the web via DuckDuckGo. It uses Groq's LLM API to provide context-aware, streaming responses to user queries.

## Features
- **Multi-source search**: Access information from Wikipedia, arXiv scientific papers, and web results
- **Conversational memory**: Retains context from previous interactions
- **Streaming responses**: See the AI's response generated in real time
- **User-friendly interface**: Clean Streamlit UI for easy interaction

## Technical Components
- **LLM**: Groq's Llama3-8b-8192 model (with fallback support for Ollama models)
- **Embeddings**: Hugging Face's all-MiniLM-L6-v2 (see the sketch after this list)
- **Search Tools**:
  - Wikipedia API
  - arXiv API
  - DuckDuckGo Search
- **Framework**: Langchain for agent orchestration
- **Frontend**: Streamlit

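The embedding setup is not excerpted in this README. A minimal sketch of how the all-MiniLM-L6-v2 model can be loaded through `langchain-huggingface` (variable names here are illustrative, not necessarily those used in app.py) looks like:

```python
# Hedged sketch: loading the all-MiniLM-L6-v2 embeddings via langchain-huggingface.
# The exact variable name and model id string in app.py may differ.
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"  # Hugging Face hub id of the model
)
```
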
## Project Structure
- **app.py**: Main application file containing the Streamlit UI and Langchain integration
- **requirements.txt**: Dependencies required to run the application
- **README.md**: Project metadata and description for Hugging Face Spaces
- **tools_agents.ipynb**: Jupyter notebook demonstrating how to use Langchain tools and agents
- **.github/workflows/main.yaml**: GitHub Actions workflow for deploying to Hugging Face Spaces
- **.gitattributes**: Git LFS configuration for handling large files
- **.gitignore**: Standard Python gitignore file
- **LICENSE**: MIT License file
- **app_documentation.md**: Comprehensive documentation for the application

## Implementation Details

### LLM Integration
The application uses Groq's API to access the Llama3-8b-8192 model with streaming enabled:

```python
from langchain_groq import ChatGroq

llm = ChatGroq(
    groq_api_key=st.session_state.api_key,  # Groq key held in Streamlit session state
    model_name="Llama3-8b-8192",
    streaming=True
)
```

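The snippet above reads the key from `st.session_state.api_key`. The widget that populates it lives in app.py and is not shown here; one plausible, purely illustrative way to collect it in the Streamlit sidebar is:

```python
# Hypothetical sketch only: one way to populate st.session_state.api_key.
# app.py may use different widget labels or session keys.
import streamlit as st

api_key = st.sidebar.text_input("Groq API key", type="password")
if api_key:
    st.session_state.api_key = api_key
```
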
Local Ollama models can also be configured as an alternative:
```python
from langchain_ollama import ChatOllama

# OLLAMA_WSL_IP is the base URL of a locally running Ollama server
# llm = ChatOllama(base_url=OLLAMA_WSL_IP, model="llama3.1", streaming=True)
```

### Search Tools Configuration
The app configures three primary search tools:

1. **Wikipedia Search**:
```python
from langchain.agents import Tool
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

api_wrapper_wiki = WikipediaAPIWrapper(top_k_results=3, doc_content_chars_max=10000)
wiki = WikipediaQueryRun(api_wrapper=api_wrapper_wiki)
wiki_tool = Tool(
    name="Wikipedia",
    func=wiki.run,
    description="This tool uses the Wikipedia API to search for a topic."
)
```

2. **arXiv Search**:
```python
from langchain_community.tools import ArxivQueryRun
from langchain_community.utilities import ArxivAPIWrapper

api_wrapper_arxiv = ArxivAPIWrapper(top_k_results=5, doc_content_chars_max=10000)
arxiv = ArxivQueryRun(api_wrapper=api_wrapper_arxiv)
arxiv_tool = Tool(
    name="arxiv",
    func=arxiv.run,
    description="Searches arXiv for papers matching the query.",
)
```

3. **DuckDuckGo Web Search**:
```python
from langchain_community.tools import DuckDuckGoSearchResults
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper

api_wrapper_ddg = DuckDuckGoSearchAPIWrapper(region="us-en", time="y", max_results=10)
ddg = DuckDuckGoSearchResults(
    api_wrapper=api_wrapper_ddg,
    output_format="string",
    handle_tool_error=True,
    handle_validation_error=True
)
ddg_tool = Tool(
    name="DuckDuckGo_Search",
    func=ddg.run,
    description="Searches the web for the query using the DuckDuckGo search engine."
)
```

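The agent configuration below expects these tools as a single list. The exact assembly in app.py is not excerpted here, but it presumably amounts to collecting the three `Tool` objects defined above:

```python
# Sketch of the tools list passed to the agent below (names as defined above).
tools = [wiki_tool, arxiv_tool, ddg_tool]
```
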
### Agent Configuration
The system uses the `CHAT_CONVERSATIONAL_REACT_DESCRIPTION` agent type with a conversational memory buffer:

```python
from langchain.agents import AgentType, initialize_agent
from langchain.memory import ConversationBufferWindowMemory

# Keep the last 5 exchanges as conversational context
memory = ConversationBufferWindowMemory(k=5, memory_key="chat_history", return_messages=True)

search_agent = initialize_agent(
    tools=tools,
    llm=llm,
    agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
    max_iterations=10,
    memory=memory,
    handle_parsing_errors=True
)
```

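The snippet above builds the agent but does not show how it is called from the chat UI. A common pattern with Streamlit and Langchain, offered here as an illustrative sketch rather than a verbatim excerpt of app.py, is to stream the agent's intermediate steps and tokens through a `StreamlitCallbackHandler`:

```python
# Hedged sketch: invoking the agent from a Streamlit chat loop.
# Widget layout and variable names are illustrative, not app.py's actual code.
import streamlit as st
from langchain_community.callbacks.streamlit import StreamlitCallbackHandler

if prompt := st.chat_input("Ask me anything..."):
    st.chat_message("user").write(prompt)
    with st.chat_message("assistant"):
        st_cb = StreamlitCallbackHandler(st.container())  # renders agent steps live
        response = search_agent.run(prompt, callbacks=[st_cb])
        st.write(response)
```

The `run` call returns the agent's final answer as a string once its tool calls complete.
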
## Setup Requirements
1. Groq API key
2. Hugging Face token (for embeddings)
3. Python environment with required dependencies

## Installation Instructions
Install the required packages using:

```bash
pip install -r requirements.txt
```

Required packages include:
- arxiv
- wikipedia
- langchain, langchain-community, langchain-huggingface, langchain-groq
- openai
- duckduckgo-search
- ollama, langchain-ollama (for local model support)

## Environment Variables
Create a `.env` file with the following variables: