Agents-MCP-Hackathon/mcp-rag-workflow

Rajesh Betkiker committed on Jun 6

Commit

0321eee

1 Parent(s): 7422798

Added tools

Files changed (9)

.gitignore +174 -0
.python-version +1 -0
README.md +38 -6
mcp_client.py +37 -0
pyproject.toml +15 -0
requirements.txt +7 -0
tools/multi_agent_workflow_for_research.py +237 -0
tools/rag_tools.py +29 -0
uv.lock +0 -0

.gitignore ADDED

@@ -0,0 +1,174 @@

+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+# Sphinx documentation
+docs/_build/
+# PyBuilder
+.pybuilder/
+target/
+# Jupyter Notebook
+.ipynb_checkpoints
+# IPython
+profile_default/
+ipython_config.py
+# pyenv
+#   For a library or package, you might want to ignore these files since the code is
+#   intended to run in multiple environments; otherwise, check them in:
+# .python-version
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+# UV
+#   Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
+#   This is especially recommended for binary packages to ensure reproducibility, and is more
+#   commonly ignored for libraries.
+#uv.lock
+# poetry
+#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+#   This is especially recommended for binary packages to ensure reproducibility, and is more
+#   commonly ignored for libraries.
+#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+#poetry.lock
+# pdm
+#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+#pdm.lock
+#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
+#   in version control.
+#   https://pdm.fming.dev/latest/usage/project/#working-with-version-control
+.pdm.toml
+.pdm-python
+.pdm-build/
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+__pypackages__/
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+# SageMath parsed files
+*.sage.py
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# Spyder project settings
+.spyderproject
+.spyproject
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Pyre type checker
+.pyre/
+# pytype static type analyzer
+.pytype/
+# Cython debug symbols
+cython_debug/
+# PyCharm
+#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+#  be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+#  and can be added to the global gitignore or merged into this file.  For a more nuclear
+#  option (not recommended) you can uncomment the following to ignore the entire idea folder.
+#.idea/
+# Ruff stuff:
+.ruff_cache/
+# PyPI configuration file
+.pypirc

.python-version ADDED

@@ -0,0 +1 @@

+3.13

README.md CHANGED

@@ -1,14 +1,46 @@
 ---
-title: Mcp Hackathon
-emoji: 👁
 colorFrom: green
-colorTo: pink
 sdk: gradio
 sdk_version: 5.32.1
 app_file: app.py
 pinned: false
-license: apache-2.0
-short_description: Develop Model Context Protocol (MCP) server with Gradio App
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

 ---
+title: MCP - RAG and Research
+emoji: 🌍
 colorFrom: green
+colorTo: blue
 sdk: gradio
 sdk_version: 5.32.1
 app_file: app.py
 pinned: false
+license: mit
+short_description: Demonstrates implementation of the MCP server using Gradio
+tags:
+  - mcp-server-track
 ---
+# 👁 MCP-powered RAG and Research 🌍
+I present to you an MCP-powered RAG and Research app.
+The RAG Tool uses the GroundX service to query the knowledge base. The knowledge base is a document that contains information about the SU-35 aircraft, including its features, capabilities, and specifications.
+Please check [this PDF](https://airgroup2000.com/gallery/albums/userpics/32438/SU-35_TM_eng.pdf) to formulate queries on the Sukhoi SU-35.
+The Research Tool is implemented as a multi-agent workflow built with LlamaIndex (ResearchAgent, WriteAgent, and ReviewAgent).
+## Available Tools
+### search_knowledge_base_for_context
+- **Description**: Searches and retrieves relevant context from a knowledge base based on the user's query.
+- **Example Queries**:
+    - "What are the main features of the SU-35 fuel system?"
+    - "What is the combat potential of the SU-35?"
+### research_write_review_topic
+- **Description**: Helps with writing a report with research, writing, and review on any topic.
+- **Example Queries**:
+    - "Write me a report on the history of the internet."
+    - "Write me a report on the origin of the universe."
+    - "Write me a report on the impact of climate change on polar bears."
+## How to Use
+- Use the MCP RAG Tool tab above to query the knowledge base.
+- Use the Research Tool tab above to write a report on any topic.
+## Demo Link
+[Link to Demo on YouTube](https://www.youtube.com/mcp-rag-research)
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
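
For readers wiring up their own client, the two tools named in the README would be invoked through MCP's JSON-RPC `tools/call` method. The sketch below only builds the request message and does not contact any server. The `query` argument name and the request id are assumptions for illustration and are not taken from this Space's server code, so check the tool schema the server advertises.

```python
import json


def build_tool_call(tool_name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(payload)


# "query" is an assumed parameter name, not confirmed by the commit.
request = build_tool_call(
    "search_knowledge_base_for_context",
    {"query": "What is the combat potential of the SU-35?"},
)
print(request)
```

In practice a client library such as the one used in `mcp_client.py` builds these messages itself; the sketch is only meant to show what travels over the wire.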

mcp_client.py ADDED

@@ -0,0 +1,37 @@

+"""
+This is only a sample code snippet for a Gradio interface that connects to an MCP server. The server URL points to localhost, so it won't work on Spaces.
+This script initializes a Gradio interface for an agent that uses tools from the MCP server.
+It connects to the MCP server, retrieves available tools, and sets up a chat interface where users can interact with the agent.
+"""
+import os
+from dotenv import load_dotenv
+load_dotenv()  # Load environment variables from .env file
+import gradio as gr
+from smolagents import InferenceClientModel, CodeAgent, MCPClient
+mcp_client = None  # Bound up front so the finally block is safe if the connection fails
+try:
+    # Connect to the locally running Gradio MCP server over SSE
+    mcp_client = MCPClient(
+        {
+            "url": "http://localhost:7860/gradio_api/mcp/sse",
+            "transport": "sse"
+        }
+    )
+    tools = mcp_client.get_tools()
+    model = InferenceClientModel(token=os.getenv("HUGGINGFACE_API_TOKEN"))
+    agent = CodeAgent(tools=[*tools], model=model)
+    demo = gr.ChatInterface(
+        fn=lambda message, history: str(agent.run(message)),
+        type="messages",
+        title="Agent with MCP Tools",
+        description="This is a simple MCP client built with Gradio that uses MCP tools.",
+    )
+    demo.launch()
+finally:
+    # Disconnect only if the client was actually created
+    if mcp_client is not None:
+        mcp_client.disconnect()
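
The script's `try`/`finally` cleanup only works if the client name is bound before `finally` runs; if the constructor raises, an unbound name would turn a connection error into a `NameError`. A minimal sketch of a guarded version follows. `StubClient` and `run_with_client` are hypothetical stand-ins written for this example and are not part of smolagents.

```python
class StubClient:
    """Hypothetical stand-in for a client that needs an explicit disconnect."""

    def __init__(self, fail: bool = False):
        if fail:
            raise ConnectionError("server not reachable")
        self.connected = True

    def disconnect(self) -> None:
        self.connected = False


def run_with_client(fail: bool = False) -> str:
    client = None  # bound before the try so `finally` can always inspect it
    try:
        client = StubClient(fail=fail)
        return "ran"
    except ConnectionError as exc:
        return f"connection failed: {exc}"
    finally:
        # Runs on both paths; skips cleanup when construction never succeeded.
        if client is not None:
            client.disconnect()


print(run_with_client())           # ran
print(run_with_client(fail=True))  # connection failed: server not reachable
```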

pyproject.toml ADDED

@@ -0,0 +1,15 @@

+[project]
+name = "mcp-rag-workflow"
+version = "0.1.0"
+description = "Model Context Protocol (MCP) server with RAG and Research Tools"
+readme = "README.md"
+requires-python = ">=3.13"
+dependencies = [
+    "gradio[mcp]",
+    "smolagents[mcp]",
+    "llama-index",
+    "llama-index-llms-nebius",
+    "groundx",
+    "duckduckgo-search",
+    "langchain-community>=0.3.24",
+]

requirements.txt ADDED

@@ -0,0 +1,7 @@

+gradio[mcp]
+smolagents[mcp]
+llama-index
+llama-index-llms-nebius
+groundx
+duckduckgo-search
+langchain-community>=0.3.24

tools/multi_agent_workflow_for_research.py ADDED

@@ -0,0 +1,237 @@

+"""
+This file contains the multi-agent workflow for the research project.
+It uses LlamaIndex to build a modular multi-agent workflow
+with real-time tools and structured memory.
+The workflow is as follows:
+1. The ResearchAgent searches the web for information.
+2. The WriteAgent writes a report based on the research notes.
+3. The ReviewAgent reviews the report and provides feedback.
+"""
+import os
+import asyncio
+# Load environment variables from .env file
+from dotenv import load_dotenv
+load_dotenv()
+from llama_index.llms.nebius import NebiusLLM
+# llama-index workflow classes
+from llama_index.core.workflow import Context
+from llama_index.core.agent.workflow import (
+    FunctionAgent,
+    AgentWorkflow,
+    AgentOutput,
+    ToolCall,
+    ToolCallResult,
+)
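+# DuckDuckGoSearchAPIWrapper ships in langchain-community, the package this project depends on
+from langchain_community.utilities import DuckDuckGoSearchAPIWrapper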
+NEBIUS_API_KEY = os.getenv("NEBIUS_API_KEY")
+# Load an LLM
+llm = NebiusLLM(
+    api_key=NEBIUS_API_KEY,
+    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
+    is_function_calling_model=True
+)
+# Search tools using DuckDuckGo
+duckduckgo = DuckDuckGoSearchAPIWrapper()
+MAX_SEARCH_CALLS = 2 # Limit the number of searches to 2
+search_call_count = 0
+past_queries = set()
+async def safe_duckduckgo_search(query: str) -> str:
+    """
+    A DuckDuckGo-based search function that:
+      - Prevents more than MAX_SEARCH_CALLS total searches.
+      - Skips duplicate queries.
+    """
+    global search_call_count, past_queries
+    # Check for duplicate queries
+    if query in past_queries:
+        return f"Already searched for '{query}'. Avoiding duplicate search."
+    # Check if we've reached the max search calls
+    if search_call_count >= MAX_SEARCH_CALLS:
+        return "Search limit reached, no more searches allowed."
+    # Otherwise, perform the search
+    search_call_count += 1
+    past_queries.add(query)
+    # DuckDuckGoSearchAPIWrapper.run(...) is synchronous, so run it in a worker thread
+    # to avoid blocking the event loop
+    result = await asyncio.to_thread(duckduckgo.run, query)
+    return str(result)
+# Research tools
+async def save_research(ctx: Context, notes: str, notes_title: str) -> str:
+    """
+    Store research notes under a given title in the shared context.
+    """
+    current_state = await ctx.get("state")
+    if "research_notes" not in current_state:
+        current_state["research_notes"] = {}
+    current_state["research_notes"][notes_title] = notes
+    await ctx.set("state", current_state)
+    return "Notes saved."
+# Report tools
+async def write_report(ctx: Context, report_content: str) -> str:
+    """
+    Write a report in markdown, storing it in the shared context.
+    """
+    current_state = await ctx.get("state")
+    current_state["report_content"] = report_content
+    await ctx.set("state", current_state)
+    return "Report written."
+# Review tools
+async def review_report(ctx: Context, review: str) -> str:
+    """
+    Review the report and store feedback in the shared context.
+    """
+    current_state = await ctx.get("state")
+    current_state["review"] = review
+    await ctx.set("state", current_state)
+    return "Report reviewed."
+# We have three agents with distinct responsibilities:
+# - The ResearchAgent is responsible for gathering information from the web.
+# - The WriteAgent is responsible for writing the report.
+# - The ReviewAgent is responsible for reviewing the report.
+# The ResearchAgent uses the DuckDuckGoSearchAPIWrapper to search the web.
+research_agent = FunctionAgent(
+    name="ResearchAgent",
+    description=(
+        "A research agent that searches the web using DuckDuckGo. "
+        "It must not exceed 2 searches total, and must avoid repeating the same query. "
+        "Once sufficient information is collected, it should hand off to the WriteAgent."
+    ),
+    system_prompt=(
+        "You are the ResearchAgent. Your goal is to gather sufficient information on the topic. "
+        "Only perform at most 2 distinct searches. If you have enough information or have reached 2 searches, "
+        "hand off to the WriteAgent. Avoid infinite loops! If search throws an error, stop further work, skip WriteAgent and ReviewAgent, and return. "
+        "Respect invocation limits and cooldown periods."
+    ),
+    llm=llm,
+    tools=[
+        safe_duckduckgo_search,
+        save_research
+    ],
+    max_iterations=2,  # Limit to 2 iterations to prevent infinite loops
+    cooldown=5,  # Cooldown to prevent rapid re-querying
+    can_handoff_to=["WriteAgent"]
+)
+write_agent = FunctionAgent(
+    name="WriteAgent",
+    description=(
+        "Writes a markdown report based on the research notes. "
+        "Then hands off to the ReviewAgent for feedback."
+    ),
+    system_prompt=(
+        "You are the WriteAgent. Draft a structured markdown report based on the notes. "
+        "If there is no report content or research notes, stop further work and skip ReviewAgent. "
+        "Do not attempt more than one write attempt. "
+        "After writing, hand off to the ReviewAgent. "
+        "Respect invocation limits and cooldown periods."
+    ),
+    llm=llm,
+    tools=[write_report],
+    max_iterations=2,  # Limit to 2 iterations to prevent infinite loops
+    cooldown=5,  # Cooldown to prevent rapid re-querying
+    can_handoff_to=["ReviewAgent", "ResearchAgent"]
+)
+review_agent = FunctionAgent(
+    name="ReviewAgent",
+    description=(
+        "Reviews the final report for correctness. Approves or requests changes."
+    ),
+    system_prompt=(
+        "You are the ReviewAgent. If there are no research notes or report content, skip this step and return. "
+        "Do not attempt more than one review attempt. "
+        "Read the report, provide feedback, and either approve "
+        "or request revisions. If revisions are needed, hand off to WriteAgent. "
+        "Respect invocation limits and cooldown periods."
+    ),
+    llm=llm,
+    tools=[review_report],
+    max_iterations=2,  # Limit to 2 iterations to prevent infinite loops
+    cooldown=5,  # Cooldown to prevent rapid re-querying
+    can_handoff_to=["WriteAgent"]
+)
+agent_workflow = AgentWorkflow(
+    agents=[research_agent, write_agent, review_agent],
+    root_agent=research_agent.name,  # Start with the ResearchAgent
+    initial_state={
+        "research_notes": {},
+        "report_content": "Not written yet.",
+        "review": "Review required.",
+    },
+)
+async def execute_research_workflow(query: str):
+    handler = agent_workflow.run(
+        user_msg=(
+            query
+        )
+    )
+    current_agent = None
+    async for event in handler.stream_events():
+        if hasattr(event, "current_agent_name") and event.current_agent_name != current_agent:
+            current_agent = event.current_agent_name
+            print(f"\n{'='*50}")
+            print(f"🤖 Agent: {current_agent}")
+            print(f"{'='*50}\n")
+        # Print outputs or tool calls
+        if isinstance(event, AgentOutput):
+            if event.response.content:
+                print("📤 Output:", event.response.content)
+            if event.tool_calls:
+                print("🛠️  Planning to use tools:", [call.tool_name for call in event.tool_calls])
+        elif isinstance(event, ToolCall):
+            print(f"🔨 Calling Tool: {event.tool_name}")
+            print(f"  With arguments: {event.tool_kwargs}")
+        elif isinstance(event, ToolCallResult):
+            print(f"🔧 Tool Result ({event.tool_name}):")
+            print(f"  Arguments: {event.tool_kwargs}")
+            print(f"  Output: {event.tool_output}")
+    return handler
+async def final_report(handler) -> str:
+    """Retrieve the final report from the context."""
+    final_state = await handler.ctx.get("state")
+    print("\n\n=============================")
+    print("FINAL REPORT:\n")
+    print(final_state["report_content"])
+    print("=============================\n")
+    return final_state["report_content"]
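+def run_research_workflow(query: str):
+    # Run both steps in one event loop so the handler's context is read
+    # in the same loop that created it
+    async def _run() -> str:
+        handler = await execute_research_workflow(query)
+        return await final_report(handler)
+    return asyncio.run(_run())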
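
One thing to watch in the search helper: the call counter and the set of past queries live at module level, so a second report request in the same process starts with the budget already spent. A possible alternative, sketched here under the assumption that a fresh object is created per workflow run, keeps that state in a small class. `SearchBudget` and `fake_search` are hypothetical names for this example, and the lower-casing of queries is an addition that the original code does not do.

```python
from typing import Callable


class SearchBudget:
    """Caps the number of searches and skips repeated queries for one run."""

    def __init__(self, search_fn: Callable[[str], str], max_calls: int = 2):
        self._search_fn = search_fn
        self._max_calls = max_calls
        self._calls = 0
        self._seen: set[str] = set()

    def search(self, query: str) -> str:
        # Normalize so trivially different spellings count as duplicates.
        key = query.strip().lower()
        if key in self._seen:
            return f"Already searched for '{query}'. Avoiding duplicate search."
        if self._calls >= self._max_calls:
            return "Search limit reached, no more searches allowed."
        self._calls += 1
        self._seen.add(key)
        return self._search_fn(query)


# Hypothetical stand-in for a real search backend.
def fake_search(query: str) -> str:
    return f"results for {query}"


budget = SearchBudget(fake_search, max_calls=2)
print(budget.search("su-35 range"))    # results for su-35 range
print(budget.search("SU-35 range"))    # duplicate, skipped
print(budget.search("su-35 engines"))  # results for su-35 engines
print(budget.search("su-35 radar"))    # limit reached
```

Creating one `SearchBudget` per request and passing its `search` method to the agent as a tool would give each report its own two-search allowance.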

tools/rag_tools.py ADDED

@@ -0,0 +1,29 @@

+"""
+This file contains the tools for the RAG workflow.
+"""
+import os
+from groundx import GroundX
+from dotenv import load_dotenv
+load_dotenv()
+client = GroundX(api_key=os.getenv("GROUNDX_API_KEY") or '')
+def search_groundx_for_rag_context(query: str) -> str:
+    """
+    Searches and retrieves relevant context from a knowledge base,
+    based on the user's query.
+    Args:
+        query: The search query supplied by the user.
+    Returns:
+        str: Relevant text content that can be used by the LLM to answer the query.
+    """
+    response = client.search.content(
+        id=os.getenv("GROUNDX_BUCKET_ID"),
+        query=query,
+        n=10,
+    )
+    return response.search.text or "No relevant context found"
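
The module falls back to an empty API key and passes the bucket id through unchecked, so a missing `.env` entry only surfaces later as a failed request. A minimal sketch of failing early instead is shown below. `require_env` is a hypothetical helper written for this example, not part of the GroundX SDK, and the `environ` parameter exists only to make it easy to test.

```python
import os
from typing import Mapping


def require_env(name: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return a non-empty environment value or raise a descriptive error."""
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Usage with an explicit mapping, so the example runs without a real .env file.
settings = {"GROUNDX_API_KEY": "example-key", "GROUNDX_BUCKET_ID": ""}
print(require_env("GROUNDX_API_KEY", settings))
try:
    require_env("GROUNDX_BUCKET_ID", settings)
except RuntimeError as exc:
    print(exc)
```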

uv.lock ADDED

The diff for this file is too large to render.