Space: onisj/jarvis_gaia_agent

onisj committed on May 29

Commit 4701375 (1 parent: 488dc3e)

feat(advance): Deploy corrected app.py and tools for advance functions

Files changed (20)

README.md +56 -16
__init__.py +0 -0
app.py +281 -281
dockerfile +3 -6
graph.py +0 -143
project_struct.txt +21 -0
requirements.txt +18 -89
retriever.py +34 -0
state_log.json +0 -0
tools/__init__.py +5 -1
tools/calculator.py +1 -1
tools/document_retriever.py +1 -1
tools/duckduckgo_search.py +6 -0
tools/file_parser.py +4 -1
tools/guest_info.py +20 -0
tools/hub_stats.py +18 -0
tools/image_parser.py +1 -2
tools/retriever.py +0 -80
tools/search.py +21 -66
tools/weather_info.py +28 -0

README.md CHANGED

@@ -1,39 +1,79 @@
 ---
-title: Jarvis Gaia Agent
 emoji: 🐢
 colorFrom: indigo
 colorTo: green
 sdk: docker
 pinned: false
 license: mit
-short_description: The JARVIS (Just A Rather Very Intelligent System) project
 ---
-# Jarvis Gaia Agent
-A Python-based AI agent leveraging `langchain`, `duckduckgo-search`, and `pytesseract` to perform web searches, document parsing, and multi-hop query refinement. Deployed as a Hugging Face Space for interactive use.
 ## Features
-- **Web Search**: Performs asynchronous searches using DuckDuckGo.
-- **Multi-Hop Search**: Refines complex queries iteratively with OpenAI's GPT-4o.
-- **Document Parsing**: Extracts text from PDFs and images using `PyPDF2` and `pytesseract`.
-- **Modular Tools**: Includes calculator, file parser, and document retriever.
-- **Observability**: Integrated with Langfuse for monitoring.
 ## Prerequisites
 - Python 3.11
 - Tesseract OCR (`brew install tesseract` on macOS)
-- API keys for:
-  - OpenAI (`OPENAI_API_KEY`)
-  - Hugging Face (`HUGGINGFACEHUB_API_TOKEN`)
-  - Groq (`GROQ_API_KEY`)
-  - Langfuse (`LANGFUSE_PUBLIC_KEY`, `LANGFUSE_SECRET_KEY`, `LANGFUSE_HOST`)
 ## Setup
 1. **Clone the Repository**:
    ```bash
-   git clone https://github.com/your-username/jarvis_gaia_agent.git
-   cd jarvis_gaia_agent

 ---
+title: JARVIS Gaia Agent
 emoji: 🐢
 colorFrom: indigo
 colorTo: green
 sdk: docker
 pinned: false
 license: mit
+short_description: Enhanced JARVIS AI agent for GAIA benchmark
 ---
+# Evolved JARVIS Gaia Agent
+An advanced Python-based AI agent combining `langchain`, `smolagents`, SERPAPI, and OCR for web searches, file parsing, and data retrieval. Deployed as a Hugging Face Space for GAIA benchmark evaluation.
+#### Directory Structure
+```
+jarvis_gaia_agent/
+├── app.py                  # Main application with Gradio interface and agent logic
+├── state.py                # Defines JARVISState for state management
+├── retriever.py            # Guest info retriever tool
+├── tools/                  # Directory for all tools
+│   ├── __init__.py         # Exports all tools
+│   ├── search.py           # Web search tools (SERPAPI-based)
+│   ├── file_parser.py      # File parsing tool (CSV, TXT, PDF, Excel)
+│   ├── image_parser.py     # Image parsing tool (OCR)
+│   ├── calculator.py       # Calculator tool
+│   ├── document_retriever.py # Document retrieval tool
+│   ├── duckduckgo_search.py # DuckDuckGo search tool (from smolagents)
+│   ├── weather_info.py     # Weather info tool (OpenWeatherMap)
+│   ├── hub_stats.py        # Hugging Face Hub stats tool
+│   ├── guest_info.py       # Guest info retriever tool (moved from retriever.py)
+├── requirements.txt        # Python dependencies
+├── Dockerfile              # Docker configuration
+├── README.md               # Project documentation
+├── .env                    # Environment variables (not committed)
+```
 ## Features
+- **Web Search**: SERPAPI and DuckDuckGo for robust searches.
+- **File Parsing**: Handles CSV, TXT, PDF, and Excel files.
+- **Image Parsing**: OCR with `easyocr` for image-based questions.
+- **Data Retrieval**: Guest info retriever for structured data.
+- **External APIs**: Weather (OpenWeatherMap), Hugging Face Hub stats.
+- **State Management**: `langgraph` for multi-step reasoning.
+- **Exact-Match Answers**: Optimized for GAIA Level 1 questions.
 ## Prerequisites
 - Python 3.11
 - Tesseract OCR (`brew install tesseract` on macOS)
+- API keys in `.env`:
+  - `HUGGINGFACEHUB_API_TOKEN`
+  - `SERPAPI_API_KEY`
+  - `OPENWEATHERMAP_API_KEY`
+  - `SPACE_ID`
 ## Setup
 1. **Clone the Repository**:
    ```bash
+   git clone https://huggingface.co/spaces/onisj/jarvis_gaia_agent
+   cd jarvis_gaia_agent
+   ```
+2. **Set Up Environment Variables**:
+   Create a `.env` file with your API keys.
+3. **Run Locally**:
+   ```bash
+   pip install -r requirements.txt
+   python app.py
+   ```
+4. **Deploy to Hugging Face Space**:
+   - Push code to your Space.
+   - Set environment variables in Space settings.
+   - Run evaluation via Gradio interface.
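
Note on setup step 2: the README lists four keys expected in `.env`, while the new `app.py` only checks `SPACE_ID` and `HUGGINGFACEHUB_API_TOKEN` at startup. A minimal sketch of a pre-flight check covering all four names is given below. The helper `missing_env_vars` and the constant `REQUIRED_KEYS` are hypothetical names introduced for illustration and are not part of this commit.

```python
import os

# Key names copied from the README's "API keys in `.env`" list.
REQUIRED_KEYS = (
    "HUGGINGFACEHUB_API_TOKEN",
    "SERPAPI_API_KEY",
    "OPENWEATHERMAP_API_KEY",
    "SPACE_ID",
)


def missing_env_vars(env=None, required=REQUIRED_KEYS):
    """Return the required names that are unset or empty, in listed order.

    `env` defaults to the process environment; passing a plain dict
    makes the check easy to test without touching os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling `missing_env_vars()` after `load_dotenv()` and raising when the returned list is non-empty would surface a missing SERPAPI or OpenWeatherMap key at startup rather than on the first tool call.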

__init__.py ADDED

File without changes

app.py CHANGED

@@ -1,25 +1,28 @@
 import os
-import gradio as gr
-import requests
-import aiohttp
-import asyncio
 import json
 import nest_asyncio
-from langgraph.graph import StateGraph, END
-from langgraph.checkpoint.memory import MemorySaver
-from langchain_huggingface import HuggingFacePipeline
-from transformers import pipeline
-from langchain_core.messages import SystemMessage, HumanMessage
-from tools import search_tool, multi_hop_search_tool, file_parser_tool, image_parser_tool, calculator_tool, document_retriever_tool
-from tools.search import initialize_search_tools
-from state import JARVISState
 import pandas as pd
 from dotenv import load_dotenv
-import logging
-from langfuse.callback import CallbackHandler
-# Set up logging
-logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(levelname)s - %(message)s")
 logger = logging.getLogger(__name__)
 # Apply nest_asyncio
@@ -27,252 +30,253 @@ nest_asyncio.apply()
 # Load environment variables
 load_dotenv()
 # Verify environment variables
-required_env_vars = ["SPACE_ID", "LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY"]
-for var in required_env_vars:
-    if not os.getenv(var):
-        raise ValueError(f"Environment variable {var} is not set")
-logger.info(f"Environment variables loaded: SPACE_ID={os.getenv('SPACE_ID')[:10]}..., LANGFUSE_HOST={os.getenv('LANGFUSE_HOST', 'https://cloud.langfuse.com')}")
-# Initialize Hugging Face model
 try:
-    hf_pipeline = pipeline(
-        "text-generation",
-        model="mistralai/Mixtral-7B-Instruct-v0.1",
-        device_map="auto",
-        max_new_tokens=512,
-        do_sample=True,
-        temperature=0.7
     )
-    llm = HuggingFacePipeline(pipeline=hf_pipeline)
-    logger.info("HuggingFace model initialized: mistralai/Mixtral-7B-Instruct-v0.1")
 except Exception as e:
-    logger.error(f"Failed to initialize HuggingFace model: {e}")
     llm = None
-# Initialize search tools with LLM
 try:
-    initialize_search_tools(llm)
-    logger.info("Search tools initialized")
 except Exception as e:
-    logger.error(f"Failed to initialize search tools: {e}")
-# Initialize Langfuse
-try:
-    langfuse = CallbackHandler(
-        public_key=os.getenv("LANGFUSE_PUBLIC_KEY"),
-        secret_key=os.getenv("LANGFUSE_SECRET_KEY"),
-        host=os.getenv("LANGFUSE_HOST", "https://cloud.langfuse.com")
-    )
-    logger.info("Langfuse initialized successfully")
-except Exception as e:
-    logger.warning(f"Failed to initialize Langfuse: {e}")
-    langfuse = None
-# Initialize MemorySaver
-memory = MemorySaver()
-use_checkpointing = True
-# --- Constants ---
-DEFAULT_API_URL = "https://agents-course-unit4-scoring.hf.space/api"
-GAIA_FILE_URL = "https://api.gaia-benchmark.com/files/"
 # --- Helper Functions ---
-def log_state(task_id: str, state: JARVISState):
-    """Log intermediate state to state_log.json"""
-    try:
-        log_entry = {
-            "task_id": task_id,
-            "question": state["question"],
-            "tools_needed": state["tools_needed"],
-            "web_results": state["web_results"],
-            "file_results": state["file_results"],
-            "image_results": state["image_results"],
-            "calculation_results": state["calculation_results"],
-            "document_results": state["document_results"],
-            "answer": state["answer"]
-        }
-        with open("state_log.json", "a") as f:
-            json.dump(log_entry, f, indent=2)
-            f.write("\n")
-    except Exception as e:
-        logger.error(f"Error logging state for task {task_id}: {e}")
-async def test_gaia_api(task_id: str) -> bool:
-    """Test connectivity to GAIA file API"""
     try:
-        async with aiohttp.ClientSession() as session:
-            async with session.head(f"{GAIA_FILE_URL}{task_id}", timeout=5) as resp:
-                return resp.status in [200, 403, 404]
     except Exception as e:
-        logger.warning(f"GAIA API test failed: {e}")
-        return False
 # --- Node Functions ---
-async def parse_question(state: JARVISState) -> JARVISState:
     try:
         question = state["question"]
-        prompt = f"""Analyze this GAIA question: {question}
-        Determine which tools are needed (web_search, multi_hop_search, file_parser, image_parser, calculator, document_retriever).
-        Return a JSON list of tool names."""
         if llm:
-            response = await llm.ainvoke(prompt, config={"callbacks": [langfuse] if langfuse else []})
             try:
-                tools_needed = json.loads(response.content)
-            except json.JSONDecodeError as je:
-                logger.warning(f"Invalid JSON in LLM response for task {state['task_id']}: {je}")
-                tools_needed = ["web_search"]
         else:
-            logger.warning("No LLM available, using default tools")
-            tools_needed = ["web_search"]
-        state["tools_needed"] = tools_needed
-        log_state(state["task_id"], state)
         return state
     except Exception as e:
-        logger.error(f"Error parsing question for task {state['task_id']}: {e}")
-        state["tools_needed"] = []
-        log_state(state["task_id"], state)
         return state
 async def tool_dispatcher(state: JARVISState) -> JARVISState:
     try:
-        tools_needed = state["tools_needed"]
         updated_state = state.copy()
-        can_download_files = await test_gaia_api(updated_state["task_id"])
-        for tool in tools_needed:
             try:
-                if tool == "web_search" or tool == "multi_hop_search":
-                    result = await web_search_agent(updated_state)
-                    updated_state["web_results"].extend(result["web_results"])
-                elif tool == "file_parser" and can_download_files:
-                    result = await file_parser_agent(updated_state)
-                    updated_state["file_results"] = result["file_results"]
-                elif tool == "image_parser" and can_download_files:
-                    result = await image_parser_agent(updated_state)
-                    updated_state["image_results"] = result["image_results"]
-                elif tool == "calculator":
-                    result = await calculator_agent(updated_state)
-                    updated_state["calculation_results"] = result["calculation_results"]
-                elif tool == "document_retriever" and can_download_files:
-                    result = await document_retriever_agent(updated_state)
-                    updated_state["document_results"] = result["document_results"]
             except Exception as e:
-                logger.warning(f"Error in tool {tool} for task {updated_state['task_id']}: {e}")
-        log_state(updated_state["task_id"], updated_state)
         return updated_state
     except Exception as e:
-        logger.error(f"Error in tool dispatcher for task {state['task_id']}: {e}")
-        log_state(state["task_id"], state)
         return state
-async def web_search_agent(state: JARVISState) -> JARVISState:
-    try:
-        results = []
-        if "web_search" in state["tools_needed"]:
-            result = await search_tool.invoke({"query": state["question"]})
-            results.append(result)
-        if "multi_hop_search" in state["tools_needed"]:
-            result = await multi_hop_search_tool.invoke({"query": state["question"], "steps": 3})
-            results.append(result)
-        return {"web_results": results}
-    except Exception as e:
-        logger.error(f"Error in web search for task {state['task_id']}: {e}")
-        return {"web_results": []}
-async def file_parser_agent(state: JARVISState) -> JARVISState:
-    try:
-        if "file_parser" in state["tools_needed"]:
-            file_type = "csv" if "data" in state["question"].lower() else "txt"
-            result = await file_parser_tool.aparse(state["task_id"], file_type=file_type)
-            return {"file_results": result}
-        return {"file_results": ""}
-    except Exception as e:
-        logger.error(f"Error in file parser for task {state['task_id']}: {e}")
-        return {"file_results": "File parsing failed"}
-async def image_parser_agent(state: JARVISState) -> JARVISState:
-    try:
-        if "image_parser" in state["tools_needed"]:
-            task = "match" if "fruits" in state["question"].lower() else "describe"
-            match_query = "fruits" if task == "match" else ""
-            file_path = f"temp_{state['task_id']}.jpg"
-            if not os.path.exists(file_path):
-                logger.warning(f"Image file not found for task {state['task_id']}")
-                return {"image_results": "Image file not found"}
-            result = await image_parser_tool.aparse(
-                file_path, task=task, match_query=match_query
-            )
-            return {"image_results": result}
-        return {"image_results": ""}
-    except Exception as e:
-        logger.error(f"Error in image parser for task {state['task_id']}: {e}")
-        return {"image_results": "Image parsing failed"}
-async def calculator_agent(state: JARVISState) -> JARVISState:
-    try:
-        if "calculator" in state["tools_needed"]:
-            prompt = f"Extract a mathematical expression from: {state['question']}\n{state['file_results']}"
-            if llm:
-                response = await llm.ainvoke(prompt, config={"callbacks": [langfuse] if langfuse else []})
-                expression = response.content
-            else:
-                expression = "0"
-            result = await calculator_tool.aparse(expression)
-            return {"calculation_results": result}
-        return {"calculation_results": ""}
-    except Exception as e:
-        logger.error(f"Error in calculator for task {state['task_id']}: {e}")
-        return {"calculation_results": "Calculation failed"}
-async def document_retriever_agent(state: JARVISState) -> JARVISState:
-    try:
-        if "document_retriever" in state["tools_needed"]:
-            file_type = "txt" if "menu" in state["question"].lower() else "csv"
-            if "report" in state["question"].lower() or "document" in state["question"].lower():
-                file_type = "pdf"
-            result = await document_retriever_tool.aparse(
-                state["task_id"], state["question"], file_type=file_type
-            )
-            return {"document_results": result}
-        return {"document_results": ""}
-    except Exception as e:
-        logger.error(f"Error in document retriever for task {state['task_id']}: {e}")
-        return {"document_results": "Document retrieval failed"}
-async def reasoning_agent(state: JARVISState) -> JARVISState:
     try:
-        prompt = f"""Question: {state['question']}
-        Web Results: {state['web_results']}
-        File Results: {state['file_results']}
-        Image Results: {state['image_results']}
-        Calculation Results: {state['calculation_results']}
-        Document Results: {state['document_results']}
-        Synthesize an exact-match answer for the GAIA benchmark.
-        Output only the answer (e.g., '90', 'White;5876')."""
-        if llm:
-            response = await llm.ainvoke(
-                [
-                    SystemMessage(content="You are JARVIS, a precise assistant for the GAIA benchmark. Provide exact answers only."),
-                    HumanMessage(content=prompt)
-                ],
-                config={"callbacks": [langfuse] if langfuse else []}
-            )
-            answer = response.content.strip()
-        else:
-            answer = "Unknown"
-        state["answer"] = answer
-        log_state(state["task_id"], state)
-        return state
     except Exception as e:
-        logger.error(f"Error in reasoning for task {state['task_id']}: {e}")
-        state["answer"] = "Error in reasoning"
-        log_state(state["task_id"], state)
-        return state
 def router(state: JARVISState) -> str:
     if state["tools_needed"]:
         return "tool_dispatcher"
     return "reasoning"
@@ -281,8 +285,7 @@ def router(state: JARVISState) -> str:
 workflow = StateGraph(JARVISState)
 workflow.add_node("parse", parse_question)
 workflow.add_node("tool_dispatcher", tool_dispatcher)
-workflow.add_node("reasoning", reasoning_agent)
 workflow.set_entry_point("parse")
 workflow.add_conditional_edges(
     "parse",
@@ -294,97 +297,95 @@ workflow.add_conditional_edges(
 )
 workflow.add_edge("tool_dispatcher", "reasoning")
 workflow.add_edge("reasoning", END)
-# Compile graph
-graph = workflow.compile(checkpointer=memory if use_checkpointing else None)
-# --- Basic Agent Definition ---
 class BasicAgent:
     def __init__(self):
         logger.info("BasicAgent initialized.")
     async def process_question(self, task_id: str, question: str) -> str:
         file_type = "jpg" if "image" in question.lower() else "txt"
-        if "menu" in question.lower() or "report" in question.lower() or "document" in question.lower():
             file_type = "pdf"
         elif "data" in question.lower():
-            file_type = "csv"
         file_path = f"temp_{task_id}.{file_type}"
-        if await test_gaia_api(task_id):
             try:
                 async with aiohttp.ClientSession() as session:
-                    async with session.get(f"{GAIA_FILE_URL}{task_id}") as resp:
                         if resp.status == 200:
                             with open(file_path, "wb") as f:
                                 f.write(await resp.read())
                         else:
-                            logger.warning(f"Failed to download file for task {task_id}: HTTP {resp.status}")
             except Exception as e:
-                logger.error(f"Error downloading file for task {task_id}: {e}")
         state = JARVISState(
             task_id=task_id,
             question=question,
-            tools_needed=[],
             web_results=[],
             file_results="",
             image_results="",
             calculation_results="",
             document_results="",
-            messages=[],
             answer=""
         )
         try:
-            config = {"configurable": {"thread_id": task_id}} if use_checkpointing else {}
-            result = await graph.ainvoke(state, config=config)
-            return result["answer"] or "No answer generated"
         except Exception as e:
             logger.error(f"Error processing task {task_id}: {e}")
             return f"Error: {str(e)}"
         finally:
-            if os.path.exists(file_path):
-                try:
-                    os.remove(file_path)
-                except Exception as e:
-                    logger.error(f"Error removing file {file_path}: {e}")
     async def async_call(self, question: str, task_id: str) -> str:
-        return await self.process_question(task_id, question)
     def __call__(self, question: str, task_id: str = None) -> str:
-        logger.info(f"Agent received question (first 50 chars): {question[:50]}...")
         if task_id is None:
-            logger.warning("task_id not provided, using placeholder")
-            task_id = "placeholder_task_id"
         try:
-            try:
-                loop = asyncio.get_event_loop()
-            except RuntimeError:
-                loop = asyncio.new_event_loop()
-                asyncio.set_event_loop(loop)
-            return loop.run_until_complete(self.async_call(question, task_id))
-        finally:
-            pass
-# --- Main Function ---
 def run_and_submit_all(profile: gr.OAuthProfile | None):
-    space_id = os.getenv("SPACE_ID")
     if not profile:
         logger.error("User not logged in.")
-        return "Please Login to Hugging Face with the button.", None
     username = f"{profile.username}"
     logger.info(f"User logged in: {username}")
-    api_url = DEFAULT_API_URL
-    questions_url = f"{api_url}/questions"
-    submit_url = f"{api_url}/submit"
-    agent_code = f"https://huggingface.co/spaces/{space_id}/tree/main"
     try:
         agent = BasicAgent()
     except Exception as e:
-        logger.error(f"Error instantiating agent: {e}")
         return f"Error initializing agent: {e}", None
     logger.info(f"Fetching questions from: {questions_url}")
@@ -393,8 +394,8 @@ def run_and_submit_all(profile: gr.OAuthProfile | None):
         response.raise_for_status()
         questions_data = response.json()
         if not questions_data:
-            logger.error("Fetched questions list is empty.")
-            return "Fetched questions list is empty or invalid format.", None
         logger.info(f"Fetched {len(questions_data)} questions.")
     except Exception as e:
         logger.error(f"Error fetching questions: {e}")
@@ -402,24 +403,24 @@ def run_and_submit_all(profile: gr.OAuthProfile | None):
     results_log = []
     answers_payload = []
-    logger.info(f"Running agent on {len(questions_data)} questions...")
     for item in questions_data:
         task_id = item.get("task_id")
         question_text = item.get("question")
         if not task_id or question_text is None:
-            logger.warning(f"Skipping item with missing task_id or question: {item}")
             continue
         try:
             submitted_answer = agent(question_text, task_id)
             answers_payload.append({"task_id": task_id, "submitted_answer": submitted_answer})
             results_log.append({"Task ID": task_id, "Question": question_text, "Submitted Answer": submitted_answer})
         except Exception as e:
-            logger.error(f"Error running agent on task {task_id}: {e}")
             results_log.append({"Task ID": task_id, "Question": question_text, "Submitted Answer": f"AGENT ERROR: {e}"})
     if not answers_payload:
-        logger.error("Agent did not produce any answers to submit.")
-        return "Agent did not produce any answers to submit.", pd.DataFrame(results_log)
     submission_data = {"username": username.strip(), "agent_code": agent_code, "answers": answers_payload}
     logger.info(f"Submitting {len(answers_payload)} answers to: {submit_url}")
@@ -427,7 +428,6 @@ def run_and_submit_all(profile: gr.OAuthProfile | None):
         response = requests.post(submit_url, json=submission_data, timeout=120)
         response.raise_for_status()
         result_data = response.json()
-        logger.info(f"Server response: {result_data}")
         final_status = (
             f"Submission Successful!\n"
             f"User: {result_data.get('username')}\n"
@@ -442,19 +442,19 @@ def run_and_submit_all(profile: gr.OAuthProfile | None):
         results_df = pd.DataFrame(results_log)
         return f"Submission Failed: {e}", results_df
-# --- Build Gradio Interface ---
 with gr.Blocks() as demo:
-    gr.Markdown("# JARVIS Agent Evaluation Runner")
     gr.Markdown(
         """
         **Instructions:**
-        1. Log in to your Hugging Face account using the button below.
-        2. Click 'Run Evaluation & Submit All Answers' to fetch questions, run the JARVIS agent, and submit answers.
         ---
         **Disclaimers:**
-        The agent uses a local Hugging Face model (Mixtral-7B) and async tools for the GAIA benchmark.
         """
     )
@@ -463,16 +463,16 @@ with gr.Blocks() as demo:
     run_button = gr.Button("Run Evaluation & Submit All Answers")
     status_output = gr.Textbox(label="Run Status / Submission Result", lines=5, interactive=False)
-    results_table = gr.DataFrame(label="Questions and Agent Answers", wrap=True)
     run_button.click(
         fn=run_and_submit_all,
         outputs=[status_output, results_table]
     )
 if __name__ == "__main__":
     logger.info("\n" + "-"*30 + " App Starting " + "-"*30)
-    space_id = os.getenv("SPACE_ID")
-    logger.info(f"SPACE_ID: {space_id}")
     logger.info("Launching Gradio Interface...")
     demo.launch(debug=True, share=False)
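
Note on `parse_question`: both versions of this function ask the model for a JSON list of tool names and fall back to a default search tool when the reply cannot be parsed. That step can be sketched in isolation as below. This is an illustrative sketch, not the code in this commit: the function name `select_tools` is hypothetical, and two behaviours are assumptions that differ from the diff (order-preserving de-duplication instead of `list(set(...))`, and falling back to the default when no known tool survives filtering).

```python
import json

# Tool names as listed in the system prompt of the new parse_question.
VALID_TOOLS = (
    "search_tool", "multi_hop_search_tool", "file_parser_tool",
    "image_parser_tool", "calculator_tool", "document_retriever_tool",
    "duckduckgo_search_tool", "weather_info_tool", "hub_stats_tool",
    "guest_info_retriever_tool",
)


def select_tools(raw_reply, default=("search_tool",)):
    """Turn a model reply into a clean list of known tool names.

    Returns `default` when the reply is not a JSON list or when it
    contains no recognised tool name.
    """
    try:
        parsed = json.loads(raw_reply.strip())
    except (json.JSONDecodeError, AttributeError):
        # Not valid JSON, or the reply was not a string at all.
        return list(default)
    if not isinstance(parsed, list):
        return list(default)

    selected = []
    for name in parsed:
        # Keep known names only, dropping repeats but preserving order.
        if name in VALID_TOOLS and name not in selected:
            selected.append(name)
    return selected or list(default)
```

Preserving order keeps the logged tool list stable between runs, which `list(set(...))` does not guarantee.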

 import os
 import json
+import logging
+import asyncio
+import aiohttp
 import nest_asyncio
+import requests
 import pandas as pd
+from typing import Dict, Any, List
+from langchain_core.prompts import ChatPromptTemplate
+from langchain_core.messages import SystemMessage, HumanMessage
+from langgraph.graph import StateGraph, END
+from sentence_transformers import SentenceTransformer
+import gradio as gr
 from dotenv import load_dotenv
+from huggingface_hub import InferenceClient
+from state import JARVISState
+from tools import (
+    search_tool, multi_hop_search_tool, file_parser_tool, image_parser_tool,
+    calculator_tool, document_retriever_tool, duckduckgo_search_tool,
+    weather_info_tool, hub_stats_tool, guest_info_retriever_tool
+)
+# Setup logging
+logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
 logger = logging.getLogger(__name__)
 # Apply nest_asyncio
 # Load environment variables
 load_dotenv()
+SPACE_ID = os.getenv("SPACE_ID", "onisj/jarvis_gaia_agent")
+GAIA_API_URL = "https://agents-course-unit4-scoring.hf.space"
+GAIA_FILE_URL = f"{GAIA_API_URL}/files/"
+HF_TOKEN = os.getenv("HUGGINGFACEHUB_API_TOKEN")
 # Verify environment variables
+if not SPACE_ID:
+    raise ValueError("SPACE_ID not set")
+if not HF_TOKEN:
+    raise ValueError("HUGGINGFACEHUB_API_TOKEN not set")
+logger.info(f"SPACE_ID: {SPACE_ID}")
+# Initialize models
 try:
+    llm = InferenceClient(
+        model="meta-llama/Meta-Llama-3-8B-Instruct",
+        token=HF_TOKEN,
+        timeout=30
     )
+    logger.info("Hugging Face Inference LLM initialized")
 except Exception as e:
+    logger.error(f"Failed to initialize LLM: {e}")
     llm = None
 try:
+    embedder = SentenceTransformer("all-MiniLM-L6-v2")
+    logger.info("Sentence transformer initialized")
 except Exception as e:
+    logger.error(f"Failed to initialize embedder: {e}")
+    embedder = None
 # --- Helper Functions ---
+async def test_gaia_api(task_id: str, file_type: str = "txt") -> tuple[bool, str | None]:
+    """Test if a file exists for the task ID."""
     try:
+        for ext in [file_type, "txt", "csv", "xlsx", "jpg", "pdf"]:
+            async with aiohttp.ClientSession() as session:
+                async with session.get(f"{GAIA_FILE_URL}{task_id}.{ext}", timeout=5) as resp:
+                    logger.info(f"GAIA API test for task {task_id} with .{ext}: HTTP {resp.status}")
+                    if resp.status == 200:
+                        file_path = f"temp_{task_id}.{ext}"
+                        with open(file_path, "wb") as f:
+                            f.write(await resp.read())
+                        return True, ext
+        logger.info(f"No file found for task {task_id}")
+        return False, None
     except Exception as e:
+        logger.warning(f"GAIA API test failed: {str(e)}")
+        return False, None
 # --- Node Functions ---
+async def parse_question(state: Dict[str, Any]) -> Dict[str, Any]:
+    """Parse the question to select appropriate tools."""
     try:
         question = state["question"]
+        task_id = state["task_id"]
+        tools_needed = ["search_tool"]
         if llm:
+            prompt = ChatPromptTemplate.from_messages([
+                SystemMessage(content="""Select tools from: ['search_tool', 'multi_hop_search_tool', 'file_parser_tool', 'image_parser_tool', 'calculator_tool', 'document_retriever_tool', 'duckduckgo_search_tool', 'weather_info_tool', 'hub_stats_tool', 'guest_info_retriever_tool'].
+                Return JSON list, e.g., ["search_tool", "file_parser_tool"].
+                Rules:
+                - Always include "search_tool" unless purely computational.
+                - Use "multi_hop_search_tool" for complex queries (over 20 words).
+                - Use "file_parser_tool" for data, tables, or Excel.
+                - Use "image_parser_tool" for images/videos.
+                - Use "calculator_tool" for math calculations.
+                - Use "document_retriever_tool" for documents/PDFs.
+                - Use "duckduckgo_search_tool" for additional search capability.
+                - Use "weather_info_tool" for weather-related queries.
+                - Use "hub_stats_tool" for Hugging Face Hub queries.
+                - Use "guest_info_retriever_tool" for guest-related queries.
+                - Output ONLY valid JSON."""),
+                HumanMessage(content=f"Query: {question}")
+            ])
             try:
+                response = llm.chat_completion(
+                    messages=[
+                        {"role": "system", "content": prompt[0].content},
+                        {"role": "user", "content": prompt[1].content}
+                    ],
+                    max_tokens=512,
+                    temperature=0.7
+                )
+                tools_needed = json.loads(response["choices"][0]["message"]["content"].strip())
+                valid_tools = {
+                    "search_tool", "multi_hop_search_tool", "file_parser_tool", "image_parser_tool",
+                    "calculator_tool", "document_retriever_tool", "duckduckgo_search_tool",
+                    "weather_info_tool", "hub_stats_tool", "guest_info_retriever_tool"
+                }
+                tools_needed = [tool for tool in tools_needed if tool in valid_tools]
+            except Exception as e:
+                logger.warning(f"Task {task_id} failed: JSON parse error: {e}")
+                tools_needed = ["search_tool"]
+        # Keyword-based fallback
+        question_lower = question.lower()
+        if any(word in question_lower for word in ["image", "video"]):
+            tools_needed.append("image_parser_tool")
+        if any(word in question_lower for word in ["data", "table", "excel"]):
+            tools_needed.append("file_parser_tool")
+        if any(word in question_lower for word in ["calculate", "math"]):
+            tools_needed.append("calculator_tool")
+        if any(word in question_lower for word in ["document", "pdf"]):
+            tools_needed.append("document_retriever_tool")
+        if any(word in question_lower for word in ["weather"]):
+            tools_needed.append("weather_info_tool")
+        if any(word in question_lower for word in ["model", "huggingface"]):
+            tools_needed.append("hub_stats_tool")
+        if any(word in question_lower for word in ["guest", "name", "relation"]):
+            tools_needed.append("guest_info_retriever_tool")
+        if len(question.split()) > 20:
+            tools_needed.append("multi_hop_search_tool")
+        file_available, file_ext = await test_gaia_api(task_id)
+        if file_available:
+            if "file_parser_tool" not in tools_needed and any(word in question_lower for word in ["data", "table", "excel"]):
+                tools_needed.append("file_parser_tool")
+            if "image_parser_tool" not in tools_needed and "image" in question_lower:
+                tools_needed.append("image_parser_tool")
+            if "document_retriever_tool" not in tools_needed and file_ext == "pdf":
+                tools_needed.append("document_retriever_tool")
         else:
+            tools_needed = [tool for tool in tools_needed if tool not in ["file_parser_tool", "image_parser_tool", "document_retriever_tool"]]
+        state["tools_needed"] = list(set(tools_needed))  # Remove duplicates
+        logger.info(f"Task {task_id}: Selected tools: {state['tools_needed']}")
         return state
     except Exception as e:
+        logger.error(f"Error parsing task {task_id}: {e}")
+        state["tools_needed"] = ["search_tool"]
         return state
 async def tool_dispatcher(state: JARVISState) -> JARVISState:
+    """Dispatch selected tools to process the state."""
     try:
         updated_state = state.copy()
+        file_type = "jpg" if "image" in state["question"].lower() else "txt"
+        if "menu" in state["question"].lower() or "report" in state["question"].lower():
+            file_type = "pdf"
+        elif "data" in state["question"].lower():
+            file_type = "xlsx"
+        can_download, file_ext = await test_gaia_api(updated_state["task_id"], file_type)
+        for tool in updated_state["tools_needed"]:
             try:
+                if tool == "search_tool":
+                    result = await search_tool.ainvoke({"query": updated_state["question"]})
+                    updated_state["web_results"].extend([r["content"] for r in result])
+                elif tool == "multi_hop_search_tool":
+                    result = await multi_hop_search_tool.ainvoke({"query": updated_state["question"], "steps": 3})
+                    updated_state["web_results"].extend([r["content"] for r in result])
+                    await asyncio.sleep(2)  # Rate limit
+                elif tool == "file_parser_tool" and can_download:
+                    result = await file_parser_tool.ainvoke({"task_id": updated_state["task_id"], "file_type": file_ext})
+                    updated_state["file_results"] = str(result)
+                elif tool == "image_parser_tool" and can_download:
+                    result = await image_parser_tool.ainvoke({
+                        "file_path": f"temp_{updated_state['task_id']}.{file_ext}",
+                        "task": "describe"
+                    })
+                    updated_state["image_results"] = str(result)
+                elif tool == "calculator_tool":
+                    result = await calculator_tool.ainvoke({"expression": updated_state.get("question", "")})
+                    updated_state["calculation_results"] = str(result)
+                elif tool == "document_retriever_tool" and can_download:
+                    result = await document_retriever_tool.ainvoke({
+                        "task_id": updated_state["task_id"],
+                        "query": updated_state["question"],
+                        "file_type": file_ext
+                    })
+                    updated_state["document_results"] = str(result)
+                elif tool == "duckduckgo_search_tool":
+                    result = await duckduckgo_search_tool.run(updated_state["question"])
+                    updated_state["web_results"].append(str(result))
+                elif tool == "weather_info_tool":
+                    location = updated_state["question"].split("weather in ")[1].split()[0] if "weather in" in updated_state["question"].lower() else "Unknown"
+                    result = await weather_info_tool.ainvoke({"location": location})
+                    updated_state["web_results"].append(str(result))
+                elif tool == "hub_stats_tool":
+                    author = updated_state["question"].split("by ")[1].split()[0] if "by" in updated_state["question"].lower() else "Unknown"
+                    result = await hub_stats_tool.ainvoke({"author": author})
+                    updated_state["web_results"].append(str(result))
+                elif tool == "guest_info_retriever_tool":
+                    query = updated_state["question"].split("about ")[1] if "about" in updated_state["question"].lower() else updated_state["question"]
+                    result = await guest_info_retriever_tool.ainvoke({"query": query})
+                    updated_state["web_results"].append(str(result))
             except Exception as e:
+                logger.warning(f"Error in tool {tool} for task {updated_state['task_id']}: {str(e)}")
+                updated_state[f"{tool}_results"] = f"Error: {str(e)}"
+        logger.info(f"Task {updated_state['task_id']}: Tool results: {updated_state}")
         return updated_state
     except Exception as e:
+        logger.error(f"Tool dispatch failed for task {state['task_id']}: {e}")
         return state
+async def reasoning(state: JARVISState) -> Dict[str, Any]:
+    """Generate exact-match answer with specific formatting."""
     try:
+        if not llm:
+            return {"answer": "LLM unavailable"}
+        prompt = ChatPromptTemplate.from_messages([
+            SystemMessage(content="""Provide ONLY the exact answer (e.g., '90', 'HUE'). For USD, use two decimal places (e.g., '1234.00'). For lists, use comma-separated values (e.g., 'Smith, Lee'). For IOC codes, use three-letter codes (e.g., 'ARG'). No explanations or conversational text."""),
+            HumanMessage(content="""Question: {question}
+Web results: {web_results}
+File results: {file_results}
+Image results: {image_results}
+Calculation results: {calculation_results}
+Document results: {document_results}""")
+        ])
+        response = llm.chat_completion(
+            messages=[
+                {"role": "system", "content": prompt[0].content},
+                {"role": "user", "content": prompt[1].content.format(
+                    question=state["question"],
+                    web_results="\n".join(state["web_results"]),
+                    file_results=state["file_results"],
+                    image_results=state["image_results"],
+                    calculation_results=state["calculation_results"],
+                    document_results=state["document_results"]
+                )}
+            ],
+            max_tokens=512,
+            temperature=0.7
+        )
+        answer = response["choices"][0]["message"]["content"].strip()
+        # Clean answer for specific formats
+        if "usd" in state["question"].lower():
+            try:
+                answer = f"{float(answer):.2f}"
+            except ValueError:
+                pass
+        if "before and after" in state["question"].lower():
+            answer = answer.replace(" and ", ", ")
+        elif "ioc code" in state["question"].lower():
+            answer = answer.upper()[:3]
+        logger.info(f"Task {state['task_id']}: Answer: {answer}")
+        return {"answer": answer}
     except Exception as e:
+        logger.error(f"Reasoning failed for task {state['task_id']}: {e}")
+        return {"answer": f"Error: {str(e)}"}
 def router(state: JARVISState) -> str:
+    """Route based on tools needed."""
     if state["tools_needed"]:
         return "tool_dispatcher"
     return "reasoning"
 workflow = StateGraph(JARVISState)
 workflow.add_node("parse", parse_question)
 workflow.add_node("tool_dispatcher", tool_dispatcher)
+workflow.add_node("reasoning", reasoning)
 workflow.set_entry_point("parse")
 workflow.add_conditional_edges(
     "parse",
 )
 workflow.add_edge("tool_dispatcher", "reasoning")
 workflow.add_edge("reasoning", END)
+graph = workflow.compile()
+# --- Basic Agent ---
 class BasicAgent:
     def __init__(self):
         logger.info("BasicAgent initialized.")
     async def process_question(self, task_id: str, question: str) -> str:
+        """Process a single question with file handling."""
         file_type = "jpg" if "image" in question.lower() else "txt"
+        if "menu" in question.lower() or "report" in question.lower():
             file_type = "pdf"
         elif "data" in question.lower():
+            file_type = "xlsx"
         file_path = f"temp_{task_id}.{file_type}"
+        file_available, file_ext = await test_gaia_api(task_id, file_type)
+        if file_available:
             try:
                 async with aiohttp.ClientSession() as session:
+                    async with session.get(f"{GAIA_FILE_URL}{task_id}.{file_ext}") as resp:
                         if resp.status == 200:
                             with open(file_path, "wb") as f:
                                 f.write(await resp.read())
                         else:
+                            logger.warning(f"Failed to fetch file for {task_id}: HTTP {resp.status}")
             except Exception as e:
+                logger.error(f"Error downloading file for task {task_id}: {str(e)}")
         state = JARVISState(
             task_id=task_id,
             question=question,
+            tools_needed=["search_tool"],
             web_results=[],
             file_results="",
             image_results="",
             calculation_results="",
             document_results="",
+            messages=[HumanMessage(content=question)],
             answer=""
         )
         try:
+            result = await graph.ainvoke(state)
+            answer = result["answer"] or "Unknown"
+            logger.info(f"Task {task_id}: Final answer generated: {answer}")
+            return answer
         except Exception as e:
             logger.error(f"Error processing task {task_id}: {e}")
             return f"Error: {str(e)}"
         finally:
+            for ext in ["txt", "csv", "xlsx", "jpg", "pdf"]:
+                file_path = f"temp_{task_id}.{ext}"
+                if os.path.exists(file_path):
+                    try:
+                        os.remove(file_path)
+                    except Exception as e:
+                        logger.error(f"Error removing file {file_path}: {e}")
     async def async_call(self, question: str, task_id: str) -> str:
+        return await self.process_question(task_id, question)
     def __call__(self, question: str, task_id: str = None) -> str:
+        logger.info(f"Processing question: {question[:50]}...")
         if task_id is None:
+            task_id = "unknown_task_id"
         try:
+            loop = asyncio.get_event_loop()
+        except RuntimeError:
+            loop = asyncio.new_event_loop()
+            asyncio.set_event_loop(loop)
+        return loop.run_until_complete(self.async_call(question, task_id))
+# --- Evaluation and Submission ---
 def run_and_submit_all(profile: gr.OAuthProfile | None):
+    """Run evaluation and submit answers to GAIA API."""
     if not profile:
         logger.error("User not logged in.")
+        return "Please Login to Hugging Face.", None
     username = f"{profile.username}"
     logger.info(f"User logged in: {username}")
+    questions_url = f"{GAIA_API_URL}/questions"
+    submit_url = f"{GAIA_API_URL}/submit"
+    agent_code = f"https://huggingface.co/spaces/{SPACE_ID}/tree/main"
     try:
         agent = BasicAgent()
     except Exception as e:
+        logger.error(f"Agent initialization failed: {e}")
         return f"Error initializing agent: {e}", None
     logger.info(f"Fetching questions from: {questions_url}")
         response.raise_for_status()
         questions_data = response.json()
         if not questions_data:
+            logger.error("Empty questions list.")
+            return "No questions fetched.", None
         logger.info(f"Fetched {len(questions_data)} questions.")
     except Exception as e:
         logger.error(f"Error fetching questions: {e}")
     results_log = []
     answers_payload = []
+    logger.info(f"Processing {len(questions_data)} questions...")
     for item in questions_data:
         task_id = item.get("task_id")
         question_text = item.get("question")
         if not task_id or question_text is None:
+            logger.warning(f"Skipping invalid item: {item}")
             continue
         try:
             submitted_answer = agent(question_text, task_id)
             answers_payload.append({"task_id": task_id, "submitted_answer": submitted_answer})
             results_log.append({"Task ID": task_id, "Question": question_text, "Submitted Answer": submitted_answer})
         except Exception as e:
+            logger.error(f"Error for task {task_id}: {e}")
             results_log.append({"Task ID": task_id, "Question": question_text, "Submitted Answer": f"AGENT ERROR: {e}"})
     if not answers_payload:
+        logger.error("No answers generated.")
+        return "No answers to submit.", pd.DataFrame(results_log)
     submission_data = {"username": username.strip(), "agent_code": agent_code, "answers": answers_payload}
     logger.info(f"Submitting {len(answers_payload)} answers to: {submit_url}")
         response = requests.post(submit_url, json=submission_data, timeout=120)
         response.raise_for_status()
         result_data = response.json()
         final_status = (
             f"Submission Successful!\n"
             f"User: {result_data.get('username')}\n"
         results_df = pd.DataFrame(results_log)
         return f"Submission Failed: {e}", results_df
+# --- Gradio Interface ---
 with gr.Blocks() as demo:
+    gr.Markdown("# Evolved JARVIS Agent Evaluation")
     gr.Markdown(
         """
         **Instructions:**
+        1. Log in to Hugging Face using the button below.
+        2. Click 'Run Evaluation & Submit All Answers' to process GAIA questions and submit.
         ---
         **Disclaimers:**
+        Uses Hugging Face Inference, SERPAPI, and OpenWeatherMap for GAIA benchmark.
         """
     )
     run_button = gr.Button("Run Evaluation & Submit All Answers")
     status_output = gr.Textbox(label="Run Status / Submission Result", lines=5, interactive=False)
+    results_table = gr.DataFrame(label="Questions and Answers", wrap=True)
     run_button.click(
         fn=run_and_submit_all,
         outputs=[status_output, results_table]
     )
+# --- Main ---
 if __name__ == "__main__":
     logger.info("\n" + "-"*30 + " App Starting " + "-"*30)
+    logger.info(f"SPACE_ID: {SPACE_ID}")
     logger.info("Launching Gradio Interface...")
     demo.launch(debug=True, share=False)
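
Review note on the answer clean-up at the end of `reasoning`: the format checks run against `state["question"].lower()`, so each marker has to be lower case to ever match. A minimal sketch of that clean-up as a standalone function follows; `clean_answer` is a hypothetical name that does not appear in the commit, and the rules simply mirror the ones in `reasoning`.

```python
def clean_answer(question: str, answer: str) -> str:
    """Normalize an answer using cues in the question (hypothetical helper)."""
    q = question.lower()  # every marker below must therefore be lower case
    answer = answer.strip()
    if "usd" in q:
        try:
            # Two decimal places for currency amounts
            answer = f"{float(answer):.2f}"
        except ValueError:
            pass  # leave non-numeric answers untouched
    if "before and after" in q:
        # Turn "A and B" into a comma-separated pair
        answer = answer.replace(" and ", ", ")
    elif "ioc code" in q:
        # IOC country codes are three upper-case letters
        answer = answer.upper()[:3]
    return answer
```

Keeping the rules in a pure function like this would let them be unit-tested without calling the LLM.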

dockerfile CHANGED

@@ -2,23 +2,20 @@ FROM python:3.11-slim
 WORKDIR /app
-# Install system dependencies
 RUN apt-get update && apt-get install -y \
     libgl1-mesa-glx \
     libglib2.0-0 \
     tesseract-ocr \
     libtesseract-dev \
     && rm -rf /var/lib/apt/lists/*
-# Copy project files
 COPY requirements.txt .
 COPY app.py .
-COPY graph.py .
 COPY state.py .
 COPY tools/ tools/
-# Install Python dependencies
 RUN pip install --no-cache-dir -r requirements.txt
-# Run the application
-CMD ["python", "app.py"]

 WORKDIR /app
 RUN apt-get update && apt-get install -y \
     libgl1-mesa-glx \
     libglib2.0-0 \
+    python3-dev \
     tesseract-ocr \
     libtesseract-dev \
     && rm -rf /var/lib/apt/lists/*
 COPY requirements.txt .
 COPY app.py .
 COPY state.py .
+COPY retriever.py .
 COPY tools/ tools/
 RUN pip install --no-cache-dir -r requirements.txt
+CMD ["python3", "app.py"]

graph.py DELETED

@@ -1,143 +0,0 @@
-from langgraph.graph import StateGraph, END
-from langgraph.checkpoint.memory import MemorySaver
-from state import JARVISState
-from langchain_openai import ChatOpenAI
-from langchain_core.messages import SystemMessage, HumanMessage
-from tools import search_tool, multi_hop_search_tool, file_parser_tool, image_parser_tool, calculator_tool, document_retriever_tool
-from langfuse.callback import LangfuseCallbackHandler
-import json
-import os
-from dotenv import load_dotenv
-# Load environment variables
-load_dotenv()
-# Debug: Verify environment variables
-print(f"OPENAI_API_KEY loaded in graph.py: {'set' if os.getenv('OPENAI_API_KEY') else 'not set'}")
-print(f"LANGFUSE_PUBLIC_KEY loaded in graph.py: {'set' if os.getenv('LANGFUSE_PUBLIC_KEY') else 'not set'}")
-# Initialize LLM and Langfuse
-api_key = os.getenv("OPENAI_API_KEY")
-if not api_key:
-    raise ValueError("OPENAI_API_KEY environment variable not set")
-llm = ChatOpenAI(model="gpt-4o", api_key=api_key)
-langfuse = LangfuseCallbackHandler(
-    public_key=os.getenv("LANGFUSE_PUBLIC_KEY"),
-    secret_key=os.getenv("LANGFUSE_SECRET_KEY"),
-    host=os.getenv("LANGFUSE_HOST")
-)
-memory = MemorySaver()
-# Question Parser Node
-async def parse_question(state: JARVISState) -> JARVISState:
-    question = state["question"]
-    prompt = f"""Analyze this GAIA question: {question}
-    Determine which tools are needed (web_search, multi_hop_search, file_parser, image_parser, calculator, document_retriever).
-    Return a JSON list of tool names."""
-    response = await llm.ainvoke(prompt, config={"callbacks": [langfuse]})
-    tools_needed = json.loads(response.content)
-    return {"messages": state["messages"] + [response], "tools_needed": tools_needed}
-# Web Search Agent Node
-async def web_search_agent(state: JARVISState) -> JARVISState:
-    results = []
-    if "web_search" in state["tools_needed"]:
-        result = await search_tool.arun(state["question"])
-        results.append(result)
-    if "multi_hop_search" in state["tools_needed"]:
-        result = await multi_hop_search_tool.aparse(state["question"], steps=3)
-        results.append(result)
-    return {"web_results": results}
-# File Parser Agent Node
-async def file_parser_agent(state: JARVISState) -> JARVISState:
-    if "file_parser" in state["tools_needed"]:
-        result = await file_parser_tool.aparse(state["task_id"])
-        return {"file_results": result}
-    return {"file_results": ""}
-# Image Parser Agent Node
-async def image_parser_agent(state: JARVISState) -> JARVISState:
-    if "image_parser" in state["tools_needed"]:
-        task = "match" if "fruits" in state["question"].lower() else "describe"
-        match_query = "fruits" if task == "match" else ""
-        result = await image_parser_tool.aparse(
-            f"temp_{state['task_id']}.jpg", task=task, match_query=match_query
-        )
-        return {"image_results": result}
-    return {"image_results": ""}
-# Calculator Agent Node
-async def calculator_agent(state: JARVISState) -> JARVISState:
-    if "calculator" in state["tools_needed"]:
-        prompt = f"Extract a mathematical expression from: {state['question']}\n{state['file_results']}"
-        response = await llm.ainvoke(prompt, config={"callbacks": [langfuse]})
-        expression = response.content
-        result = await calculator_tool.aparse(expression)
-        return {"calculation_results": result}
-    return {"calculation_results": ""}
-# Document Retriever Agent Node
-async def document_retriever_agent(state: JARVISState) -> JARVISState:
-    if "document_retriever" in state["tools_needed"]:
-        file_type = "txt" if "menu" in state["question"].lower() else "csv"
-        if "report" in state["question"].lower() or "document" in state["question"].lower():
-            file_type = "pdf"
-        result = await document_retriever_tool.aparse(
-            state["task_id"], state["question"], file_type=file_type
-        )
-        return {"document_results": result}
-    return {"document_results": ""}
-# Reasoning Agent Node
-async def reasoning_agent(state: JARVISState) -> JARVISState:
-    prompt = f"""Question: {state['question']}
-    Web Results: {state['web_results']}
-    File Results: {state['file_results']}
-    Image Results: {state['image_results']}
-    Calculation Results: {state['calculation_results']}
-    Document Results: {state['document_results']}
-    Synthesize an exact-match answer for the GAIA benchmark.
-    Output only the answer (e.g., '90', 'White;5876')."""
-    response = await llm.ainvoke(
-        [
-            SystemMessage(content="You are JARVIS, a precise assistant for the GAIA benchmark. Provide exact answers only."),
-            HumanMessage(content=prompt)
-        ],
-        config={"callbacks": [langfuse]}
-    )
-    return {"answer": response.content, "messages": state["messages"] + [response]}
-# Conditional Edge Router
-def router(state: JARVISState) -> str:
-    if state["tools_needed"]:
-        return "tools"
-    return "reasoning"
-# Build Graph
-workflow = StateGraph(JARVISState)
-workflow.add_node("parse", parse_question)
-workflow.add_node("web_search", web_search_agent)
-workflow.add_node("file_parser", file_parser_agent)
-workflow.add_node("image_parser", image_parser_agent)
-workflow.add_node("calculator", calculator_agent)
-workflow.add_node("document_retriever", document_retriever_agent)
-workflow.add_node("reasoning", reasoning_agent)
-workflow.set_entry_point("parse")
-workflow.add_conditional_edges(
-    "parse",
-    router,
-    {
-        "tools": ["web_search", "file_parser", "image_parser", "calculator", "document_retriever"],
-        "reasoning": "reasoning"
-    }
-)
-workflow.add_edge("web_search", "reasoning")
-workflow.add_edge("file_parser", "reasoning")
-workflow.add_edge("image_parser", "reasoning")
-workflow.add_edge("calculator", "reasoning")
-workflow.add_edge("document_retriever", "reasoning")
-workflow.add_edge("reasoning", END)
-graph = workflow.compile(checkpointer=memory)

project_struct.txt ADDED

	@@ -0,0 +1,21 @@

+jarvis_gaia_agent/
+├── app.py                  # Main application with Gradio interface and agent logic
+├── state.py                # Defines JARVISState for state management
+├── retriever.py            # Guest info retriever tool
+├── tools/                  # Directory for all tools
+│   ├── __init__.py         # Exports all tools
+│   ├── search.py           # Web search tools (SERPAPI-based)
+│   ├── file_parser.py      # File parsing tool (CSV, TXT, PDF, Excel)
+│   ├── image_parser.py     # Image parsing tool (OCR)
+│   ├── calculator.py       # Calculator tool
+│   ├── document_retriever.py # Document retrieval tool
+│   ├── duckduckgo_search.py # DuckDuckGo search tool (from smolagents)
+│   ├── weather_info.py     # Weather info tool (OpenWeatherMap)
+│   ├── hub_stats.py        # Hugging Face Hub stats tool
+│   ├── guest_info.py       # Guest info retriever tool (moved from retriever.py)
+├── requirements.txt        # Python dependencies
+├── Dockerfile              # Docker configuration
+├── README.md               # Project documentation
+├── .env                    # Environment variables (not committed)
+2 directories, 17 files

requirements.txt CHANGED

@@ -1,89 +1,18 @@
-aiohttp==3.8.6
-aiosignal==1.3.1
-annotated-types==0.7.0
-anyio==4.4.0
-attrs==23.2.0
-backoff==2.2.1
-certifi==2024.7.4
-charset-normalizer==3.3.2
-click==8.1.7
-dataclasses-json==0.6.7
-distro==1.9.0
-duckduckgo_search==6.2.4
-filelock==3.15.4
-frozenlist==1.4.1
-fsspec==2024.6.1
-greenlet==3.0.3
-h11==0.14.0
-httpcore==1.0.5
-httpx==0.27.0
-httpx-sse==0.4.0
-huggingface-hub==0.23.4
-idna==3.7
-Jinja2==3.1.4
-jiter==0.5.0
-joblib==1.4.2
-jsonpatch==1.33
-jsonpointer==3.0.0
-langchain==0.2.11
-langchain-community==0.2.10
-langchain-core==0.2.23
-langchain-openai==0.1.17
-langchain-text-splitters==0.2.2
-langfuse==2.36.1
-langgraph==0.1.15
-langgraph-checkpoint==1.0.2
-langsmith==0.1.93
-lxml==5.2.2
-markdown-it-py==3.0.0
-MarkupSafe==2.1.5
-marshmallow==3.21.3
-mdurl==0.1.2
-mpmath==1.3.0
-msgpack==1.0.8
-multidict==6.0.5
-mypy_extensions==1.0.0
-networkx==3.3
-numpy==1.26.4
-openai==1.35.13
-orjson==3.10.6
-packaging==23.2
-pandas==2.2.2
-pillow==10.4.0
-primp==0.15.0
-pydantic==2.8.2
-pydantic_core==2.20.1
-Pygments==2.18.0
-PyPDF2==3.0.1
-pytesseract==0.3.10
-python-dateutil==2.9.0.post0
-python-dotenv==1.0.1
-pytz==2024.1
-PyYAML==6.0.1
-regex==2024.7.24
-requests==2.32.3
-requests-toolbelt==1.0.0
-rich==13.7.1
-safetensors==0.4.3
-scikit-learn==1.5.1
-scipy==1.14.0
-sentence-transformers==3.0.1
-six==1.16.0
-sniffio==1.3.1
-SQLAlchemy==2.0.31
-sympy==1.13.1
-tenacity==8.5.0
-threadpoolctl==3.5.0
-tiktoken==0.7.0
-tokenizers==0.19.1
-torch==2.2.2
-tqdm==4.66.4
-transformers==4.42.4
-typing-inspect==0.9.0
-typing_extensions==4.12.2
-tzdata==2024.1
-urllib3==2.2.2
-wrapt==1.16.0
-xxhash==3.4.1
-yarl==1.9.4
-gradio[oauth]==4.44.1

+gradio
+requests
+pandas
+PyPDF2
+easyocr
+langchain
+langchain-community
+langgraph
+sentence-transformers
+huggingface_hub
+python-dotenv
+aiohttp
+nest-asyncio
+sympy
+openpyxl
+smolagents
+datasets

retriever.py ADDED

	@@ -0,0 +1,34 @@

+import datasets
+from langchain.docstore.document import Document
+from langchain_community.retrievers import BM25Retriever
+from smolagents import Tool
+def load_guest_dataset():
+    try:
+        guest_dataset = datasets.load_dataset("agents-course/unit3-invitees", split="train")
+        docs = [
+            Document(
+                page_content="\n".join([
+                    f"Name: {guest['name']}",
+                    f"Relation: {guest['relation']}",
+                    f"Description: {guest['description']}",
+                    f"Email: {guest['email']}"
+                ]),
+                metadata={"name": guest["name"]}
+            )
+            for guest in guest_dataset
+        ]
+    except Exception as e:
+        # Fallback mock dataset
+        docs = [
+            Document(
+                page_content="\n".join([
+                    "Name: Dr. Nikola Tesla",
+                    "Relation: old friend from university days",
+                    "Description: Dr. Nikola Tesla is an old friend from your university days. He's recently patented a new wireless energy transmission system.",
+                    "Email: nikola.tesla@gmail.com"
+                ]),
+                metadata={"name": "Dr. Nikola Tesla"}
+            )
+        ]
+    return docs

state_log.json ADDED

The diff for this file is too large to render. See raw diff

tools/__init__.py CHANGED

@@ -2,4 +2,8 @@ from .search import search_tool, multi_hop_search_tool
 from .file_parser import file_parser_tool
 from .image_parser import image_parser_tool
 from .calculator import calculator_tool
-from .document_retriever import document_retriever_tool

 from .file_parser import file_parser_tool
 from .image_parser import image_parser_tool
 from .calculator import calculator_tool
+from .document_retriever import document_retriever_tool
+from .duckduckgo_search import duckduckgo_search_tool
+from .weather_info import weather_info_tool
+from .hub_stats import hub_stats_tool
+from .guest_info import guest_info_retriever_tool

tools/calculator.py CHANGED

@@ -6,7 +6,7 @@ logger = logging.getLogger(__name__)
 @tool
 async def calculator_tool(expression: str) -> str:
-    """Evaluate a mathematical expression"""
     try:
         result = sympify(expression)
         return str(result)

 @tool
 async def calculator_tool(expression: str) -> str:
+    """Evaluate a mathematical expression."""
     try:
         result = sympify(expression)
         return str(result)
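
Review note: the dispatcher in `app.py` passes the full question text to `calculator_tool`, and `sympify` will generally not parse a natural-language sentence. One possible approach, sketched here under assumptions, is to pull the arithmetic part out of the question first. `extract_expression` is a hypothetical helper that is not part of this commit, and the regular expression only covers plain arithmetic with digits, parentheses, and the operators `+ - * / ^`.

```python
import re
from typing import Optional

# A run of digits, spaces, dots, operators, and parentheses that
# starts on a digit or "(" and ends on a digit or ")".
_EXPR = re.compile(r"[\d(][\d\s.+\-*/()^]*[\d)]")
_OPERATOR = re.compile(r"[+\-*/^]")


def extract_expression(question: str) -> Optional[str]:
    """Return the longest arithmetic-looking substring, or None (hypothetical helper)."""
    candidates = [m.group(0).strip() for m in _EXPR.finditer(question)]
    # Discard bare numbers such as years: an expression needs an operator
    candidates = [c for c in candidates if _OPERATOR.search(c)]
    if not candidates:
        return None
    return max(candidates, key=len)
```

When the helper returns `None`, the dispatcher could skip the calculator rather than log a parse error.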

tools/document_retriever.py CHANGED

@@ -7,7 +7,7 @@ logger = logging.getLogger(__name__)
 @tool
 async def document_retriever_tool(task_id: str, query: str, file_type: str) -> str:
-    """Retrieve content from a document"""
     try:
         file_path = f"temp_{task_id}.{file_type}"
         if not os.path.exists(file_path):

 @tool
 async def document_retriever_tool(task_id: str, query: str, file_type: str) -> str:
+    """Retrieve content from a document."""
     try:
         file_path = f"temp_{task_id}.{file_type}"
         if not os.path.exists(file_path):

tools/duckduckgo_search.py ADDED

	@@ -0,0 +1,6 @@

+from smolagents import Tool, DuckDuckGoSearchTool
+import logging
+logger = logging.getLogger(__name__)
+duckduckgo_search_tool = DuckDuckGoSearchTool()

tools/file_parser.py CHANGED

@@ -8,7 +8,7 @@ logger = logging.getLogger(__name__)
 @tool
 async def file_parser_tool(task_id: str, file_type: str) -> str:
-    """Parse a file based on task_id and file_type"""
     try:
         file_path = f"temp_{task_id}.{file_type}"
         if not os.path.exists(file_path):
@@ -26,6 +26,9 @@ async def file_parser_tool(task_id: str, file_type: str) -> str:
                 reader = PyPDF2.PdfReader(f)
                 text = "".join(page.extract_text() for page in reader.pages)
                 return text
         else:
             return f"Unsupported file type: {file_type}"
     except Exception as e:

 @tool
 async def file_parser_tool(task_id: str, file_type: str) -> str:
+    """Parse a file based on task_id and file_type."""
     try:
         file_path = f"temp_{task_id}.{file_type}"
         if not os.path.exists(file_path):
                 reader = PyPDF2.PdfReader(f)
                 text = "".join(page.extract_text() for page in reader.pages)
                 return text
+        elif file_type in ["xlsx", "xls"]:
+            df = pd.read_excel(file_path, engine="openpyxl")
+            return df.to_string()
         else:
             return f"Unsupported file type: {file_type}"
     except Exception as e:

tools/guest_info.py ADDED

	@@ -0,0 +1,20 @@

+from langchain_core.tools import tool
+from retriever import load_guest_dataset
+import logging
+logger = logging.getLogger(__name__)
+@tool
+async def guest_info_retriever_tool(query: str) -> str:
+    """Retrieve detailed information about gala guests based on their name or relation."""
+    try:
+        docs = load_guest_dataset()
+        from langchain_community.retrievers import BM25Retriever
+        retriever = BM25Retriever.from_documents(docs)
+        results = retriever.get_relevant_documents(query)
+        if results:
+            return "\n\n".join([doc.page_content for doc in results[:3]])
+        return "No matching guest information found."
+    except Exception as e:
+        logger.error(f"Error retrieving guest info for query '{query}': {e}")
+        return f"Error: {str(e)}"

tools/hub_stats.py ADDED

	@@ -0,0 +1,18 @@

+from langchain_core.tools import tool
+from huggingface_hub import list_models
+import logging
+logger = logging.getLogger(__name__)
+@tool
+async def hub_stats_tool(author: str) -> str:
+    """Fetch the most downloaded model from a specific author on Hugging Face Hub."""
+    try:
+        models = list(list_models(author=author, sort="downloads", direction=-1, limit=1))
+        if models:
+            model = models[0]
+            return f"The most downloaded model by {author} is {model.id} with {model.downloads:,} downloads."
+        return f"No models found for author {author}."
+    except Exception as e:
+        logger.error(f"Error fetching models for {author}: {e}")
+        return f"Error: {str(e)}"
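
Review note: in `app.py` the `author` argument for this tool (and the `location` for `weather_info_tool`) is found by testing the lower-cased question but splitting the original text, so a question such as "Most downloaded model By google?" passes the check and then fails on the split. A minimal sketch of a case-insensitive alternative follows; `word_after` is a hypothetical helper name, not something defined in this commit.

```python
import re
from typing import Optional


def word_after(question: str, marker: str) -> Optional[str]:
    """Return the first word following `marker`, ignoring case (hypothetical helper)."""
    # \b keeps "by" from matching inside words such as "nearby"
    pattern = r"\b" + re.escape(marker) + r"\s+([\w\-.]+)"
    match = re.search(pattern, question, flags=re.IGNORECASE)
    if match is None:
        return None
    # Drop a trailing full stop picked up at the end of a sentence
    return match.group(1).rstrip(".")
```

The caller would fall back to `"Unknown"`, as the dispatcher does today, when the helper returns `None`.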

tools/image_parser.py CHANGED

@@ -4,12 +4,11 @@ import logging
 import os
 logger = logging.getLogger(__name__)
 reader = easyocr.Reader(['en'])
 @tool
 async def image_parser_tool(file_path: str, task: str = "describe", match_query: str = "") -> str:
-    """Parse text from an image"""
     try:
         if not os.path.exists(file_path):
             logger.warning(f"Image not found: {file_path}")

 import os
 logger = logging.getLogger(__name__)
 reader = easyocr.Reader(['en'])
 @tool
 async def image_parser_tool(file_path: str, task: str = "describe", match_query: str = "") -> str:
+    """Parse text from an image."""
     try:
         if not os.path.exists(file_path):
             logger.warning(f"Image not found: {file_path}")

tools/retriever.py DELETED

@@ -1,80 +0,0 @@
-from langchain.text_splitter import RecursiveCharacterTextSplitter
-from sentence_transformers import SentenceTransformer
-import numpy as np
-import pandas as pd
-import PyPDF2
-import os
-from typing import List, Dict
-class DocumentRetrieverTool:
-    def __init__(self):
-        self.name = "document_retriever"
-        self.description = "Retrieves relevant text from GAIA text-heavy files (CSV, TXT, PDF) using semantic search."
-        self.inputs = {
-            "task_id": {"type": "string", "description": "GAIA task ID for the file"},
-            "query": {"type": "string", "description": "Question or query to search for"},
-            "file_type": {"type": "string", "description": "File type (csv, txt, pdf, default: txt)"}
-        }
-        self.output_type = str
-        self.embedder = SentenceTransformer("all-MiniLM-L6-v2")
-        self.text_splitter = RecursiveCharacterTextSplitter(
-            chunk_size=500,
-            chunk_overlap=50,
-            length_function=len
-        )
-        self.chunks: List[str] = []
-        self.embeddings: np.ndarray = None
-    async def aparse(self, task_id: str, query: str, file_type: str = "txt") -> str:
-        """
-        Loads a GAIA file, splits it into chunks, embeds them, and retrieves relevant text for the query.
-        Supports CSV, TXT, and PDF files.
-        """
-        try:
-            file_path = f"temp_{task_id}.{file_type}"
-            if not os.path.exists(file_path):
-                return f"File not found for task ID {task_id}"
-            # Load and preprocess file
-            text = ""
-            if file_type == "csv":
-                df = pd.read_csv(file_path)
-                text = df.to_string()
-            elif file_type == "txt":
-                with open(file_path, "r", encoding="utf-8") as f:
-                    text = f.read()
-            elif file_type == "pdf":
-                with open(file_path, "rb") as f:
-                    pdf = PyPDF2.PdfReader(f)
-                    text = "".join(page.extract_text() or "" for page in pdf.pages)
-            else:
-                return f"Unsupported file type: {file_type}"
-            # Check if text was extracted
-            if not text.strip():
-                return "No extractable text found in file."
-            # Split text into chunks
-            self.chunks = self.text_splitter.split_text(text)
-            if not self.chunks:
-                return "No content found in file."
-            # Embed chunks and query
-            self.embeddings = self.embedder.encode(self.chunks, convert_to_tensor=True)
-            query_embedding = self.embedder.encode(query, convert_to_tensor=True)
-            # Compute cosine similarities
-            from sentence_transformers import util
-            similarities = util.cos_sim(query_embedding, self.embeddings)[0]
-            # Get top 3 most relevant chunks
-            top_k = min(3, len(self.chunks))
-            top_indices = similarities.argsort(descending=True)[:top_k]
-            relevant_chunks = [self.chunks[idx] for idx in top_indices]
-            # Combine results
-            return "\n\n".join(relevant_chunks)
-        except Exception as e:
-            return f"Error retrieving documents: {str(e)}"
-document_retriever_tool = DocumentRetrieverTool()
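Reviewer note: the removed class ranked chunks by cosine similarity and kept the top three. A dependency-free sketch of that ranking step is shown here for reference. It assumes the embeddings are plain lists of floats; `cosine` and `top_k_chunks` are hypothetical names, and this is not the code that replaces the deleted file.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    chunks: Sequence[str],
    k: int = 3,
) -> List[str]:
    """Return up to k chunks, most similar to the query first."""
    scores = [cosine(query_vec, vec) for vec in chunk_vecs]
    # Sort chunk indices by score, highest first, then keep at most k of them.
    order = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)
    return [chunks[i] for i in order[: min(k, len(chunks))]]


# Toy 2-D "embeddings" chosen by hand for illustration only.
chunks = ["cats", "dogs", "cars"]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
print(top_k_chunks([1.0, 0.0], vectors, chunks, k=2))
```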

tools/search.py CHANGED

@@ -1,91 +1,46 @@
 from langchain_core.tools import tool
-from langchain_huggingface import HuggingFacePipeline
-from sentence_transformers import SentenceTransformer
 import logging
-from typing import List, Dict, Any
 import requests
 import os
 logger = logging.getLogger(__name__)
-# Initialize embedding model (free, open-source)
-try:
-    embedder = SentenceTransformer("all-MiniLM-L6-v2")
-except Exception as e:
-    logger.error(f"Failed to initialize embedding model: {e}")
-    embedder = None
-# Global LLM instance
-search_llm = None
-def initialize_search_tools(llm: HuggingFacePipeline) -> None:
-    """Initialize search tools with the provided LLM"""
-    global search_llm
-    search_llm = llm
-    logger.info("Search tools initialized with HuggingFace LLM")
 @tool
 async def search_tool(query: str) -> List[Dict[str, Any]]:
-    """Perform a web search using the query"""
     try:
-        if not search_llm:
-            logger.warning("Search LLM not initialized")
-            return [{"content": "Search unavailable", "url": ""}]
-        # Refine query using LLM
-        prompt = f"Refine this search query for better results: {query}"
-        response = await search_llm.ainvoke(prompt)
-        refined_query = response.content.strip()
-        # Check for SerpAPI key (free tier available)
         serpapi_key = os.getenv("SERPAPI_API_KEY")
-        if serpapi_key:
-            try:
-                params = {"q": refined_query, "api_key": serpapi_key}
-                response = requests.get("https://serpapi.com/search", params=params)
-                response.raise_for_status()
-                results = response.json().get("organic_results", [])
-                return [{"content": r.get("snippet", ""), "url": r.get("link", "")} for r in results]
-            except Exception as e:
-                logger.warning(f"SerpAPI failed: {e}, falling back to mock search")
-        # Mock search if no API key or API fails
-        if embedder:
-            query_embedding = embedder.encode(refined_query)
-            results = [
-                {"content": f"Mock result for {refined_query}", "url": "https://example.com"},
-                {"content": f"Another mock result for {refined_query}", "url": "https://example.org"}
-            ]
-        else:
-            results = [{"content": "Embedding model unavailable", "url": ""}]
-        logger.info(f"Search results for query '{refined_query}': {len(results)} items")
-        return results
     except Exception as e:
         logger.error(f"Error in search_tool: {e}")
         return [{"content": f"Search failed: {str(e)}", "url": ""}]
 @tool
 async def multi_hop_search_tool(query: str, steps: int = 3) -> List[Dict[str, Any]]:
-    """Perform a multi-hop search by iteratively refining the query"""
     try:
-        if not search_llm:
-            logger.warning("Search LLM not initialized")
-            return [{"content": "Multi-hop search unavailable", "url": ""}]
         results = []
         current_query = query
         for step in range(steps):
-            prompt = f"Based on the query '{current_query}', generate a follow-up question to deepen the search."
-            response = await search_llm.ainvoke(prompt)
-            next_query = response.content.strip()
-            step_results = await search_tool.invoke({"query": next_query})
             results.extend(step_results)
-            current_query = next_query
-            logger.info(f"Multi-hop step {step + 1}: {next_query}")
-        return results
     except Exception as e:
         logger.error(f"Error in multi_hop_search_tool: {e}")
         return [{"content": f"Multi-hop search failed: {str(e)}", "url": ""}]

 from langchain_core.tools import tool
 import logging
 import requests
 import os
+from typing import List, Dict, Any
+from dotenv import load_dotenv
+import asyncio
 logger = logging.getLogger(__name__)
+load_dotenv()
 @tool
 async def search_tool(query: str) -> List[Dict[str, Any]]:
+    """Perform a web search using SERPAPI."""
     try:
         serpapi_key = os.getenv("SERPAPI_API_KEY")
+        if not serpapi_key:
+            logger.error("SERPAPI_API_KEY not set")
+            return [{"content": "Search unavailable: API key missing", "url": ""}]
+        params = {"q": query, "api_key": serpapi_key}
+        response = requests.get("https://serpapi.com/search", params=params, timeout=10)
+        response.raise_for_status()
+        results = response.json().get("organic_results", [])
+        logger.info(f"Search results for query '{query}': {len(results)} items")
+        search_results = [{"content": r.get("snippet", ""), "url": r.get("link", "")} for r in results]
+        return search_results or [{"content": "No search results", "url": ""}]
     except Exception as e:
         logger.error(f"Error in search_tool: {e}")
         return [{"content": f"Search failed: {str(e)}", "url": ""}]
 @tool
 async def multi_hop_search_tool(query: str, steps: int = 3) -> List[Dict[str, Any]]:
+    """Perform a multi-hop search."""
     try:
         results = []
         current_query = query
         for step in range(steps):
+            step_results = await search_tool.ainvoke({"query": current_query})
             results.extend(step_results)
+            current_query = f"{current_query} more details"
+            logger.info(f"Multi-hop step {step + 1}: {current_query}")
+            await asyncio.sleep(2)  # Avoid rate limits
+        return results or [{"content": "No multi-hop results", "url": ""}]
     except Exception as e:
         logger.error(f"Error in multi_hop_search_tool: {e}")
         return [{"content": f"Multi-hop search failed: {str(e)}", "url": ""}]
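Reviewer note: the multi-hop loop and the result mapping can be exercised offline by injecting the search coroutine. The sketch below is a minimal version under that assumption; `normalize_results`, `multi_hop`, `fake_search`, and the `delay` parameter are hypothetical names introduced for illustration, and no real SerpAPI request is made.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

SearchFn = Callable[[str], Awaitable[List[Dict[str, Any]]]]


def normalize_results(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Map a SerpAPI-style payload to the content/url dicts used in the diff."""
    results = payload.get("organic_results", [])
    return [{"content": r.get("snippet", ""), "url": r.get("link", "")} for r in results]


async def multi_hop(
    query: str, search: SearchFn, steps: int = 3, delay: float = 2.0
) -> List[Dict[str, Any]]:
    """Run `steps` searches, appending " more details" to the query after each one."""
    results: List[Dict[str, Any]] = []
    current = query
    for _ in range(steps):
        results.extend(await search(current))
        current = f"{current} more details"
        await asyncio.sleep(delay)  # Pause between hops, as the diff does to avoid rate limits.
    return results or [{"content": "No multi-hop results", "url": ""}]


async def fake_search(q: str) -> List[Dict[str, Any]]:
    """Stand-in for the real tool: echoes the query back as a single result."""
    payload = {"organic_results": [{"snippet": f"about {q}", "link": "https://example.com"}]}
    return normalize_results(payload)


print(asyncio.run(multi_hop("gaia", fake_search, steps=2, delay=0)))
```

Passing the search function in as an argument keeps the loop independent of how the LangChain tool is invoked, which makes the query-rewriting behavior easy to verify.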

tools/weather_info.py ADDED

@@ -0,0 +1,28 @@

+from langchain_core.tools import tool
+import requests
+import logging
+import os
+from dotenv import load_dotenv
+logger = logging.getLogger(__name__)
+load_dotenv()
+@tool
+async def weather_info_tool(location: str) -> str:
+    """Fetch real weather information for a given location."""
+    try:
+        api_key = os.getenv("OPENWEATHERMAP_API_KEY")
+        if not api_key:
+            logger.error("OPENWEATHERMAP_API_KEY not set")
+            return "Weather unavailable: API key missing"
+        url = f"http://api.openweathermap.org/data/2.5/weather?q={location}&appid={api_key}&units=metric"
+        response = requests.get(url, timeout=10).json()
+        if response.get("cod") == 200:
+            condition = response["weather"][0]["description"]
+            temp = response["main"]["temp"]
+            return f"Weather in {location}: {condition}, {temp}°C"
+        return f"Unable to fetch weather for {location}."
+    except Exception as e:
+        logger.error(f"Error fetching weather for {location}: {e}")
+        return f"Error: {str(e)}"
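Reviewer note: interpolating `location` straight into the URL leaves characters such as `&` unescaped. A minimal sketch of building the query string with `urllib.parse.urlencode` and formatting the reply separately follows. `build_weather_url` and `format_weather` are hypothetical helper names, the HTTPS base URL is an assumption, and the sample payload is made up for illustration.

```python
from typing import Any, Dict
from urllib.parse import urlencode

# Assumption: the same endpoint as in the diff, reached over HTTPS.
BASE_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_weather_url(location: str, api_key: str) -> str:
    """Build the request URL with the location safely percent-encoded."""
    query = urlencode({"q": location, "appid": api_key, "units": "metric"})
    return f"{BASE_URL}?{query}"


def format_weather(location: str, payload: Dict[str, Any]) -> str:
    """Turn a decoded JSON payload into the reply string used in the diff."""
    if payload.get("cod") == 200:
        condition = payload["weather"][0]["description"]
        temp = payload["main"]["temp"]
        return f"Weather in {location}: {condition}, {temp}°C"
    return f"Unable to fetch weather for {location}."


# Made-up payload shaped like the fields the tool reads.
sample = {"cod": 200, "weather": [{"description": "clear sky"}], "main": {"temp": 21.5}}
print(build_weather_url("New York", "KEY"))
print(format_weather("Paris", sample))
```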