Commit 4be2bb5 · Kunal Pai committed · Parent(s): f3a6d23

Python file + README cleanup

Files changed:
- README.md (+28 -66)
- src/cost_benefit.py (+0 -48)
- src/models_cost.py (+0 -114)
- src/test_env.py (+0 -61)
- src/testing_cost.py (+0 -15)
README.md CHANGED

```diff
@@ -1,82 +1,44 @@
 # HASHIRU: Hierarchical Agent System for Hybrid Intelligent Resource Utilization
-(For AgentX competition)
 
 
 
-## Overview
-HASHIRU is an agent-based framework designed to dynamically allocate and manage large language models (LLMs) and external APIs through a CEO model. The CEO model acts as a central manager, capable of hiring, firing, and directing multiple specialized agents (employees) over a given budget. It can also create and utilize external APIs as needed, making it highly flexible and scalable.
-
-2. Instantiates a Gemini-powered CEO using GeminiManager
-3. Wraps the user prompt into a structured Content object
-4. Calls Gemini with the prompt
-5. Executes external tool calls if needed
-6. Returns a full response to the user
-7. Can ask the user for clarification if needed
-
-##
-- **Cost-Benefit Matrix**: Selects the best LLM model (LLaMA, Mixtral, Gemini, DeepSeek, etc.) for any task using Ollama, based on latency, size, cost, quality, and speed.
-- **Dynamic Agent Management**: The CEO agent dynamically hires and fires specialized agents based on task requirements and budget constraints.
-- **API Integration**: Seamlessly integrates external APIs for enhanced functionality and scalability.
-
-```bash
-git clone <repository-url>
-cd HASHIRU
-pip install -r requirements.txt
-```
-
-Run the main script:
-
-```bash
-python main.py
-```
+## Project Overview
+
+This project provides a framework for creating and managing AI agents and tools. It includes features for managing resource and expense budgets, loading tools and agents, and interacting with various language models.
+
+## Directory Structure
+
+* **src/**: Contains the source code for the project.
+    * **tools/**: Contains the code for the tools that can be used by the agents.
+        * **default\_tools/**: Contains the default tools provided with the project.
+        * **user\_tools/**: Contains the tools created by the user.
+    * **config/**: Contains configuration files for the project.
+    * **utils/**: Contains utility functions and classes used throughout the project.
+    * **models/**: Contains the configurations and system prompts for the agents. Includes `models.json` which stores agent definitions.
+    * **manager/**: Contains the core logic for managing agents, tools, and budgets.
+        * `agent_manager.py`: Manages the creation, deletion, and invocation of AI agents. Supports different agent types like Ollama, Gemini, and Groq.
+        * `budget_manager.py`: Manages the resource and expense budgets for the project.
+        * `tool_manager.py`: Manages the loading, running, and deletion of tools.
+        * `llm_models.py`: Defines abstract base classes for different language model integrations.
+    * **data/**: Contains data files, such as memory and secret words.
+
+## Key Components
+
+* **Agent Management:** The `AgentManager` class in `src/manager/agent_manager.py` is responsible for creating, managing, and invoking AI agents. It supports different agent types, including local (Ollama) and cloud-based (Gemini, Groq) models.
+* **Tool Management:** The `ToolManager` class in `src/manager/tool_manager.py` handles the loading and running of tools. Tools are loaded from the `src/tools/default_tools` and `src/tools/user_tools` directories.
+* **Budget Management:** The `BudgetManager` class in `src/manager/budget_manager.py` manages the resource and expense budgets for the project. It tracks the usage of resources and expenses and enforces budget limits.
+* **Model Integration:** The project supports integration with various language models, including Ollama, Gemini, and Groq. The `llm_models.py` file defines abstract base classes for these integrations.
 
 ## Usage
 
-The `cost_benefit.py` tool allows you to select the best LLM model for a given task based on customizable weights:
-
-```bash
-python tools/cost_benefit.py \
-    --prompt "Best places to visit in Davis" \
-    --latency 4 --size 2 --cost 5 --speed 3
-```
-
-Each weight is on a scale of **1** (least important) to **5** (most important):
-
-### Example Output
-
-```plaintext
-Selected Model: Gemini
-Reason: Optimal balance of cost, speed, and latency for the given weights.
-```
+To use the project, you need to:
+
+1. Configure the budget in `src/manager/budget_manager.py`.
+2. Create tools and place them in the `src/tools/default_tools` or `src/tools/user_tools` directories.
+3. Create agents using the `AgentCreator` tool or the `AgentManager` class.
+4. Invoke agents using the `AskAgent` tool or the `AgentManager` class.
+5. Manage tools and agents using the `ToolManager` and `AgentManager` classes.
 
 ## Contributing
 
-Contributions are welcome! Please
-
-1. Fork the repository.
-2. Create a new branch for your feature or bug fix.
-3. Submit a pull request with a detailed description of your changes.
-
-## License
-
-This project is licensed under the MIT License. See the `LICENSE` file for details.
+Contributions are welcome! Please submit pull requests with bug fixes, new features, or improvements to the documentation.
```
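The new Usage section describes the workflow only in prose. A minimal sketch of steps 3 and 4 follows; note that the `create_agent` and `ask_agent` method names and their arguments are assumptions for illustration, since this commit does not show `AgentManager`'s actual interface:

```python
# Hypothetical sketch of README usage steps 3-4. The method names and
# signatures below are assumptions, not taken from this commit.
from src.manager.agent_manager import AgentManager

manager = AgentManager()

# Step 3: create a specialized agent (the README offers either the
# AgentCreator tool or the AgentManager class; this uses the class).
manager.create_agent(
    agent_name="summarizer",
    base_model="llama3.2",  # assumed local Ollama model name
    system_prompt="You summarize documents concisely.",
)

# Step 4: invoke the agent (AskAgent tool or AgentManager class).
reply = manager.ask_agent("summarizer", "Summarize the HASHIRU README.")
print(reply)
```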
src/cost_benefit.py DELETED

```python
import argparse
import subprocess
import time
import requests

def detect_available_budget(runtime_env: str) -> int:
    """
    Return an approximate VRAM-based budget (MB) when running locally,
    else default to 100.
    """
    import torch
    if "local" in runtime_env and torch.cuda.is_available():
        total_vram_mb = torch.cuda.get_device_properties(0).total_memory // (1024 ** 2)
        return min(total_vram_mb, 100)
    return 100

def get_best_model(runtime_env: str, *, use_local_only: bool = False, use_api_only: bool = False) -> dict:
    """
    Pick the fastest model that fits in the detected budget while
    respecting the locality filters.
    """
    static_costs = {
        "llama3.2": {"size": 20, "token_cost": 0.0001, "tokens_sec": 30, "type": "local"},
        "mistral": {"size": 40, "token_cost": 0.0002, "tokens_sec": 50, "type": "local"},
        "gemini-2.0-flash": {"size": 60, "token_cost": 0.0005, "tokens_sec": 60, "type": "api"},
        "gemini-2.5-pro-preview-03-25": {"size": 80, "token_cost": 0.002, "tokens_sec": 45, "type": "api"},
    }

    budget = detect_available_budget(runtime_env)
    best_model, best_speed = None, -1

    for model, info in static_costs.items():
        if info["size"] > budget:
            continue
        if use_local_only and info["type"] != "local":
            continue
        if use_api_only and info["type"] != "api":
            continue
        if info["tokens_sec"] > best_speed:
            best_model, best_speed = model, info["tokens_sec"]

    chosen = best_model or "llama3.2"  # sensible default
    return {
        "model": chosen,
        "token_cost": static_costs[chosen]["token_cost"],
        "tokens_sec": static_costs[chosen]["tokens_sec"],
        "note": None if best_model else "Defaulted because no model met the constraints",
    }
```
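For reference, a short usage sketch of the deleted selector, since the file is now gone from the tree. The `"local"` runtime string is inferred from the `"local" in runtime_env` check above, the import path assumes running from `src/`, and `torch` must be importable for the local branch:

```python
from cost_benefit import get_best_model  # assumed path for the old src/cost_benefit.py

# The detected budget tops out at 100, which admits all four entries
# (sizes 20-80); the fastest is gemini-2.0-flash at 60 tokens/sec.
print(get_best_model("local"))

# Filtering to local models picks mistral (50 tokens/sec beats llama3.2's 30).
print(get_best_model("local", use_local_only=True))
```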
src/models_cost.py DELETED

```python
from dataclasses import dataclass
from typing import Dict
from manager.utils.runtime_selector import detect_runtime_environment

@dataclass
class ModelInfo:
    name: str
    size: float
    tokens_sec: int
    type: str
    description: str
    create_cost: int = 0
    invoke_cost: int = 0

class ModelRegistry:
    def __init__(self):
        self.env = detect_runtime_environment()
        self.models = self._build_model_registry()

    def estimate_create_cost(self, size: float, is_api: bool) -> int:
        return int(size * (10 if is_api else 5))

    def estimate_invoke_cost(self, tokens_sec: int, is_api: bool) -> int:
        base_cost = 40 if is_api else 20
        return base_cost + max(0, 60 - tokens_sec)

    def _build_model_registry(self) -> Dict[str, ModelInfo]:
        raw_models = {
            "llama3.2": {
                "size": 3,
                "tokens_sec": 30,
                "type": "local",
                "description": "3B lightweight local model"
            },
            "mistral": {
                "size": 7,
                "tokens_sec": 50,
                "type": "local",
                "description": "7B stronger local model"
            },
            "gemini-2.0-flash": {
                "size": 6,
                "tokens_sec": 170,
                "type": "api",
                "description": "Fast and efficient API model"
            },
            "gemini-2.5-pro-preview-03-25": {
                "size": 10,
                "tokens_sec": 148,
                "type": "api",
                "description": "High-reasoning API model"
            },
            "gemini-1.5-flash": {
                "size": 7,
                "tokens_sec": 190,
                "type": "api",
                "description": "Fast general-purpose model"
            },
            "gemini-2.0-flash-lite": {
                "size": 5,
                "tokens_sec": 208,
                "type": "api",
                "description": "Low-latency, cost-efficient API model"
            },
            "gemini-2.0-flash-live-001": {
                "size": 9,
                "tokens_sec": 190,
                "type": "api",
                "description": "Voice/video low-latency API model"
            }
        }

        models = {}
        for name, model in raw_models.items():
            is_api = model["type"] == "api"

            if is_api:
                # Flat cost for all API models
                create_cost, invoke_cost = 20, 50
            else:
                create_cost = self.estimate_create_cost(model["size"], is_api=False)
                invoke_cost = self.estimate_invoke_cost(model["tokens_sec"], is_api=False)

            models[name] = ModelInfo(
                name=name,
                size=model["size"],
                tokens_sec=model["tokens_sec"],
                type=model["type"],
                description=model["description"],
                create_cost=create_cost,
                invoke_cost=invoke_cost
            )
        return models

    def get_filtered_models(self) -> Dict[str, ModelInfo]:
        """Return only models that match the current runtime."""
        if self.env in ["gpu", "cpu-local"]:
            return {k: v for k, v in self.models.items() if v.type == "local"}
        else:
            return {k: v for k, v in self.models.items() if v.type == "api"}

    def get_all_models(self) -> Dict[str, ModelInfo]:
        """Return all models regardless of runtime."""
        return self.models


if __name__ == "__main__":
    registry = ModelRegistry()
    print(f"[INFO] Detected runtime: {registry.env}\n")

    print("Filtered models based on environment:")
    for name, model in registry.get_filtered_models().items():
        print(f"{name}: create={model.create_cost}, invoke={model.invoke_cost}, type={model.type}")
```
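The deleted cost formulas are easiest to check with concrete numbers. A worked example for the `mistral` registry entry (size 7, 50 tokens/sec, local), assuming the `manager.utils.runtime_selector` import above resolves:

```python
# create_cost for local models: int(size * 5)  -> int(7 * 5)      = 35
# invoke_cost for local models: 20 + max(0, 60 - tokens_sec)
#                                              -> 20 + (60 - 50)  = 30
# (API models bypass both formulas and get the flat pair (20, 50).)
registry = ModelRegistry()
mistral = registry.get_all_models()["mistral"]
assert (mistral.create_cost, mistral.invoke_cost) == (35, 30)
```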
src/test_env.py DELETED

```python
import os
from dotenv import load_dotenv
load_dotenv()
from src.manager import GeminiManager
from src.tool_manager import ToolManager
import gradio as gr
import base64

_logo_bytes = open("HASHIRU_LOGO.png", "rb").read()
_logo_b64 = base64.b64encode(_logo_bytes).decode()
_header_html = f"""
<div style="
    display: flex;
    flex-direction: column;
    align-items: center;
    padding-right: 24px;
">
    <img src="data:image/png;base64,{_logo_b64}" width="20" height="20" />
    <span style="margin-top: 8px; font-size: 20px; font-weight: bold; color: white;">
        HASHIRU AI - Runtime Test
    </span>
</div>
"""

# -------------------------------
# ToolManager Agent Creation
# -------------------------------
def create_agent_callback():
    print("\n[INFO] Creating agent using ToolManager...")
    manager = ToolManager()

    response = manager.runTool("AgentCreator", {
        "agent_name": "ui-runtime-agent",
        "system_prompt": "You answer questions.",
        "description": "Agent created via ToolManager",
        # No base_model passed — will trigger dynamic selection
    })

    print("[TOOL RESPONSE]", response)
    return response["message"]

# -------------------------------
# Gradio UI
# -------------------------------
if __name__ == "__main__":
    css = """
    #title-row { background: #2c2c2c; border-radius: 8px; padding: 8px; }
    """
    with gr.Blocks(css=css, fill_width=True, fill_height=True) as demo:
        with gr.Column():
            gr.HTML(_header_html)
            agent_create_button = gr.Button("🧪 Create Agent via ToolManager")
            result_output = gr.Textbox(label="Tool Response")

            agent_create_button.click(
                fn=create_agent_callback,
                inputs=[],
                outputs=[result_output]
            )

    demo.launch()
```
src/testing_cost.py DELETED

```python
# In tool_manager.py
from default_tools.test_cost.agent_creator_tool import AgentCreator

def test_agent_creation():
    creator = AgentCreator()
    response = creator.run(
        agent_name="costbenefit-test-agent",
        system_prompt="You are an expert assistant helping with cost-based model selection.",
        description="Tests agent creation using cost-benefit logic."
    )
    print("\nTest Output:")
    print(response)

if __name__ == "__main__":
    test_agent_creation()
```