Kunal Pai committed
Commit 4be2bb5 · 1 Parent(s): f3a6d23

Python file + README cleanup

Files changed (5)
  1. README.md +28 -66
  2. src/cost_benefit.py +0 -48
  3. src/models_cost.py +0 -114
  4. src/test_env.py +0 -61
  5. src/testing_cost.py +0 -15
README.md CHANGED
@@ -1,82 +1,44 @@
  # HASHIRU: Hierarchical Agent System for Hybrid Intelligent Resource Utilization
- (For AgentX competition)

  ![HASHIRU_ARCH](HASHIRU_ARCH.png)

- ## Overview
- HASHIRU is an agent-based framework designed to dynamically allocate and manage large language models (LLMs) and external APIs through a CEO model. The CEO model acts as a central manager, capable of hiring, firing, and directing multiple specialized agents (employees) within a given budget. It can also create and use external APIs as needed, making it highly flexible and scalable.
-
- ## High-Level Overview
-
- 1. Loads available tools using ToolLoader
- 2. Instantiates a Gemini-powered CEO using GeminiManager
- 3. Wraps the user prompt into a structured Content object
- 4. Calls Gemini with the prompt
- 5. Executes external tool calls if needed
- 6. Returns a full response to the user
- 7. Can ask the user for clarification if needed
-
- After every step in the request, there is a checkpoint that records what happened in that step (i.e., which tool was called and what the response was). This is useful for debugging and understanding the flow of the program.
-
- ## Features
- - **Cost-Benefit Matrix**:
-   Selects the best LLM (LLaMA, Mixtral, Gemini, DeepSeek, etc.) for any task using Ollama, based on latency, size, cost, quality, and speed.
- - **Dynamic Agent Management**:
-   The CEO agent dynamically hires and fires specialized agents based on task requirements and budget constraints.
- - **API Integration**:
-   Seamlessly integrates external APIs for enhanced functionality and scalability.
-
- ## How to Run
-
- Clone the repo and install the required dependencies:
-
- ```bash
- git clone <repository-url>
- cd HASHIRU
- pip install -r requirements.txt
- ```
-
- Run the main script:
-
- ```bash
- python main.py
- ```

  ## Usage

- ### Cost-Benefit Matrix
-
- The `cost_benefit.py` tool lets you select the best LLM for a given task based on customizable weights:
-
- ```bash
- python tools/cost_benefit.py \
-     --prompt "Best places to visit in Davis" \
-     --latency 4 --size 2 --cost 5 --speed 3
- ```
-
- Each weight is on a scale of **1** (least important) to **5** (most important):
-
- - `--latency`: Prefer faster responses (lower time to answer)
- - `--size`: Prefer smaller models (use less memory/resources)
- - `--cost`: Prefer cheaper responses (fewer tokens, lower token price)
- - `--speed`: Prefer models that generate tokens quickly (tokens/sec)
-
- ### Example Output
-
- ```plaintext
- Selected Model: Gemini
- Reason: Optimal balance of cost, speed, and latency for the given weights.
- ```

  ## Contributing

- Contributions are welcome! Please follow these steps:
-
- 1. Fork the repository.
- 2. Create a new branch for your feature or bug fix.
- 3. Submit a pull request with a detailed description of your changes.
-
- ## License
-
- This project is licensed under the MIT License. See the `LICENSE` file for details.

  # HASHIRU: Hierarchical Agent System for Hybrid Intelligent Resource Utilization

  ![HASHIRU_ARCH](HASHIRU_ARCH.png)

+ ## Project Overview
+
+ This project provides a framework for creating and managing AI agents and tools. It includes features for managing resource and expense budgets, loading tools and agents, and interacting with various language models.
+
+ ## Directory Structure
+
+ * **src/**: Contains the source code for the project.
+   * **tools/**: Contains the code for the tools that can be used by the agents.
+     * **default_tools/**: Contains the default tools provided with the project.
+     * **user_tools/**: Contains the tools created by the user.
+   * **config/**: Contains configuration files for the project.
+   * **utils/**: Contains utility functions and classes used throughout the project.
+   * **models/**: Contains the configurations and system prompts for the agents, including `models.json`, which stores agent definitions.
+   * **manager/**: Contains the core logic for managing agents, tools, and budgets.
+     * `agent_manager.py`: Manages the creation, deletion, and invocation of AI agents. Supports different agent types such as Ollama, Gemini, and Groq.
+     * `budget_manager.py`: Manages the resource and expense budgets for the project.
+     * `tool_manager.py`: Manages the loading, running, and deletion of tools.
+     * `llm_models.py`: Defines abstract base classes for the different language model integrations.
+   * **data/**: Contains data files, such as memory and secret words.
+
+ ## Key Components
+
+ * **Agent Management:** The `AgentManager` class in `src/manager/agent_manager.py` is responsible for creating, managing, and invoking AI agents. It supports different agent types, including local (Ollama) and cloud-based (Gemini, Groq) models. A usage sketch follows this list.
+ * **Tool Management:** The `ToolManager` class in `src/manager/tool_manager.py` handles the loading and running of tools. Tools are loaded from the `src/tools/default_tools` and `src/tools/user_tools` directories.
+ * **Budget Management:** The `BudgetManager` class in `src/manager/budget_manager.py` manages the resource and expense budgets for the project. It tracks resource and expense usage and enforces budget limits.
+ * **Model Integration:** The project supports integration with various language models, including Ollama, Gemini, and Groq. The `llm_models.py` file defines abstract base classes for these integrations.
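+
+ A minimal sketch of creating an agent through the `AgentCreator` tool (adapted from the removed `src/test_env.py`; the import path follows the layout above and may differ in your checkout):
+
+ ```python
+ from src.manager.tool_manager import ToolManager
+
+ # The manager loads tools from default_tools/ and user_tools/.
+ manager = ToolManager()
+
+ # Create an agent; omitting base_model lets the cost-benefit
+ # logic pick a model dynamically.
+ response = manager.runTool("AgentCreator", {
+     "agent_name": "readme-demo-agent",
+     "system_prompt": "You answer questions.",
+     "description": "Agent created via ToolManager",
+ })
+ print(response["message"])
+ ```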
 
  ## Usage

+ To use the project, you need to:
+
+ 1. Configure the budget in `src/manager/budget_manager.py`.
+ 2. Create tools and place them in the `src/tools/default_tools` or `src/tools/user_tools` directories.
+ 3. Create agents using the `AgentCreator` tool or the `AgentManager` class.
+ 4. Invoke agents using the `AskAgent` tool or the `AgentManager` class (see the sketch after this list).
+ 5. Manage tools and agents using the `ToolManager` and `AgentManager` classes.
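+
+ A minimal sketch of step 4 (the `AskAgent` parameter names here are assumptions for illustration; check the tool's schema for the exact keys):
+
+ ```python
+ from src.manager.tool_manager import ToolManager
+
+ manager = ToolManager()
+
+ # Ask a previously created agent a question. "agent_name" and
+ # "prompt" are illustrative parameter names, not confirmed API.
+ response = manager.runTool("AskAgent", {
+     "agent_name": "readme-demo-agent",
+     "prompt": "Summarize what HASHIRU does in one sentence.",
+ })
+ print(response)
+ ```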
  ## Contributing

+ Contributions are welcome! Please submit pull requests with bug fixes, new features, or improvements to the documentation.
src/cost_benefit.py DELETED
@@ -1,48 +0,0 @@
- import argparse
- import subprocess
- import time
- import requests
-
- def detect_available_budget(runtime_env: str) -> int:
-     """
-     Return an approximate VRAM-based budget (MB) when running locally,
-     else default to 100.
-     """
-     import torch
-     if "local" in runtime_env and torch.cuda.is_available():
-         total_vram_mb = torch.cuda.get_device_properties(0).total_memory // (1024 ** 2)
-         return min(total_vram_mb, 100)
-     return 100
-
- def get_best_model(runtime_env: str, *, use_local_only: bool = False, use_api_only: bool = False) -> dict:
-     """
-     Pick the fastest model that fits in the detected budget while
-     respecting the locality filters.
-     """
-     static_costs = {
-         "llama3.2": {"size": 20, "token_cost": 0.0001, "tokens_sec": 30, "type": "local"},
-         "mistral": {"size": 40, "token_cost": 0.0002, "tokens_sec": 50, "type": "local"},
-         "gemini-2.0-flash": {"size": 60, "token_cost": 0.0005, "tokens_sec": 60, "type": "api"},
-         "gemini-2.5-pro-preview-03-25": {"size": 80, "token_cost": 0.002, "tokens_sec": 45, "type": "api"},
-     }
-
-     budget = detect_available_budget(runtime_env)
-     best_model, best_speed = None, -1
-
-     for model, info in static_costs.items():
-         if info["size"] > budget:
-             continue
-         if use_local_only and info["type"] != "local":
-             continue
-         if use_api_only and info["type"] != "api":
-             continue
-         if info["tokens_sec"] > best_speed:
-             best_model, best_speed = model, info["tokens_sec"]
-
-     chosen = best_model or "llama3.2"  # sensible default
-     return {
-         "model": chosen,
-         "token_cost": static_costs[chosen]["token_cost"],
-         "tokens_sec": static_costs[chosen]["tokens_sec"],
-         "note": None if best_model else "Defaulted because no model met the constraints",
-     }
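
For reference, the removed `get_best_model` helper was self-contained and could be exercised directly. A minimal sketch (requires `torch`; the `runtime_env` string is illustrative, and any value containing "local" enables the VRAM check):

```python
from cost_benefit import get_best_model

# With a CUDA device visible, the budget is min(VRAM in MB, 100);
# otherwise it defaults to 100, so every model in static_costs fits.
# Under use_local_only, mistral wins (50 tokens/sec vs. 30).
choice = get_best_model("local-gpu", use_local_only=True)
print(choice["model"], choice["tokens_sec"])
```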
src/models_cost.py DELETED
@@ -1,114 +0,0 @@
- from dataclasses import dataclass
- from typing import Dict
- from manager.utils.runtime_selector import detect_runtime_environment
-
- @dataclass
- class ModelInfo:
-     name: str
-     size: float
-     tokens_sec: int
-     type: str
-     description: str
-     create_cost: int = 0
-     invoke_cost: int = 0
-
- class ModelRegistry:
-     def __init__(self):
-         self.env = detect_runtime_environment()
-         self.models = self._build_model_registry()
-
-     def estimate_create_cost(self, size: float, is_api: bool) -> int:
-         return int(size * (10 if is_api else 5))
-
-     def estimate_invoke_cost(self, tokens_sec: int, is_api: bool) -> int:
-         base_cost = 40 if is_api else 20
-         return base_cost + max(0, 60 - tokens_sec)
-
-     def _build_model_registry(self) -> Dict[str, ModelInfo]:
-         raw_models = {
-             "llama3.2": {
-                 "size": 3,
-                 "tokens_sec": 30,
-                 "type": "local",
-                 "description": "3B lightweight local model"
-             },
-             "mistral": {
-                 "size": 7,
-                 "tokens_sec": 50,
-                 "type": "local",
-                 "description": "7B stronger local model"
-             },
-             "gemini-2.0-flash": {
-                 "size": 6,
-                 "tokens_sec": 170,
-                 "type": "api",
-                 "description": "Fast and efficient API model"
-             },
-             "gemini-2.5-pro-preview-03-25": {
-                 "size": 10,
-                 "tokens_sec": 148,
-                 "type": "api",
-                 "description": "High-reasoning API model"
-             },
-             "gemini-1.5-flash": {
-                 "size": 7,
-                 "tokens_sec": 190,
-                 "type": "api",
-                 "description": "Fast general-purpose model"
-             },
-             "gemini-2.0-flash-lite": {
-                 "size": 5,
-                 "tokens_sec": 208,
-                 "type": "api",
-                 "description": "Low-latency, cost-efficient API model"
-             },
-             "gemini-2.0-flash-live-001": {
-                 "size": 9,
-                 "tokens_sec": 190,
-                 "type": "api",
-                 "description": "Voice/video low-latency API model"
-             }
-         }
-
-         models = {}
-         for name, model in raw_models.items():
-             is_api = model["type"] == "api"
-
-             if is_api:
-                 # Flat cost for all API models
-                 create_cost, invoke_cost = 20, 50
-             else:
-                 create_cost = self.estimate_create_cost(model["size"], is_api=False)
-                 invoke_cost = self.estimate_invoke_cost(model["tokens_sec"], is_api=False)
-
-             models[name] = ModelInfo(
-                 name=name,
-                 size=model["size"],
-                 tokens_sec=model["tokens_sec"],
-                 type=model["type"],
-                 description=model["description"],
-                 create_cost=create_cost,
-                 invoke_cost=invoke_cost
-             )
-         return models
-
-     def get_filtered_models(self) -> Dict[str, ModelInfo]:
-         """Return only models that match the current runtime."""
-         if self.env in ["gpu", "cpu-local"]:
-             return {k: v for k, v in self.models.items() if v.type == "local"}
-         else:
-             return {k: v for k, v in self.models.items() if v.type == "api"}
-
-     def get_all_models(self) -> Dict[str, ModelInfo]:
-         """Return all models regardless of runtime."""
-         return self.models
-
- if __name__ == "__main__":
-     registry = ModelRegistry()
-     print(f"[INFO] Detected runtime: {registry.env}\n")
-
-     print("Filtered models based on environment:")
-     for name, model in registry.get_filtered_models().items():
-         print(f"{name}: create={model.create_cost}, invoke={model.invoke_cost}, type={model.type}")
src/test_env.py DELETED
@@ -1,61 +0,0 @@
- import os
- from dotenv import load_dotenv
- load_dotenv()
- from src.manager import GeminiManager
- from src.tool_manager import ToolManager
- import gradio as gr
- import base64
-
- _logo_bytes = open("HASHIRU_LOGO.png", "rb").read()
- _logo_b64 = base64.b64encode(_logo_bytes).decode()
- _header_html = f"""
- <div style="
-     display: flex;
-     flex-direction: column;
-     align-items: center;
-     padding-right: 24px;
- ">
-     <img src="data:image/png;base64,{_logo_b64}" width="20" height="20" />
-     <span style="margin-top: 8px; font-size: 20px; font-weight: bold; color: white;">
-         HASHIRU AI - Runtime Test
-     </span>
- </div>
- """
-
- # -------------------------------
- # ToolManager Agent Creation
- # -------------------------------
- def create_agent_callback():
-     print("\n[INFO] Creating agent using ToolManager...")
-     manager = ToolManager()
-
-     response = manager.runTool("AgentCreator", {
-         "agent_name": "ui-runtime-agent",
-         "system_prompt": "You answer questions.",
-         "description": "Agent created via ToolManager",
-         # No base_model passed — will trigger dynamic selection
-     })
-
-     print("[TOOL RESPONSE]", response)
-     return response["message"]
-
- # -------------------------------
- # Gradio UI
- # -------------------------------
- if __name__ == "__main__":
-     css = """
-     #title-row { background: #2c2c2c; border-radius: 8px; padding: 8px; }
-     """
-     with gr.Blocks(css=css, fill_width=True, fill_height=True) as demo:
-         with gr.Column():
-             gr.HTML(_header_html)
-             agent_create_button = gr.Button("🧪 Create Agent via ToolManager")
-             result_output = gr.Textbox(label="Tool Response")
-
-             agent_create_button.click(
-                 fn=create_agent_callback,
-                 inputs=[],
-                 outputs=[result_output]
-             )
-
-     demo.launch()
src/testing_cost.py DELETED
@@ -1,15 +0,0 @@
- # In tool_manager.py
- from default_tools.test_cost.agent_creator_tool import AgentCreator
-
- def test_agent_creation():
-     creator = AgentCreator()
-     response = creator.run(
-         agent_name="costbenefit-test-agent",
-         system_prompt="You are an expert assistant helping with cost-based model selection.",
-         description="Tests agent creation using cost-benefit logic."
-     )
-     print("\nTest Output:")
-     print(response)
-
- if __name__ == "__main__":
-     test_agent_creation()