Commit 4be2bb5 · Kunal Pai committed · Parent(s): f3a6d23

Python file + README cleanup

Files changed:
- README.md (+28 -66)
- src/cost_benefit.py (+0 -48)
- src/models_cost.py (+0 -114)
- src/test_env.py (+0 -61)
- src/testing_cost.py (+0 -15)
README.md CHANGED

```diff
@@ -1,82 +1,44 @@
 # HASHIRU: Hierarchical Agent System for Hybrid Intelligent Resource Utilization
-(For AgentX competition)
 
 
 
-## Overview
-HASHIRU is an agent-based framework designed to dynamically allocate and manage large language models (LLMs) and external APIs through a CEO model. The CEO model acts as a central manager, capable of hiring, firing, and directing multiple specialized agents (employees) over a given budget. It can also create and utilize external APIs as needed, making it highly flexible and scalable.
-
-2. Instantiates a Gemini-powered CEO using GeminiManager
-3. Wraps the user prompt into a structured Content object
-4. Calls Gemini with the prompt
-5. Executes external tool calls if needed
-6. Returns a full response to the user
-7. Can ask the user for clarification if needed
-
-##
-- **Cost-Benefit Matrix**: Selects the best LLM model (LLaMA, Mixtral, Gemini, DeepSeek, etc.) for any task using Ollama, based on latency, size, cost, quality, and speed.
-- **Dynamic Agent Management**: The CEO agent dynamically hires and fires specialized agents based on task requirements and budget constraints.
-- **API Integration**: Seamlessly integrates external APIs for enhanced functionality and scalability.
-
-```bash
-git clone <repository-url>
-cd HASHIRU
-pip install -r requirements.txt
-```
-
-Run the main script:
-
-```bash
-python main.py
-```
+## Project Overview
+
+This project provides a framework for creating and managing AI agents and tools. It includes features for managing resource and expense budgets, loading tools and agents, and interacting with various language models.
+
+## Directory Structure
+
+* **src/**: Contains the source code for the project.
+    * **tools/**: Contains the code for the tools that can be used by the agents.
+        * **default\_tools/**: Contains the default tools provided with the project.
+        * **user\_tools/**: Contains the tools created by the user.
+    * **config/**: Contains configuration files for the project.
+    * **utils/**: Contains utility functions and classes used throughout the project.
+    * **models/**: Contains the configurations and system prompts for the agents. Includes `models.json` which stores agent definitions.
+    * **manager/**: Contains the core logic for managing agents, tools, and budgets.
+        * `agent_manager.py`: Manages the creation, deletion, and invocation of AI agents. Supports different agent types like Ollama, Gemini, and Groq.
+        * `budget_manager.py`: Manages the resource and expense budgets for the project.
+        * `tool_manager.py`: Manages the loading, running, and deletion of tools.
+        * `llm_models.py`: Defines abstract base classes for different language model integrations.
+    * **data/**: Contains data files, such as memory and secret words.
+
+## Key Components
+
+* **Agent Management:** The `AgentManager` class in `src/manager/agent_manager.py` is responsible for creating, managing, and invoking AI agents. It supports different agent types, including local (Ollama) and cloud-based (Gemini, Groq) models.
+* **Tool Management:** The `ToolManager` class in `src/manager/tool_manager.py` handles the loading and running of tools. Tools are loaded from the `src/tools/default_tools` and `src/tools/user_tools` directories.
+* **Budget Management:** The `BudgetManager` class in `src/manager/budget_manager.py` manages the resource and expense budgets for the project. It tracks the usage of resources and expenses and enforces budget limits.
+* **Model Integration:** The project supports integration with various language models, including Ollama, Gemini, and Groq. The `llm_models.py` file defines abstract base classes for these integrations.
 
 ## Usage
 
-The `cost_benefit.py` tool allows you to select the best LLM model for a given task based on customizable weights:
-
-```bash
-python tools/cost_benefit.py \
-    --prompt "Best places to visit in Davis" \
-    --latency 4 --size 2 --cost 5 --speed 3
-```
-
-Each weight is on a scale of **1** (least important) to **5** (most important):
-
-### Example Output
-
-```plaintext
-Selected Model: Gemini
-Reason: Optimal balance of cost, speed, and latency for the given weights.
-```
+To use the project, you need to:
+
+1. Configure the budget in `src/manager/budget_manager.py`.
+2. Create tools and place them in the `src/tools/default_tools` or `src/tools/user_tools` directories.
+3. Create agents using the `AgentCreator` tool or the `AgentManager` class.
+4. Invoke agents using the `AskAgent` tool or the `AgentManager` class.
+5. Manage tools and agents using the `ToolManager` and `AgentManager` classes.
 
 ## Contributing
 
-Contributions are welcome! Please
-
-1. Fork the repository.
-2. Create a new branch for your feature or bug fix.
-3. Submit a pull request with a detailed description of your changes.
-
-## License
-
-This project is licensed under the MIT License. See the `LICENSE` file for details.
+Contributions are welcome! Please submit pull requests with bug fixes, new features, or improvements to the documentation.
```
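The new Usage section describes the workflow only in prose. A minimal sketch of steps 3 and 4 follows; note that the `create_agent` and `ask_agent` method names and their arguments are assumptions for illustration, since this commit does not show `AgentManager`'s actual interface:

```python
# Hypothetical sketch of README usage steps 3-4. The method names and
# signatures below are assumptions, not taken from this commit.
from src.manager.agent_manager import AgentManager

manager = AgentManager()

# Step 3: create a specialized agent (the README offers either the
# AgentCreator tool or the AgentManager class; this uses the class).
manager.create_agent(
    agent_name="summarizer",
    base_model="llama3.2",  # assumed local Ollama model name
    system_prompt="You summarize documents concisely.",
)

# Step 4: invoke the agent (AskAgent tool or AgentManager class).
reply = manager.ask_agent("summarizer", "Summarize the HASHIRU README.")
print(reply)
```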
src/cost_benefit.py DELETED

```python
import argparse
import subprocess
import time
import requests

def detect_available_budget(runtime_env: str) -> int:
    """
    Return an approximate VRAM-based budget (MB) when running locally,
    else default to 100.
    """
    import torch
    if "local" in runtime_env and torch.cuda.is_available():
        total_vram_mb = torch.cuda.get_device_properties(0).total_memory // (1024 ** 2)
        return min(total_vram_mb, 100)
    return 100

def get_best_model(runtime_env: str, *, use_local_only: bool = False, use_api_only: bool = False) -> dict:
    """
    Pick the fastest model that fits in the detected budget while
    respecting the locality filters.
    """
    static_costs = {
        "llama3.2": {"size": 20, "token_cost": 0.0001, "tokens_sec": 30, "type": "local"},
        "mistral": {"size": 40, "token_cost": 0.0002, "tokens_sec": 50, "type": "local"},
        "gemini-2.0-flash": {"size": 60, "token_cost": 0.0005, "tokens_sec": 60, "type": "api"},
        "gemini-2.5-pro-preview-03-25": {"size": 80, "token_cost": 0.002, "tokens_sec": 45, "type": "api"},
    }

    budget = detect_available_budget(runtime_env)
    best_model, best_speed = None, -1

    for model, info in static_costs.items():
        if info["size"] > budget:
            continue
        if use_local_only and info["type"] != "local":
            continue
        if use_api_only and info["type"] != "api":
            continue
        if info["tokens_sec"] > best_speed:
            best_model, best_speed = model, info["tokens_sec"]

    chosen = best_model or "llama3.2"  # sensible default
    return {
        "model": chosen,
        "token_cost": static_costs[chosen]["token_cost"],
        "tokens_sec": static_costs[chosen]["tokens_sec"],
        "note": None if best_model else "Defaulted because no model met the constraints",
    }
```
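For reference, a short usage sketch of the deleted selector, since the file is now gone from the tree. The `"local"` runtime string is inferred from the `"local" in runtime_env` check above, the import path assumes running from `src/`, and `torch` must be importable for the local branch:

```python
from cost_benefit import get_best_model  # assumed path for the old src/cost_benefit.py

# The detected budget tops out at 100, which admits all four entries
# (sizes 20-80); the fastest is gemini-2.0-flash at 60 tokens/sec.
print(get_best_model("local"))

# Filtering to local models picks mistral (50 tokens/sec beats llama3.2's 30).
print(get_best_model("local", use_local_only=True))
```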
src/models_cost.py DELETED

```python
from dataclasses import dataclass
from typing import Dict
from manager.utils.runtime_selector import detect_runtime_environment

@dataclass
class ModelInfo:
    name: str
    size: float
    tokens_sec: int
    type: str
    description: str
    create_cost: int = 0
    invoke_cost: int = 0

class ModelRegistry:
    def __init__(self):
        self.env = detect_runtime_environment()
        self.models = self._build_model_registry()

    def estimate_create_cost(self, size: float, is_api: bool) -> int:
        return int(size * (10 if is_api else 5))

    def estimate_invoke_cost(self, tokens_sec: int, is_api: bool) -> int:
        base_cost = 40 if is_api else 20
        return base_cost + max(0, 60 - tokens_sec)

    def _build_model_registry(self) -> Dict[str, ModelInfo]:
        raw_models = {
            "llama3.2": {
                "size": 3,
                "tokens_sec": 30,
                "type": "local",
                "description": "3B lightweight local model"
            },
            "mistral": {
                "size": 7,
                "tokens_sec": 50,
                "type": "local",
                "description": "7B stronger local model"
            },
            "gemini-2.0-flash": {
                "size": 6,
                "tokens_sec": 170,
                "type": "api",
                "description": "Fast and efficient API model"
            },
            "gemini-2.5-pro-preview-03-25": {
                "size": 10,
                "tokens_sec": 148,
                "type": "api",
                "description": "High-reasoning API model"
            },
            "gemini-1.5-flash": {
                "size": 7,
                "tokens_sec": 190,
                "type": "api",
                "description": "Fast general-purpose model"
            },
            "gemini-2.0-flash-lite": {
                "size": 5,
                "tokens_sec": 208,
                "type": "api",
                "description": "Low-latency, cost-efficient API model"
            },
            "gemini-2.0-flash-live-001": {
                "size": 9,
                "tokens_sec": 190,
                "type": "api",
                "description": "Voice/video low-latency API model"
            }
        }

        models = {}
        for name, model in raw_models.items():
            is_api = model["type"] == "api"

            if is_api:
                # Flat cost for all API models
                create_cost, invoke_cost = 20, 50
            else:
                create_cost = self.estimate_create_cost(model["size"], is_api=False)
                invoke_cost = self.estimate_invoke_cost(model["tokens_sec"], is_api=False)

            models[name] = ModelInfo(
                name=name,
                size=model["size"],
                tokens_sec=model["tokens_sec"],
                type=model["type"],
                description=model["description"],
                create_cost=create_cost,
                invoke_cost=invoke_cost
            )
        return models

    def get_filtered_models(self) -> Dict[str, ModelInfo]:
        """Return only models that match the current runtime."""
        if self.env in ["gpu", "cpu-local"]:
            return {k: v for k, v in self.models.items() if v.type == "local"}
        else:
            return {k: v for k, v in self.models.items() if v.type == "api"}

    def get_all_models(self) -> Dict[str, ModelInfo]:
        """Return all models regardless of runtime."""
        return self.models


if __name__ == "__main__":
    registry = ModelRegistry()
    print(f"[INFO] Detected runtime: {registry.env}\n")

    print("Filtered models based on environment:")
    for name, model in registry.get_filtered_models().items():
        print(f"{name}: create={model.create_cost}, invoke={model.invoke_cost}, type={model.type}")
```
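The deleted cost formulas are easiest to check with concrete numbers. A worked example for the `mistral` registry entry (size 7, 50 tokens/sec, local), assuming the `manager.utils.runtime_selector` import above resolves:

```python
# create_cost for local models: int(size * 5)  -> int(7 * 5)      = 35
# invoke_cost for local models: 20 + max(0, 60 - tokens_sec)
#                                              -> 20 + (60 - 50)  = 30
# (API models bypass both formulas and get the flat pair (20, 50).)
registry = ModelRegistry()
mistral = registry.get_all_models()["mistral"]
assert (mistral.create_cost, mistral.invoke_cost) == (35, 30)
```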
src/test_env.py DELETED

```python
import os
from dotenv import load_dotenv
load_dotenv()
from src.manager import GeminiManager
from src.tool_manager import ToolManager
import gradio as gr
import base64

_logo_bytes = open("HASHIRU_LOGO.png", "rb").read()
_logo_b64 = base64.b64encode(_logo_bytes).decode()
_header_html = f"""
<div style="
    display: flex;
    flex-direction: column;
    align-items: center;
    padding-right: 24px;
">
    <img src="data:image/png;base64,{_logo_b64}" width="20" height="20" />
    <span style="margin-top: 8px; font-size: 20px; font-weight: bold; color: white;">
        HASHIRU AI - Runtime Test
    </span>
</div>
"""

# -------------------------------
# ToolManager Agent Creation
# -------------------------------
def create_agent_callback():
    print("\n[INFO] Creating agent using ToolManager...")
    manager = ToolManager()

    response = manager.runTool("AgentCreator", {
        "agent_name": "ui-runtime-agent",
        "system_prompt": "You answer questions.",
        "description": "Agent created via ToolManager",
        # No base_model passed — will trigger dynamic selection
    })

    print("[TOOL RESPONSE]", response)
    return response["message"]

# -------------------------------
# Gradio UI
# -------------------------------
if __name__ == "__main__":
    css = """
    #title-row { background: #2c2c2c; border-radius: 8px; padding: 8px; }
    """
    with gr.Blocks(css=css, fill_width=True, fill_height=True) as demo:
        with gr.Column():
            gr.HTML(_header_html)
            agent_create_button = gr.Button("🧪 Create Agent via ToolManager")
            result_output = gr.Textbox(label="Tool Response")

            agent_create_button.click(
                fn=create_agent_callback,
                inputs=[],
                outputs=[result_output]
            )

    demo.launch()
```
src/testing_cost.py DELETED

```python
# In tool_manager.py
from default_tools.test_cost.agent_creator_tool import AgentCreator

def test_agent_creation():
    creator = AgentCreator()
    response = creator.run(
        agent_name="costbenefit-test-agent",
        system_prompt="You are an expert assistant helping with cost-based model selection.",
        description="Tests agent creation using cost-benefit logic."
    )
    print("\nTest Output:")
    print(response)

if __name__ == "__main__":
    test_agent_creation()
```