saisha09 committed on
Commit
4c7aac8
·
1 Parent(s): 838a833

README.md updated

Files changed (1)
  1. README.md +17 -0
README.md CHANGED
@@ -5,3 +5,20 @@
 
  ## Overview
  HASHIRU is an agent-based framework designed to dynamically allocate and manage large language models (LLMs) and external APIs through a CEO model. The CEO model acts as a central manager, capable of hiring, firing, and directing multiple specialized agents (employees) within a given budget. It can also create and utilize external APIs as needed, making it highly flexible and scalable.
+
+ ## Features
+ - **Cost-Benefit Matrix**:
+   Selects the best LLM (LLaMA, Mixtral, Gemini, DeepSeek, etc.) for a given task using Ollama, based on latency, size, cost, quality, and speed.
+ ## Usage
+
+ ```bash
+ python tools/cost_benefit.py \
+   --prompt "Best places to visit in Davis" \
+   --latency 4 --size 2 --cost 5 --speed 3
+ ```
+ Each weight is on a scale of **1** (least important) to **5** (most important):
+
+ - `--latency`: Prefer faster responses (lower time to answer)
+ - `--size`: Prefer smaller models (use less memory/resources)
+ - `--cost`: Prefer cheaper responses (fewer tokens, lower token price)
+ - `--speed`: Prefer models that generate tokens quickly (tokens/sec)
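
To make the weighting concrete, here is a minimal Python sketch of how 1-to-5 weights like these could be combined into a single score per model. It is not the code in `tools/cost_benefit.py`, whose internals are not shown in this diff. The function names (`parse_weights`, `normalize`, `rank_models`), the `CANDIDATES` table, and all of its numbers are hypothetical, and min-max normalization with a weighted average is just one plausible scoring rule. Quality is left out because the usage example exposes no flag for it.

```python
import argparse

# Hypothetical measurements for three made-up models (assumption, not real data).
CANDIDATES = {
    "small-model": {"latency_s": 1.0, "size_gb": 2.0, "cost_usd": 0.001, "tokens_per_sec": 60.0},
    "medium-model": {"latency_s": 2.0, "size_gb": 8.0, "cost_usd": 0.004, "tokens_per_sec": 35.0},
    "large-model": {"latency_s": 4.0, "size_gb": 40.0, "cost_usd": 0.010, "tokens_per_sec": 15.0},
}

# (weight flag, metric key, whether a higher raw value is better)
METRICS = [
    ("latency", "latency_s", False),
    ("size", "size_gb", False),
    ("cost", "cost_usd", False),
    ("speed", "tokens_per_sec", True),
]


def parse_weights(argv=None):
    """Parse the flags from the usage example; each weight must be 1..5."""
    parser = argparse.ArgumentParser(description="Cost-benefit weighting sketch")
    parser.add_argument("--prompt", required=True)
    for flag, _, _ in METRICS:
        parser.add_argument(f"--{flag}", type=int, choices=range(1, 6), default=3)
    return parser.parse_args(argv)


def normalize(values, higher_is_better):
    """Min-max scale values to [0, 1] so that 1.0 is always the best model."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)  # all models tie on this metric
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def rank_models(candidates, weights):
    """Return (name, score) pairs sorted best-first by weighted average benefit."""
    names = list(candidates)
    total_weight = sum(weights[flag] for flag, _, _ in METRICS)
    scores = dict.fromkeys(names, 0.0)
    for flag, key, higher_is_better in METRICS:
        benefits = normalize([candidates[n][key] for n in names], higher_is_better)
        for name, benefit in zip(names, benefits):
            scores[name] += weights[flag] * benefit / total_weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `parse_weights(["--prompt", "Best places to visit in Davis", "--latency", "4", "--size", "2", "--cost", "5", "--speed", "3"])` and passing the four weights to `rank_models(CANDIDATES, ...)` ranks the hypothetical `small-model` first, since it is the best on every metric in the sample table. Dividing by the total weight keeps scores in the range 0 to 1 regardless of which weights are chosen.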