|
# GAIA Agent Evaluation Guide
|
|
|
This guide walks you through setting up the sample code and evaluating your agent on the GAIA benchmark.
|
|
|
## Step 1: Configure API Keys |
|
|
|
Before anything else, make sure you configure your secret keys in the **Space Settings** section. |
|
|
|
- Log in to each required platform and generate an API key.

- Locate the designated fields and enter your API keys (a sketch of reading the keys back at runtime follows this list).
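
If your Space exposes these secrets as environment variables (the usual behavior on Hugging Face Spaces), your code can read and validate them at startup. A minimal sketch, assuming the variable names `SUPABASE_URL`, `SUPABASE_SERVICE_KEY`, and `OPENAI_API_KEY`; substitute whatever names you actually configured:

```python
import os

# Hypothetical variable names: use the names you set in Space Settings.
REQUIRED_SECRETS = ["SUPABASE_URL", "SUPABASE_SERVICE_KEY", "OPENAI_API_KEY"]

# Failing fast on a missing key beats a confusing error deep inside an agent run.
missing = [name for name in REQUIRED_SECRETS if not os.environ.get(name)]
if missing:
    raise RuntimeError(f"Missing required secrets: {', '.join(missing)}")
```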
|
|
|
## Step 2: Set Up Supabase |
|
|
|
1. **Log in to Supabase**. |
|
2. Navigate to your **organization**, then open your **project**.
|
3. Open the **SQL Editor**, paste the SQL code below, and run it to create the necessary table and function. |
|
|
|
### 📦 SQL Code – Creating the Table and Function
|
|
|
```sql
-- Enable pgvector if not already enabled
create extension if not exists vector;

-- Create the documents table (if not already done)
create table if not exists documents (
  id bigserial primary key,
  content text,
  metadata jsonb,
  embedding vector(768) -- Make sure this matches your model's embedding dimension
);

-- Create the match_documents function
create or replace function match_documents (
  query_embedding vector(768),
  match_count int default 5,
  filter jsonb default '{}'
)
returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language plpgsql
as $$
begin
  return query
  select
    -- Columns are table-qualified: unqualified names would be ambiguous
    -- with the function's output columns and raise an error at call time.
    documents.id,
    documents.content,
    documents.metadata,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where documents.metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;
```
|
4. After running the above, execute the command below so that Supabase’s API layer (PostgREST) refreshes its internal schema cache and picks up the new function:

```sql
NOTIFY pgrst, 'reload schema';
```
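
To confirm that the function is now reachable through the API, you can call it as a remote procedure from Python. Below is a minimal sketch using the `supabase` client library; the zero vector is only a placeholder query embedding, and the client setup assumes the secrets from Step 1:

```python
import os

from supabase import create_client

supabase = create_client(
    os.environ["SUPABASE_URL"], os.environ["SUPABASE_SERVICE_KEY"]
)

# Placeholder embedding: 768 zeros, matching the vector(768) column above.
response = supabase.rpc(
    "match_documents",
    {"query_embedding": [0.0] * 768, "match_count": 5, "filter": {}},
).execute()

print(response.data)  # Empty until Step 3 populates the table.
```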
|
## Step 3: Populate the Database |
|
|
|
To enable document retrieval, you need to populate the database with example entries: |
|
|
|
- Open and run the **test.ipynb** Jupyter notebook. |
|
|
|
- The notebook reads the **metadata.jsonl** file and inserts the examples into the `documents` table.
|
|
|
- This gives your agent a basic retrieval capability, which improves its answers on questions the stored examples cover (a sketch of the insert logic follows this list).
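
For reference, the core of what the notebook does looks roughly like the sketch below. It assumes each line of **metadata.jsonl** is a JSON object with `Question` and `Final answer` fields, and that a 768-dimensional sentence-transformers model produces the embeddings; check the notebook for the exact field names and model:

```python
import json
import os

from sentence_transformers import SentenceTransformer
from supabase import create_client

supabase = create_client(
    os.environ["SUPABASE_URL"], os.environ["SUPABASE_SERVICE_KEY"]
)

# Assumed model: all-mpnet-base-v2 outputs 768 dimensions, matching vector(768).
model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

with open("metadata.jsonl") as f:
    for line in f:
        example = json.loads(line)
        # Field names are assumptions; check metadata.jsonl for the real ones.
        content = f"Question: {example['Question']}\nAnswer: {example['Final answer']}"
        supabase.table("documents").insert({
            "content": content,
            "metadata": example,
            "embedding": model.encode(content).tolist(),
        }).execute()
```

If you use an embedding model with a different output dimension, update the `vector(768)` columns in Step 2 to match before inserting.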
|
|
|
## Step 4: Run the Evaluation |
|
|
|
Once the database is set up and filled with data: |
|
|
|
- Proceed to the Evaluation section in your project. |
|
|
|
- Run the evaluation script to test and score your agent’s performance (a conceptual sketch follows).
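
The exact flow depends on your project, but conceptually the evaluation reduces to running the agent on each question and comparing its answer to the reference. A purely hypothetical sketch; `run_agent` and the `questions` list stand in for whatever your project actually provides:

```python
# Hypothetical harness: replace run_agent and questions with your project's own.
def run_agent(question: str) -> str:
    raise NotImplementedError  # Your agent's answer logic goes here.

questions = [
    {"question": "...", "reference": "..."},  # Loaded from your evaluation set.
]

correct = 0
for q in questions:
    answer = run_agent(q["question"])
    # GAIA-style scoring is essentially exact match on the normalized final answer.
    if answer.strip().lower() == q["reference"].strip().lower():
        correct += 1

print(f"Score: {correct}/{len(questions)}")
```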
|
|