hashiruAI / bench

Commit History

Add benchmarking functionality for Globle game
cab9413

Kunal Pai committed on

Add benchmarking script for Wordle game
e09bf50

Kunal Pai committed on

Add benchmarking functionality for NYT Connections dataset
0577af4

Kunal Pai committed on

Add paper benchmarking, along with dataset for it
4f96523

Kunal Pai committed on

Refactor get_last_assistant_content function to improve response handling and support various response formats
81fafc1

Kunal Pai committed on

Refactor benchmarking script to implement HLE dataset performance evaluation and improve response handling
aa7e221

Kunal Pai committed on

Add benchmarking script for GlobleDistanceTool via Gradio API
97e9ed5

Kunal Pai committed on