---
title: LLM-Compare
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.30.0
app_file: app.py
pinned: false
license: mit
short_description: Compare outputs from text-generation models side by side
models:
- HuggingFaceH4/zephyr-7b-beta
- NousResearch/Hermes-3-Llama-3.1-8B
- mistralai/Mistral-Nemo-Base-2407
- meta-llama/Llama-2-70b-hf
- aaditya/Llama3-OpenBioLLM-8B
---
<!-- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference -->
# LLM Comparison Tool
A Gradio web application for comparing outputs from two Hugging Face text-generation models side by side.
## Features
- Compare outputs from two different LLMs simultaneously
- Simple and clean interface
- Support for multiple Hugging Face models
- Text generation via Hugging Face's Inference API (see the sketch after this list)
- Error handling and user feedback
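Calling the Inference API from Python can be as simple as the sketch below, which uses `huggingface_hub`'s `InferenceClient`. The helper name `query_model` and the generation parameters are illustrative assumptions, not necessarily what `app.py` does:
```python
# Minimal sketch: querying two models through the Hugging Face Inference API.
# `query_model` is a hypothetical helper; the real app may be structured differently.
import os

from huggingface_hub import InferenceClient


def query_model(model_id: str, prompt: str) -> str:
    """Send a prompt to one model and return the generated text."""
    client = InferenceClient(model=model_id, token=os.environ["HF_TOKEN"])
    return client.text_generation(prompt, max_new_tokens=256)


left = query_model("HuggingFaceH4/zephyr-7b-beta", "Explain beam search briefly.")
right = query_model("microsoft/Phi-3.5-mini-instruct", "Explain beam search briefly.")
```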
## Setup
1. Clone this repository
2. Create and activate the conda environment:
```bash
conda env create -f environment.yml
conda activate llm-compare
```
3. Create a `.env` file in the root directory and add your Hugging Face API token:
```
HF_TOKEN=your_hugging_face_token_here
```
You can get your token from your [Hugging Face profile settings](https://huggingface.co/settings/tokens).
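If you are wiring the token up yourself, a common pattern is to load the `.env` file at startup. This sketch assumes `python-dotenv` is available in the environment, which this README does not confirm:
```python
# Sketch of loading HF_TOKEN from .env, assuming python-dotenv is installed.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
hf_token = os.getenv("HF_TOKEN")
if hf_token is None:
    raise RuntimeError("HF_TOKEN is not set; add it to your .env file.")
```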
## Running the App
1. Make sure you have activated the conda environment:
```bash
conda activate llm-compare
```
2. Run the application:
```bash
python app.py
```
3. Open your browser and navigate to the URL shown in the terminal (typically `http://localhost:7860`)
## Usage
1. Enter your prompt in the text box
2. Select two different models from the dropdown menus
3. Click "Generate Responses" to see the outputs
4. The responses will appear in the chatbot interfaces below each model selection
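For orientation, a minimal sketch of the kind of side-by-side Gradio layout described above is shown here. The component names and the `generate` stub are illustrative, not the actual contents of `app.py`:
```python
# Minimal sketch of a side-by-side comparison layout in Gradio.
# generate() is a stub; the real app calls the Inference API for each model.
import gradio as gr

MODELS = ["HuggingFaceH4/zephyr-7b-beta", "microsoft/Phi-3.5-mini-instruct"]


def generate(prompt, model_a, model_b):
    # Placeholder outputs; the real app returns each model's generation.
    return f"[{model_a}] ...", f"[{model_b}] ..."


with gr.Blocks() as demo:
    prompt = gr.Textbox(label="Prompt")
    with gr.Row():
        model_a = gr.Dropdown(MODELS, label="Model A", value=MODELS[0])
        model_b = gr.Dropdown(MODELS, label="Model B", value=MODELS[1])
    with gr.Row():
        out_a = gr.Textbox(label="Response A")
        out_b = gr.Textbox(label="Response B")
    gr.Button("Generate Responses").click(
        generate, [prompt, model_a, model_b], [out_a, out_b]
    )

demo.launch()
```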
## Models Available
- HuggingFaceH4/zephyr-7b-beta
- meta-llama/Llama-3.1-8B-Instruct
- microsoft/Phi-3.5-mini-instruct
- Qwen/QwQ-32B
## Notes
- Make sure you have a valid Hugging Face API token with appropriate permissions
- Response times vary with model size and server load