---
title: Audio to Text
emoji: ▶︎ •၊၊||၊|။||||။၊|• 0:10 ➤ 📄
colorFrom: blue
colorTo: yellow
sdk: gradio
app_file: app.py
pinned: false
license: apache-2.0
---

# Whisper Small Model Demo

This Space demonstrates the capabilities of OpenAI's Whisper small model for automatic speech recognition (ASR). Users can upload audio files or record audio directly to obtain transcriptions.

## Overview

Whisper is a general-purpose ASR model developed by OpenAI. This demo uses the small variant of Whisper to transcribe spoken language into text. The application is built with [Gradio](https://gradio.app/), which provides a web interface for machine learning models.

## Features

- **Audio Input**: Upload pre-recorded audio files or record audio from your microphone.
- **Transcription**: Generate text transcriptions of the input audio.
- **Language Support**: Whisper supports multiple languages; however, this demo is optimized for English.
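The language point above can be made concrete. The following is a minimal sketch, assuming the app wraps Whisper through the Hugging Face `transformers` speech-recognition pipeline and the `openai/whisper-small` checkpoint; the Space's `app.py` is not shown here, so both are assumptions. The helper names `whisper_generate_kwargs` and `transcribe_file` are hypothetical.

```python
from typing import Optional


def whisper_generate_kwargs(language: Optional[str] = "english") -> dict:
    """Build decoding options for a multilingual Whisper checkpoint.

    Passing None omits the language so the model detects it on its own.
    """
    kwargs = {"task": "transcribe"}  # "translate" would output English text instead
    if language is not None:
        kwargs["language"] = language
    return kwargs


def transcribe_file(path: str, language: Optional[str] = "english") -> str:
    """Transcribe one audio file (hypothetical helper, not taken from app.py)."""
    # Imported lazily so the helper above can be used without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
    result = asr(path, generate_kwargs=whisper_generate_kwargs(language))
    return result["text"]
```

Calling `transcribe_file("sample.wav")` (a placeholder file name) would pin decoding to English, while `transcribe_file("sample.wav", language=None)` would let Whisper detect the language.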

## Usage

1. **Select Input Method**:
   - *Upload*: Click the "Upload" button to select an audio file from your device.
   - *Record*: Use the "Record" button to capture audio using your microphone.
2. **Transcription**:
   - After providing the audio input, click the "Transcribe" button.
   - The transcription appears in the output box below.
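The flow described in these steps could be wired up roughly as follows. This is a sketch under stated assumptions, not the Space's actual `app.py`: it assumes Gradio 4 or later (where `gr.Audio` takes `sources`; Gradio 3 used a single `source` argument) and the `transformers` pipeline with the `openai/whisper-small` checkpoint. The function names `transcribe` and `build_demo` are hypothetical.

```python
from typing import Callable, Optional


def transcribe(audio_path: Optional[str], asr: Callable[[str], dict]) -> str:
    """Return the transcription for an uploaded or recorded audio file."""
    if not audio_path:
        # Gradio passes None when the button is clicked without any audio.
        return "Please upload or record audio first."
    return asr(audio_path)["text"].strip()


def build_demo():
    """Assemble the UI: audio input, a Transcribe button, and an output box."""
    # Imported here so `transcribe` can be exercised without these packages.
    import gradio as gr
    from transformers import pipeline

    # Assumed checkpoint; the model loads once and is reused for every request.
    asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")

    with gr.Blocks(title="Whisper Small Model Demo") as demo:
        audio = gr.Audio(sources=["upload", "microphone"], type="filepath", label="Audio")
        button = gr.Button("Transcribe")
        output = gr.Textbox(label="Transcription")
        button.click(fn=lambda path: transcribe(path, asr), inputs=audio, outputs=output)
    return demo
```

Running `build_demo().launch()` would start the local server. Passing the pipeline into `transcribe` as an argument keeps the function easy to test with a stand-in callable.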

## Requirements

To run this demo locally, ensure you have the following installed:

- Python 3.8 or higher
- Required Python packages listed in `requirements.txt`
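A quick way to check these requirements before launching is sketched below. The package names in `ASSUMED_PACKAGES` are a guess at what `requirements.txt` contains, since that file is not shown here; adjust the tuple to match the real file.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)
# Assumed package list; the authoritative list is requirements.txt.
ASSUMED_PACKAGES = ("gradio", "transformers", "torch")


def python_ok(version=None) -> bool:
    """True when the interpreter meets the minimum version stated above."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def missing_packages(names=ASSUMED_PACKAGES) -> list:
    """Return the top-level package names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Python OK:", python_ok())
print("Missing packages:", missing_packages() or "none")
```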

## Setup Instructions

1. **Clone the Repository**:

   ```bash
   git clone https://huggingface.co/spaces/your-username/whisper-small-demo
   cd whisper-small-demo
   ```

2. **Install Dependencies**:

   ```bash
   pip install -r requirements.txt
   ```

3. **Run the Application**:

   ```bash
   python app.py
   ```

Once the app starts, open `http://localhost:7860` (Gradio's default port) in your browser.

## Acknowledgements

- [OpenAI](https://openai.com/) for developing the Whisper model.
- [Gradio](https://gradio.app/) for providing an easy-to-use interface for machine learning applications.
- [Hugging Face Spaces](https://huggingface.co/spaces) for hosting this demo.

## References

- [OpenAI Whisper GitHub Repository](https://github.com/openai/whisper)
- [Gradio Documentation](https://gradio.app/docs/)
- [Hugging Face Spaces Documentation](https://huggingface.co/docs/hub/spaces)