---
title: Sentence Transformers
emoji: 🏢
colorFrom: green
colorTo: gray
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: false
---

# Sentence Transformers Demo

An interactive web application for semantic text similarity analysis using Sentence Transformers models.

## Features

### 1. Paraphrase Mining
- Find sentences with similar meaning in a text corpus
- Support for multiple language models
- Adjustable similarity threshold
- Export results in CSV format

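
The idea behind paraphrase mining can be sketched without any model: embed every sentence, score each pair by cosine similarity, and keep the pairs at or above the threshold. The following is a minimal pure-Python sketch, not the app's actual code. The function name `mine_paraphrases`, the toy three-dimensional vectors, and the 0.8 threshold are illustrative assumptions; in practice the embeddings would come from one of the models listed under Available Models.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mine_paraphrases(embeddings, threshold=0.8):
    """Return (score, i, j) for every pair i < j scoring at least
    `threshold`, sorted from most to least similar."""
    pairs = []
    for i, j in combinations(range(len(embeddings)), 2):
        score = cosine(embeddings[i], embeddings[j])
        if score >= threshold:
            pairs.append((score, i, j))
    return sorted(pairs, reverse=True)


# Toy vectors standing in for real sentence embeddings (hypothetical values).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
paraphrase_pairs = mine_paraphrases(toy_embeddings, threshold=0.8)
```

Raising the threshold returns fewer, closer pairs. This brute-force loop compares every pair, so its cost grows quadratically with corpus size; the Sentence Transformers library provides `util.paraphrase_mining` for larger corpora, though whether this app calls it is not stated here.
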
### 2. Semantic Textual Similarity (STS)
- Calculate semantic similarity between two sets of sentences
- Uses Sentence Transformers embedding models
- Compare sentences across languages with the multilingual models
- Export results in CSV format

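
Scoring two sets and exporting the result could fit together roughly as follows. This sketch assumes every sentence in the first set is compared with every sentence in the second. The helper names, the column names `sentence_1`, `sentence_2`, and `score`, and the toy vectors are assumptions for illustration; the app's real output layout is not documented here.

```python
import csv
import io
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def sts_rows(set_a, set_b):
    """Yield (text_a, text_b, score) for every cross-set pair.
    Each set is a list of (text, embedding) tuples."""
    for text_a, emb_a in set_a:
        for text_b, emb_b in set_b:
            yield text_a, text_b, round(cosine(emb_a, emb_b), 4)


def to_csv(rows):
    """Serialize result rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sentence_1", "sentence_2", "score"])
    writer.writerows(rows)
    return buffer.getvalue()


# Toy embeddings (hypothetical values) standing in for model output.
set_a = [("A cat sleeps.", [1.0, 0.0])]
set_b = [("Un chat dort.", [1.0, 0.0]), ("Stocks fell.", [0.0, 1.0])]
csv_text = to_csv(sts_rows(set_a, set_b))
```
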
## Available Models

- [`Lajavaness/bilingual-embedding-large`](https://huggingface.co/Lajavaness/bilingual-embedding-large): Bilingual embedding model (French and English, per its model card)
- [`sentence-transformers/all-mpnet-base-v2`](https://huggingface.co/sentence-transformers/all-mpnet-base-v2): High-quality general-purpose model
- [`intfloat/multilingual-e5-large-instruct`](https://huggingface.co/intfloat/multilingual-e5-large-instruct): Multilingual model that accepts a task instruction with each query

## Requirements

- Python 3.10+ (Gradio 5, pinned above as `sdk_version: 5.33.1`, does not support older versions)
- Dependencies listed in `requirements.txt`

## Installation

1. Clone the repository:
```bash
git clone https://github.com/yourusername/sentence-transformers.git
cd sentence-transformers
```

2. Create and activate a virtual environment:
```bash
python -m venv venv
source venv/bin/activate  # Linux/macOS
# or
.\venv\Scripts\activate  # Windows
```

3. Install dependencies:
```bash
pip install -r requirements.txt
```

## Usage

1. Start the application:
```bash
python app.py
```

2. Open your browser at `http://localhost:7860`

3. Select the desired functionality:
   - Paraphrase Mining: Upload a CSV file with sentences to analyze
   - STS: Upload two CSV files with sentences to compare

4. Select the model and adjust the similarity threshold

5. Click "Process" to start the analysis

6. Download results in CSV format

## CSV File Format

CSV files must contain a column named `text` with the sentences to analyze:

```csv
text
"First sentence to analyze"
"Second sentence to analyze"
...
```

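
One way to read such a file with the standard library is sketched below. `load_sentences` is a hypothetical helper, not a function from the app; it rejects input without a `text` column and skips empty rows.

```python
import csv
import io


def load_sentences(handle):
    """Return the non-empty values of the `text` column from an open CSV file."""
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or "text" not in reader.fieldnames:
        raise ValueError('CSV must contain a column named "text"')
    sentences = []
    for row in reader:
        # Short rows yield None for missing fields, so fall back to "".
        value = (row["text"] or "").strip()
        if value:
            sentences.append(value)
    return sentences


# In-memory stand-in for an uploaded file.
sample = io.StringIO(
    'text\n"First sentence to analyze"\n"Second sentence to analyze"\n'
)
sentences = load_sentences(sample)
```

For a file on disk, the `csv` module documentation recommends opening it with `newline=""`, for example `open(path, newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption about the uploads.
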
## Notes

- Temporary files are automatically cleaned up every 30 minutes
- Using complete sentences is recommended for better results
- Models may take some time to load on first use

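
How the cleanup is implemented is not described here. As an illustration only, a periodic job could delete files whose modification time is older than a cutoff. The function name, the directory argument, and treating 30 minutes as a maximum file age (rather than only the interval between runs) are all assumptions.

```python
import os
import time


def remove_stale_files(directory, max_age_seconds=30 * 60, now=None):
    """Delete regular files in `directory` last modified more than
    `max_age_seconds` ago. Returns the sorted names of the removed files."""
    now = time.time() if now is None else now
    # Snapshot the listing first so deletion does not disturb iteration.
    with os.scandir(directory) as entries:
        candidates = list(entries)
    removed = []
    for entry in candidates:
        if entry.is_file() and now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

A scheduler or background thread would then call `remove_stale_files` on the app's temporary directory at a fixed interval.
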
## License

MIT

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference