Dan Walsh
title: AI Content Summariser API
emoji: 📝
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
license: mit

AI Content Summariser API (Backend)

This is the backend API for the AI Content Summariser, a tool that automatically generates concise summaries of articles, documents, and web content using natural language processing.

The frontend application is available in a separate repository: ai-content-summariser.

Features

  • Text summarisation using a pretrained transformer model (BART-large-CNN)
  • URL content extraction and summarisation
  • Adjustable parameters for summary length and style
  • Efficient API endpoints with proper error handling

API Endpoints

  • POST /api/summarise - Summarise text content
  • POST /api/summarise-url - Extract and summarise content from a URL
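
As a rough illustration of calling the text endpoint, the sketch below builds a POST request with Python's standard library. The JSON field names (text, max_length, min_length), the helper names, and the base URL are assumptions made for illustration. Check the Swagger documentation at /docs for the actual request schema.

```python
import json
import urllib.request


def build_summarise_request(text, max_length=150, min_length=50,
                            base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /api/summarise.

    The payload field names are hypothetical; the real schema may differ.
    """
    payload = {"text": text, "max_length": max_length, "min_length": min_length}
    return urllib.request.Request(
        url=f"{base_url}/api/summarise",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_request(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

With the backend running locally, send_request(build_summarise_request("Long article text...")) would return the decoded response.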

Technology Stack

  • Framework: FastAPI for efficient API endpoints
  • NLP Models: Transformer-based models (BART) for summarisation
  • Web Scraping: BeautifulSoup4 for extracting content from URLs
  • HTTP Client: HTTPX for asynchronous web requests
  • Deployment: Hugging Face Spaces or Docker containers
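
To give a feel for the URL extraction step, here is a minimal sketch that pulls visible text out of an HTML string. It deliberately uses the standard library's html.parser instead of BeautifulSoup4 so that it runs without extra dependencies, so it is not the project's actual implementation. The names TextExtractor and extract_text are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

The extracted text would then be passed to the summarisation model in the same way as text submitted directly.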

Getting Started

Prerequisites

  • Python (v3.8+)
  • pip

Installation

# Clone the repository
git clone https://github.com/dang-w/ai-content-summariser-api.git
cd ai-content-summariser-api

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Running Locally

# Start the backend server
uvicorn main:app --reload

The API will be available at http://localhost:8000.

Testing

The project includes a test suite with both unit and integration tests.

Installing Test Dependencies

pip install pytest pytest-cov httpx

Running Tests

# Run all tests
pytest

# Run tests with verbose output
pytest -v

# Run tests and generate coverage report
pytest --cov=app tests/

# Run tests and generate detailed coverage report
pytest --cov=app --cov-report=term-missing tests/

# Run specific test file
pytest tests/test_api.py

# Run tests without warnings
pytest -W ignore::FutureWarning -W ignore::UserWarning

Test Structure

  • Unit Tests: Test individual components in isolation
    • tests/test_summariser.py: Tests for the summarization service
  • Integration Tests: Test API endpoints and component interactions
    • tests/test_api.py: Tests for API endpoints

Mocking Strategy

For faster and more reliable tests, we use mocking to avoid loading large ML models during testing:

# Example of a mocked test
from unittest.mock import patch

def test_summariser_with_mock():
    # Replace the tokenizer and model classes so no real weights are loaded
    with patch('app.services.summariser.AutoTokenizer') as mock_tokenizer_class, \
         patch('app.services.summariser.AutoModelForSeq2SeqLM') as mock_model_class:
        # Test implementation...
        ...
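
The same idea can be shown end to end with a self-contained sketch. The Summariser class below is a hypothetical stand-in, not the project's real service; it only assumes a tokenizer and model with the usual call, generate, and decode methods. MagicMock objects take their place, so nothing is downloaded.

```python
from unittest.mock import MagicMock


class Summariser:
    """Hypothetical stand-in for the real summarisation service."""

    def __init__(self, tokenizer, model):
        self.tokenizer = tokenizer
        self.model = model

    def summarise(self, text, max_length=150):
        inputs = self.tokenizer(text, return_tensors="pt", truncation=True)
        output_ids = self.model.generate(inputs["input_ids"], max_length=max_length)
        return self.tokenizer.decode(output_ids[0], skip_special_tokens=True)


def test_summarise_uses_mocked_model():
    # Configure the mocks to return canned values instead of real tensors.
    tokenizer = MagicMock()
    tokenizer.return_value = {"input_ids": [[1, 2, 3]]}
    tokenizer.decode.return_value = "A short summary."
    model = MagicMock()
    model.generate.return_value = [[4, 5]]

    result = Summariser(tokenizer, model).summarise("Some long text", max_length=60)

    # The service should pass the token ids through and decode the first output.
    assert result == "A short summary."
    model.generate.assert_called_once_with([[1, 2, 3]], max_length=60)
    tokenizer.decode.assert_called_once_with([4, 5], skip_special_tokens=True)
```

Injecting the tokenizer and model as constructor arguments is a design choice of this sketch; with module-level loading, patch() as shown above achieves the same isolation.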

Continuous Integration

Tests are automatically run on pull requests and pushes to the main branch using GitHub Actions.

Running with Docker

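# Build and run with Docker
docker build -t ai-content-summariser-api .
# Maps host port 8000 to container port 8000; if the image listens on 7860
# (the app_port in the Spaces config), use -p 8000:7860 instead
docker run -p 8000:8000 ai-content-summariser-api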

Deployment

See the deployment guide in the frontend repository for detailed instructions on deploying both the frontend and backend components.

Deploying to Hugging Face Spaces

When deploying to Hugging Face Spaces, make sure to:

  1. Set the following environment variables in the Space settings:
    • TRANSFORMERS_CACHE=/tmp/huggingface_cache
    • HF_HOME=/tmp/huggingface_cache
    • HUGGINGFACE_HUB_CACHE=/tmp/huggingface_cache
  2. Use the Docker SDK in your Space settings
  3. If you encounter memory issues, consider using a smaller model by changing the model_name in summariser.py
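
If setting the variables in the Space settings is not convenient, one possible fallback is to apply them in code before transformers is imported. This is a sketch under that assumption, not something the project is known to do. The helper name configure_cache is hypothetical.

```python
import os

CACHE_VARIABLES = ("TRANSFORMERS_CACHE", "HF_HOME", "HUGGINGFACE_HUB_CACHE")


def configure_cache(env, cache_dir="/tmp/huggingface_cache"):
    """Point every cache variable at cache_dir unless it is already set."""
    for name in CACHE_VARIABLES:
        # setdefault keeps any value already provided by the environment.
        env.setdefault(name, cache_dir)
    return env


# In the application entry point, call this before importing transformers:
# configure_cache(os.environ)
```

Because existing values are left untouched, variables configured in the Space settings still take precedence.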

Performance Optimizations

The API includes several performance optimizations:

  1. Model Caching: Models are loaded once and cached for subsequent requests
  2. Result Caching: Frequently requested summaries are cached to avoid redundant processing
  3. Asynchronous Processing: Long-running tasks are processed asynchronously

Development

Testing the API

You can test the API endpoints using the built-in Swagger documentation at /docs when running locally.

Checking Transformers Installation

To verify that the transformers library is installed correctly:

python -m app.check_transformers

License

This project is licensed under the MIT License.