--- title: AI Content Summariser API emoji: 📝 colorFrom: blue colorTo: indigo sdk: docker app_port: 7860 pinned: false license: mit --- # AI Content Summariser API (Backend) This is the backend API for the AI Content Summariser, a tool that automatically generates concise summaries of articles, documents, and web content using natural language processing. The frontend application is available in a separate repository: [ai-content-summariser](https://github.com/dang-w/ai-content-summariser). ## Features - Text summarization using state-of-the-art NLP models (BART-large-CNN) - URL content extraction and summarization - Adjustable parameters for summary length and style - Efficient API endpoints with proper error handling ## API Endpoints - `POST /api/summarise` - Summarize text content - `POST /api/summarise-url` - Extract and summarize content from a URL ## Technology Stack - **Framework**: FastAPI for efficient API endpoints - **NLP Models**: Transformer-based models (BART) for summarisation - **Web Scraping**: BeautifulSoup4 for extracting content from URLs - **HTTP Client**: HTTPX for asynchronous web requests - **Deployment**: Hugging Face Spaces or Docker containers ## Getting Started ### Prerequisites - Python (v3.8+) - pip ### Installation ```bash # Clone the repository git clone https://github.com/dang-w/ai-content-summariser-api.git cd ai-content-summariser-api # Create a virtual environment python -m venv venv source venv/bin/activate # On Windows: venv\Scripts\activate # Install dependencies pip install -r requirements.txt ``` ### Running Locally ```bash # Start the backend server uvicorn main:app --reload ``` The API will be available at `http://localhost:8000`. ## Testing The project includes a comprehensive test suite covering both unit and integration tests. ### Installing Test Dependencies ```bash pip install pytest pytest-cov httpx ``` ### Running Tests ```bash # Run all tests pytest # Run tests with verbose output pytest -v # Run tests and generate coverage report pytest --cov=app tests/ # Run tests and generate detailed coverage report pytest --cov=app --cov-report=term-missing tests/ # Run specific test file pytest tests/test_api.py # Run tests without warnings pytest -W ignore::FutureWarning -W ignore::UserWarning ``` ### Test Structure - **Unit Tests**: Test individual components in isolation - `tests/test_summariser.py`: Tests for the summarization service - **Integration Tests**: Test API endpoints and component interactions - `tests/test_api.py`: Tests for API endpoints ### Mocking Strategy For faster and more reliable tests, we use mocking to avoid loading large ML models during testing: ```python # Example of mocked test def test_summariser_with_mock(): with patch('app.services.summariser.AutoTokenizer') as mock_tokenizer_class, \ patch('app.services.summariser.AutoModelForSeq2SeqLM') as mock_model_class: # Test implementation... ``` ### Continuous Integration Tests are automatically run on pull requests and pushes to the main branch using GitHub Actions. ## Running with Docker ```bash # Build and run with Docker docker build -t ai-content-summariser-api . docker run -p 8000:8000 ai-content-summariser-api ``` ## Deployment See the deployment guide in the frontend repository for detailed instructions on deploying both the frontend and backend components. ### Deploying to Hugging Face Spaces When deploying to Hugging Face Spaces, make sure to: 1. Set the following environment variables in the Space settings: - `TRANSFORMERS_CACHE=/tmp/huggingface_cache` - `HF_HOME=/tmp/huggingface_cache` - `HUGGINGFACE_HUB_CACHE=/tmp/huggingface_cache` 2. Use the Docker SDK in your Space settings 3. If you encounter memory issues, consider using a smaller model by changing the `model_name` in `summariser.py` ## Performance Optimizations The API includes several performance optimizations: 1. **Model Caching**: Models are loaded once and cached for subsequent requests 2. **Result Caching**: Frequently requested summaries are cached to avoid redundant processing 3. **Asynchronous Processing**: Long-running tasks are processed asynchronously ## Development ### Testing the API You can test the API endpoints using the built-in Swagger documentation at `/docs` when running locally. ### Checking Transformers Installation To verify that the transformers library is installed correctly: ```bash python -m app.check_transformers ``` ## License This project is licensed under the MIT License.