Spaces: ashish-soni08/Image-Captioning

ashish-soni08 committed on May 25

Commit bec574b (verified) · 1 Parent(s): 00dc2e5

Update README.md

Files changed (1)

README.md +98 -17

@@ -7,31 +7,112 @@ sdk: gradio
 sdk_version: 5.31.0
 app_file: app.py
 pinned: false
-license: afl-3.0
 ---
-# Image Captioning App
-This application provides a simple interface to generate captions for images using a pre-trained model from Hugging Face's Transformers library.
-## Features
-- **Image Captioning**: Automatically generate descriptive captions for uploaded images.
-- **User-Friendly Interface**: Built using Gradio for an easy-to-use web interface.
-## Model
-- **Model Used**: [Salesforce/blip-image-captioning-base](https://huggingface.co/Salesforce/blip-image-captioning-base)
-- **Framework**: Hugging Face Transformers
-## Software Packages
-- **Gradio**: Used to create the web interface.
-- **Transformers**: Used for model inference.
-- **Spaces**: Utilized for GPU acceleration during model execution.
-## How to Use
-1. Upload an image using the "Upload image" button.
-2. The app will automatically generate and display a caption for the image.
-3. The generated caption will appear in the textbox labeled "Caption".

 sdk_version: 5.31.0
 app_file: app.py
 pinned: false
+license: apache-2.0
 ---
+# Image Captioning App 🖼️📝
+A web-based image captioning tool that generates descriptive captions for uploaded images using the BLIP vision-language model. Built with Gradio and deployed on Hugging Face Spaces.
+![Demo Screenshot](image-captioning-logo.png)
+## 🚀 Live Demo
+Try the app: [Image-Captioning](https://huggingface.co/spaces/ashish-soni08/Image-Captioning)
+## ✨ Features
+- **Automatic Caption Generation**: Upload any image and get descriptive captions instantly
+- **Visual Understanding**: AI model analyzes objects, scenes, and activities in images
+- **Clean Interface**: Intuitive web UI built with Gradio for seamless image uploads
+- **Responsive Design**: Works on desktop and mobile devices
+## 🛠️ Technology Stack
+- **Backend**: Python, Hugging Face Transformers
+- **Frontend**: Gradio
+- **Model**: [Salesforce/blip-image-captioning-base](https://huggingface.co/Salesforce/blip-image-captioning-base)
+- **Deployment**: Hugging Face Spaces
+## 🏃‍♂️ Quick Start
+### Prerequisites
+```bash
+Python 3.10+ (Gradio 5.x, the SDK version this Space declares, requires Python 3.10 or newer)
+pip
+```
+### Installation
+1. Clone the repository:
+```bash
+git clone https://github.com/Ashish-Soni08/image-captioning-app.git
+cd image-captioning-app
+```
+2. Install dependencies:
+```bash
+pip install -r requirements.txt
+```
+3. Run the application:
+```bash
+python app.py
+```
+4. Open your browser and navigate to `http://localhost:7860`
+## 📋 Usage
+1. **Upload Image**: Click the "Upload image" button and select an image from your device
+2. **Generate Caption**: The app automatically processes the image and generates a caption
+3. **View Results**: The descriptive caption appears in the output textbox
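The upload-then-caption flow described in the steps above could be wired up in Gradio roughly as follows. This is a minimal sketch, not the Space's actual `app.py`: the helper names `make_caption_fn` and `build_demo`, the `captioner` callable, the "Caption" textbox label, and the use of `live=True` are all assumptions made for illustration.

```python
from typing import Any, Callable


def make_caption_fn(captioner: Callable[[Any], str]) -> Callable[[Any], str]:
    """Wrap a captioner so a cleared or missing image yields an empty caption."""

    def caption_fn(image: Any) -> str:
        # Gradio passes None when the image input is empty or has been cleared.
        if image is None:
            return ""
        return captioner(image)

    return caption_fn


def build_demo(captioner: Callable[[Any], str]):
    """Build a Gradio interface with one image input and one caption textbox."""
    # Imported lazily so make_caption_fn can be used without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=make_caption_fn(captioner),
        inputs=gr.Image(type="pil", label="Upload image"),
        outputs=gr.Textbox(label="Caption"),  # label is an assumption
        title="Image Captioning App",
        live=True,  # re-run on every input change instead of waiting for Submit
    )
```

Calling `build_demo(my_captioner).launch()` with any function that maps a PIL image to a string would serve the interface locally, on port 7860 unless Gradio is configured otherwise.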
+### Example
+**Input Image:**
+```
+[A photo of a golden retriever playing in a park]
+```
+**Generated Caption:**
+```
+"A golden retriever dog playing with a ball in a grassy park on a sunny day"
+```
+## 🧠 Model Information
+This app uses **Salesforce/blip-image-captioning-base**, a vision-language model for image captioning:
+- **Architecture**: BLIP with ViT-Base backbone
+- **Model Size**: ~990MB (PyTorch model file)
+- **Training Data**: COCO dataset with bootstrapped captions from web data
+- **Capabilities**: Both conditional and unconditional image captioning
+- **Performance**: State-of-the-art image captioning results at the time of the BLIP paper's publication (+2.8% CIDEr over the previous best, as reported by the authors)
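The conditional and unconditional captioning modes listed above can be sketched with the `BlipProcessor` and `BlipForConditionalGeneration` classes from Transformers. This is a hedged sketch rather than the code this Space necessarily runs: the helper names `load_blip` and `generate_caption`, the `max_new_tokens` default, and the example prompt are assumptions chosen for illustration.

```python
from typing import Any, Optional

MODEL_ID = "Salesforce/blip-image-captioning-base"


def load_blip(model_id: str = MODEL_ID):
    """Load the BLIP processor and model (downloads weights on first use)."""
    # Imported lazily so generate_caption can be exercised without transformers.
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForConditionalGeneration.from_pretrained(model_id)
    return processor, model


def generate_caption(
    image: Any,
    processor: Any,
    model: Any,
    prompt: Optional[str] = None,
    max_new_tokens: int = 30,  # assumed cap on caption length
) -> str:
    """Return a caption for a PIL image.

    With prompt=None the caption is unconditional; with a prompt such as
    "a photography of" the model continues that text (conditional).
    """
    if prompt is None:
        inputs = processor(images=image, return_tensors="pt")
    else:
        inputs = processor(images=image, text=prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.decode(output_ids[0], skip_special_tokens=True).strip()
```

A typical call would be `processor, model = load_blip()` followed by `generate_caption(Image.open(path).convert("RGB"), processor, model)`, where `Image` comes from Pillow and `path` is a local image file.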
+## 📁 Project Structure
+```
+image-captioning-app/
+├── app.py                 # Main Gradio application
+├── requirements.txt       # Python dependencies
+├── README.md              # Project documentation
+└── example_images/        # Sample images for testing
+```
+## 📄 License
+This project is licensed under the Apache License 2.0.
+## 🙏 Acknowledgments
+- [Hugging Face](https://huggingface.co/) for the Transformers library and model hosting
+- [Gradio](https://gradio.app/) for the web interface framework
+- [Salesforce Research](https://github.com/salesforce/BLIP) for the BLIP model
+## 📞 Contact
+Ashish Soni - ashish.soni2091@gmail.com
+Project Link: [github](https://github.com/Ashish-Soni08/image-captioning-app)