---
title: menu text detection
emoji: π¦
colorFrom: indigo
colorTo: pink
sdk: gradio
python_version: 3.11
short_description: Extract structured menu information from images into JSON...
tags: [ "donut","fine-tuning","image-to-text","transformer" ]
---
# Menu Text Detection System
Extract structured menu information from images into JSON using a fine-tuned Donut E2E model.
> Based on [Donut by Clova AI (ECCV '22)](https://github.com/clovaai/donut)
<div align="center">
<img src="./assets/demo.gif" alt="demo" width="500"/><br>
[Hugging Face Space](https://huggingface.co/spaces/ryanlinjui/menu-text-detection)<br>
[Hugging Face Collection](https://huggingface.co/collections/ryanlinjui/menu-text-detection-670ccf527626bb004bbfb39b)
</div>
## Features
### Overview
The system currently extracts the following information from menu images:
- **Restaurant Name**
- **Business Hours**
- **Address**
- **Phone Number**
- **Dish Information**
  - Name
  - Price

> For the JSON schema, see the [tools directory](./tools).
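
As a rough illustration of the fields listed above, the sketch below models one extraction result with Python dataclasses and serializes it to JSON. The class names, field names (`MenuInfo`, `Dish`, `restaurant_name`, and so on), and sample values are assumptions made for this example; the actual schema lives in the tools directory and may differ.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Dish:
    # Hypothetical field names; check the schema in ./tools for the real ones.
    name: str
    price: float


@dataclass
class MenuInfo:
    # Hypothetical container mirroring the fields listed in the overview.
    restaurant_name: str = ""
    business_hours: str = ""
    address: str = ""
    phone_number: str = ""
    dishes: list[Dish] = field(default_factory=list)

    def to_json(self) -> str:
        # ensure_ascii=False keeps non-ASCII menu text readable in the output.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


# Made-up sample values, for illustration only.
menu = MenuInfo(
    restaurant_name="Example Noodle House",
    phone_number="02-1234-5678",
    dishes=[Dish(name="Beef Noodles", price=180.0)],
)
print(menu.to_json())
```

Fields that cannot be read from the image stay as empty strings here; whether the real schema uses empty strings, `null`, or omits such keys is not something this sketch can confirm.
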
### Supported Methods to Extract Menu Information
- Fine-tuned Donut model
- OpenAI GPT API
- Google Gemini API
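
For the first method, one common way to run a fine-tuned Donut checkpoint with Hugging Face Transformers is sketched below. It is not necessarily how this repository performs inference: `extract_menu`, `clean_donut_sequence`, and the `model_id` and `task_prompt` arguments are hypothetical names, and the task start token depends on how the model was fine-tuned.

```python
import re


def clean_donut_sequence(sequence: str, eos_token: str = "</s>", pad_token: str = "<pad>") -> str:
    """Strip special tokens and the leading task start token from a decoded sequence."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    # Remove only the first <...> tag, which is the task start token.
    return re.sub(r"<.*?>", "", sequence, count=1).strip()


def extract_menu(image_path: str, model_id: str, task_prompt: str) -> dict:
    """Run a Donut checkpoint on one image and return the parsed output as a dict.

    model_id and task_prompt are placeholders: pass the checkpoint name and the
    task start token that the fine-tuned model was actually trained with.
    """
    # Imported lazily so the helper above works without these packages installed.
    import torch
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(model_id)
    model = VisionEncoderDecoderModel.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    decoder_input_ids = processor.tokenizer(
        task_prompt, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    with torch.no_grad():
        outputs = model.generate(
            pixel_values,
            decoder_input_ids=decoder_input_ids,
            max_length=model.decoder.config.max_position_embeddings,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
            return_dict_in_generate=True,
        )

    sequence = processor.batch_decode(outputs.sequences)[0]
    sequence = clean_donut_sequence(
        sequence, processor.tokenizer.eos_token, processor.tokenizer.pad_token
    )
    # token2json converts Donut's tag-style output into a nested dict.
    return processor.token2json(sequence)
```

A call would look like `extract_menu("menu.jpg", model_id=..., task_prompt=...)`, with both placeholders filled in from the published checkpoint's model card.
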
## Training / Fine-Tuning
### Setup
Use [uv](https://github.com/astral-sh/uv) to set up the development environment:
```bash
uv sync
```
### Training Script (Dataset Collection, Fine-Tuning)
Please refer to [`train.ipynb`](./train.ipynb). Use Jupyter Notebook for training:
```bash
uv run jupyter-notebook
```
> For VS Code users: install the Jupyter extension, then select `.venv/bin/python` as your kernel.
### Run Demo Locally
```bash
uv run python app.py
```