metadata
title: Rex-Thinker Demo
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.1
app_file: demo/app.py
pinned: false
license: apache-2.0
Rex-Thinker Demo
This is a demo application for Rex-Thinker-GRPO, a visual reasoning model that pairs GroundingDINO object detection with referring expression comprehension.
Features
- Object Detection: Uses GroundingDINO to detect objects based on category names
- Referring Expression Comprehension: Identifies specific objects based on detailed descriptions
- Interactive Web Interface: Easy-to-use Gradio interface with real-time streaming
- Visual Reasoning: Shows the model's thinking process with detailed explanations
How to Use
- Upload an Image: Click on "Input Image" to upload your image
- Set Object Category: Enter the general category of objects you want to detect (e.g., "person", "car", "dog")
- Enter Referring Expression: Provide a detailed description of the specific object you want to identify (e.g., "person wearing red shirt and black hat")
- Adjust Visualization Settings: Modify draw width and font size for better visualization
- Run the Model: Click "Run with Streaming" to see the results
Examples
The demo includes several pre-loaded examples:
- Tomato detection
- Helmet identification
- Person in vehicle
- Text recognition on clothing
- Pet detection
Technical Details
- Base Model: Rex-Thinker-GRPO-7B
- Object Detection: GroundingDINO with SwinT backbone
- Framework: Gradio for the web interface
- Inference: Supports streaming text generation
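Streaming in a Gradio app is typically done with a generator function: each yielded value replaces the content of the output component, so yielding the text accumulated so far makes the answer appear to grow as it is generated. The sketch below shows only that pattern. The function name `stream_partial_text` and the sample chunks are hypothetical, and the real handler in `demo/app.py` may be organized differently.

```python
from typing import Iterable, Iterator


def stream_partial_text(chunks: Iterable[str]) -> Iterator[str]:
    """Yield the accumulated text after each new chunk arrives."""
    text = ""
    for chunk in chunks:
        text += chunk
        # Yield the full text so far, not just the new chunk, so a UI
        # that overwrites its output on every yield shows growing text.
        yield text


# Stand-in for a model's token stream (hypothetical data, not real model output).
fake_chunks = ["The person ", "on the left ", "wears a red shirt."]
snapshots = list(stream_partial_text(fake_chunks))
```

A generator like this can be passed as the `fn` of a Gradio interface in place of an ordinary function, with the chunks coming from the model's text streamer rather than a fixed list.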
Model Information
Rex-Thinker-GRPO is a multimodal reasoning model that:
- Uses GroundingDINO to propose candidate object locations
- Applies visual reasoning to identify specific objects based on referring expressions
- Provides detailed explanations of its reasoning process
- Outputs precise bounding box coordinates for detected objects
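The two-stage flow described above (propose candidate boxes, then keep the ones that match the referring expression) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the `Box` type, the `select_referred` helper, the hard-coded candidates, and the geometric predicate are hypothetical stand-ins. The actual model reasons over the image content rather than applying a hand-written rule.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixels: (x0, y0) top-left, (x1, y1) bottom-right."""
    x0: float
    y0: float
    x1: float
    y1: float


def select_referred(candidates: List[Box], matches: Callable[[Box], bool]) -> List[Box]:
    """Stage 2 stand-in: keep only the candidates that the predicate accepts."""
    return [box for box in candidates if matches(box)]


# Stage 1 stand-in: pretend a detector proposed these boxes for the category "person".
candidates = [Box(10, 20, 110, 220), Box(300, 40, 380, 200), Box(500, 60, 620, 260)]


# Toy referring expression: "the person in the left half of a 640-pixel-wide image".
def on_left_half(box: Box) -> bool:
    return (box.x0 + box.x1) / 2 < 320


selected = select_referred(candidates, on_left_half)
```

Here `selected` contains only the first candidate, whose horizontal center (60) falls in the left half. The point of the split is that the detector only has to find every instance of the category, while the second stage decides which instances the description refers to.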
For more information, visit the original repository.