metadata
license: apache-2.0
language:
- en
base_model:
- meta-llama/Llama-3.2-11B-Vision-Instruct
datasets:
- Xkev/LLaVA-CoT-100k
pipeline_tag: image-text-to-text
library_name: transformers
Sherlock: Self-Correcting Reasoning in Vision-Language Models
Introduction
Sherlock is a training framework focus on improving Vision-Language Models reasoning and self-correction capabilities.
GitHub repo: https://github.com/DripNowhy/Sherlock
Project Page: https://dripnowhy.github.io/Sherlock/