GameQA-LLaVA-OV-7B

This model (GameQA-LLaVA-OV-7B) was obtained by fine-tuning llava-onevision-qwen2-7b-ov-hf with GRPO solely on our GameQA dataset.
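The snippet below is a minimal inference sketch using the standard Hugging Face transformers LLaVA-OneVision classes; the image path, question, and generation settings are placeholders, not part of this card.

```python
# Minimal inference sketch with the standard transformers LLaVA-OneVision API.
# The image path and question are placeholders.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

model_id = "Code2Logic/GameQA-llava-onevision-qwen2-7b-ov-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaOnevisionForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a chat-style prompt containing one image and one question.
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Analyze the game state in the image and reason step by step about the best next move."},
        ],
    },
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)

image = Image.open("game_screenshot.png")  # placeholder image
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device, torch.bfloat16)

output = model.generate(**inputs, max_new_tokens=512)
print(processor.decode(output[0], skip_special_tokens=True))
```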

Evaluation Results on General Vision Benchmarks

Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning

This is the first work, to the best of our knowledge, that leverages game code to synthesize multimodal reasoning data for training VLMs. Furthermore, when trained with a GRPO strategy solely on GameQA (synthesized via our proposed Code2Logic approach), multiple cutting-edge open-source models exhibit significantly enhanced out-of-domain generalization.

[📖 Paper] [🤗 GameQA-140K Dataset] [🤗 GameQA-InternVL3-8B] [🤗 GameQA-Qwen2.5-VL-7B] [🤗 GameQA-LLaVA-OV-7B]
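The paper's training code and reward design are not reproduced in this card. Purely for orientation, the sketch below shows what a minimal GRPO loop looks like with TRL's GRPOTrainer; it is a text-only illustration with a small stand-in model, a hypothetical dataset file, and an assumed exact-match reward, not the authors' multimodal GameQA training setup.

```python
# Illustrative GRPO loop with TRL's GRPOTrainer -- NOT the authors' training code.
# Text-only sketch: the real GameQA training is multimodal (image + question) on a 7B VLM,
# and its exact dataset fields and reward design are not given in this card.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Hypothetical local file with "prompt" and "answer" columns standing in for GameQA items.
dataset = load_dataset("json", data_files="gameqa_text_subset.json", split="train")

def correctness_reward(completions, answer, **kwargs):
    """Assumed reward: 1.0 if the reference answer string appears in the completion, else 0.0."""
    return [1.0 if a.strip().lower() in c.lower() else 0.0
            for c, a in zip(completions, answer)]

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # small stand-in model for the sketch
    reward_funcs=correctness_reward,
    args=GRPOConfig(output_dir="gameqa-grpo-sketch", num_generations=8),
    train_dataset=dataset,
)
trainer.train()
```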

News

  • The three models trained with GRPO on GameQA are now open-sourced on Hugging Face.
Model size: 8.03B params (Safetensors) · Tensor type: BF16

Model tree for Code2Logic/GameQA-llava-onevision-qwen2-7b-ov-hf

Finetuned from: llava-onevision-qwen2-7b-ov-hf

Dataset used to train Code2Logic/GameQA-llava-onevision-qwen2-7b-ov-hf: GameQA