Phi-Ground Tech Report: Advancing Perception in GUI Grounding Paper • 2507.23779 • Published 20 days ago • 41
ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL Paper • 2505.24875 • Published May 30 • 10
ReasonGen-R1 Collection Model and Datasets for the paper "ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL" • 7 items • Updated Jun 2 • 6
LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation Paper • 2411.04997 • Published Nov 7, 2024 • 40
LLM2CLIP Collection LLM2CLIP makes a SOTA pretrained CLIP model more SOTA than ever. • 11 items • Updated May 1 • 60
ImageBrush: Learning Visual In-Context Instructions for Exemplar-Based Image Manipulation Paper • 2308.00906 • Published Aug 2, 2023 • 13