RainingXY

xxyyy123

AI & ML interests

None yet

Recent Activity

new activity about 4 hours ago

AIDC-AI/Ovis2.5-9B:Fine-Tuning?

authored a paper 1 day ago

LPO: Towards Accurate GUI Agent Interaction via Location Preference Optimization

authored a paper 1 day ago

Ovis2.5 Technical Report

upvoted a paper 1 day ago

Ovis2.5 Technical Report

Paper • 2508.11737 • Published 5 days ago • 92

upvoted a collection 5 days ago

Ovis2.5

Our next-generation MLLMs for native-resolution vision and advanced reasoning • 5 items • Updated 1 day ago • 51

upvoted a collection 15 days ago

gpt-oss

Open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases. • 2 items • Updated 14 days ago • 308

upvoted a collection about 2 months ago

Ovis-U1

A unified model for multimodal understanding, text-to-image generation, and image editing. • 3 items • Updated Jul 2 • 4

upvoted a paper about 2 months ago

Ovis-U1 Technical Report

Paper • 2506.23044 • Published Jun 29 • 61

upvoted a paper 2 months ago

ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development

Paper • 2506.05010 • Published Jun 5 • 76

upvoted a paper 3 months ago

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities

Paper • 2505.02567 • Published May 5 • 79

upvoted 4 papers 5 months ago

UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning

Paper • 2503.21620 • Published Mar 27 • 63

Unlocking Efficient Long-to-Short LLM Reasoning with Model Merging

Paper • 2503.20641 • Published Mar 26 • 9

Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model

Paper • 2503.06141 • Published Mar 8 • 4

WritingBench: A Comprehensive Benchmark for Generative Writing

Paper • 2503.05244 • Published Mar 7 • 19

upvoted 3 papers 6 months ago

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published Feb 25 • 75

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published Feb 16 • 165

MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

Paper • 2502.10391 • Published Feb 14 • 35

upvoted a collection 6 months ago

Ovis2

Our latest advancement in multi-modal large language models (MLLMs) • 15 items • Updated Mar 25 • 65

upvoted 5 papers 8 months ago

IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization

Paper • 2411.06208 • Published Nov 9, 2024 • 21

SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning

Paper • 2411.10161 • Published Nov 15, 2024 • 9

VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models

Paper • 2412.01822 • Published Dec 2, 2024 • 15

VisionArena: 230K Real World User-VLM Conversations with Preference Labels

Paper • 2412.08687 • Published Dec 11, 2024 • 13

CompCap: Improving Multimodal Large Language Models with Composite Captions

Paper • 2412.05243 • Published Dec 6, 2024 • 19