Daily Papers

by AK and the research community

Jun 5

Submitted by

tobiaslee

MiMo-VL Technical Report

·
74 authors

1

Submitted by

JC-Chen

Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning

·
10 authors

4

Submitted by

AlexeyKov

AmbiK: Dataset of Ambiguous Tasks in Kitchen Environment

·
5 authors

1

Submitted by

ahmedheakl

CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark

·
6 authors

5

Submitted by

thomasyyj

A Controllable Examination for Long-Context Language Models

·
7 authors

1

Submitted by

tsq2000

MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos

·
9 authors

1

Submitted by

bys0318

SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models

·
5 authors

1

Submitted by

tsq2000

Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis

·
6 authors

1

Submitted by

EtashGuha

OpenThoughts: Data Recipes for Reasoning Models

·
50 authors

Submitted by

tyhuang

Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation

·
11 authors

1

Submitted by

yuanshengni

VisCoder: Fine-Tuning LLMs for Executable Python Visualization Code Generation

·
5 authors

2

Submitted by

Yuanze

IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation

·
5 authors

Submitted by

adamdad

Image Editing As Programs with Diffusion Models

·
5 authors

1

Submitted by

ubowang

Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem

·
5 authors

2

Submitted by

myhong

Ψ-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models

·
4 authors

1

Submitted by

xichenhku

LayerFlow: A Unified Model for Layer-aware Video Generation

·
6 authors

2

Submitted by

tricktreat

SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation

·
13 authors

Submitted by

Dazitu616

DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models

·
8 authors

1

Submitted by

Guiyang1001

TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence

·
11 authors

1

Submitted by

sunyt32

Rectified Sparse Attention

·
9 authors

1

Submitted by

JaxChen

Beyond the Surface: Measuring Self-Preference in LLM Judgments

·
5 authors

1

Submitted by

dongminpark

Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games

·
16 authors

Submitted by

weiminwang

TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models

·
2 authors

Submitted by

chs20

Robustness in Both Domains: CLIP Needs a Robust Text Encoder

·
6 authors

1

Submitted by

KaituoFeng

Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback

·
7 authors

Submitted by

EunsuKim

BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation

·
6 authors

1

Submitted by

yiren98

DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers

·
6 authors

1

Submitted by

JacobYuan

Adapt before Continual Learning

·
5 authors

1

Submitted by

westbrook

CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech

·
13 authors

Submitted by

mpatel57

RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions

·
5 authors

2

Submitted by

Franck-Dernoncourt

Quantitative LLM Judges

·
12 authors

1

Submitted by

NPBP26

Improving Knowledge Distillation Under Unknown Covariate Shift Through Confidence-Guided Data Augmentation

·
4 authors

1

Submitted by

yulichen

DLP: Dynamic Layerwise Pruning in Large Language Models

·
6 authors

1

Submitted by

RanjanSapkota

TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems

·
4 authors

Submitted by

xiao-qi

HTSC-2025: A Benchmark Dataset of Ambient-Pressure High-Temperature Superconductors for AI-Driven Critical Temperature Prediction

·
6 authors

2

Submitted by

ChengsongHuang

POSS: Position Specialist Generates Better Draft for Speculative Decoding

·
5 authors

Submitted by

j-min

Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning

·
4 authors

Submitted by

Franck-Dernoncourt

Follow the Flow: Fine-grained Flowchart Attribution with Neurosymbolic Agents

·
7 authors

1

Submitted by

brucelyu

Unleashing Hour-Scale Video Training for Long Video-Language Understanding

·
11 authors

Submitted by

Mountchicken

Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning

·
5 authors

1

Submitted by

JacobYuan

Rethinking the Stability-Plasticity Trade-off in Continual Learning from an Architectural Perspective

·
4 authors

1

Submitted by

FabianKarl

CRAWLDoc: A Dataset for Robust Ranking of Bibliographic Documents

·
2 authors

1

Submitted by

ZHZisZZ

VLMs Can Aggregate Scattered Training Patches

·
4 authors

1

Submitted by

Zhuohan

FinChain: A Symbolic Benchmark for Verifiable Chain-of-Thought Financial Reasoning

·
17 authors

1

Submitted by

gyr66

Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models

·
5 authors

1

Submitted by

xiaobinzhuang

Sounding that Object: Interactive Object-Aware Image to Audio Generation

·
9 authors

1

Submitted by

jgonsior

Survey of Active Learning Hyperparameters: Insights from a Large-Scale Experimental Grid

·
6 authors

1

Submitted by

xjcvcvxj

Robust Neural Rendering in the Wild with Asymmetric Dual 3D Gaussian Splatting

·
5 authors

1

Submitted by

juliuse

Solving Inverse Problems with FLAIR

·
6 authors

1

Submitted by

pbelcak

Small Language Models are the Future of Agentic AI

·
8 authors

Submitted by

JY-Young

RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents

·
4 authors

1