Carlos

Carlosvirella100

AI & ML interests

Since I'm the owner and president of Hugging Face, I want to develop every project and use every logic within my portfolio.

Recent Activity

updated a collection about 13 hours ago

CAMV

upvoted a paper about 13 hours ago

VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models

updated a collection about 15 hours ago

CAMV

Carlosvirella100's activity

upvoted a paper about 13 hours ago

VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models

Paper • 2505.23656 • Published 8 days ago • 23

upvoted a paper about 15 hours ago

CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs

Paper • 2505.24120 • Published 8 days ago • 47

upvoted 3 papers about 20 hours ago

DiffDecompose: Layer-Wise Decomposition of Alpha-Composited Images via Diffusion Transformers

Paper • 2505.21541 • Published 13 days ago • 7

From Token to Action: State Machine Reasoning to Mitigate Overthinking in Information Retrieval

Paper • 2505.23059 • Published 9 days ago • 13

Sparse-vDiT: Unleashing the Power of Sparse Attention to Accelerate Video Diffusion Transformers

Paper • 2506.03065 • Published 3 days ago • 27

upvoted a paper about 23 hours ago

OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation

Paper • 2506.02397 • Published 4 days ago • 33

upvoted an article about 23 hours ago

KV Cache from scratch in nanoVLM

Article • and 4 others • 3 days ago • 54

upvoted 2 papers 1 day ago

Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models

Paper • 2506.01413 • Published 4 days ago • 14

DINGO: Constrained Inference for Diffusion LLMs

Paper • 2505.23061 • Published 9 days ago • 26

upvoted 4 papers 2 days ago

LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks

Paper • 2506.00411 • Published 7 days ago • 27

Visual Embodied Brain: Let Multimodal Large Language Models See, Think, and Control in Spaces

Paper • 2506.00123 • Published 7 days ago • 31

SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis

Paper • 2506.02096 • Published 4 days ago • 49

More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models

Paper • 2505.21523 • Published 15 days ago • 14

upvoted 7 papers 3 days ago

Knowledge Navigator: LLM-guided Browsing Framework for Exploratory Search in Scientific Literature

Paper • 2408.15836 • Published Aug 28, 2024 • 14

"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization

Paper • 2411.02355 • Published Nov 4, 2024 • 52