Shrey Pandit

SP2001

https://sites.google.com/view/shrey-pandit/home

Recent Activity

upvoted 2 papers 8 days ago

BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent

Paper • 2508.06600 • Published 12 days ago • 36

WideSearch: Benchmarking Agentic Broad Info-Seeking

Paper • 2508.07999 • Published 9 days ago • 102

upvoted a paper 25 days ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published 27 days ago • 289

upvoted a paper 27 days ago

MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning

Paper • 2507.16812 • Published 29 days ago • 61

upvoted a paper 29 days ago

WebShaper: Agentically Data Synthesizing via Information-Seeking Formalization

Paper • 2507.15061 • Published about 1 month ago • 50

upvoted a paper about 1 month ago

GTA1: GUI Test-time Scaling Agent

Paper • 2507.05791 • Published Jul 8 • 25

upvoted a paper about 2 months ago

WebSailor: Navigating Super-human Reasoning for Web Agent

Paper • 2507.02592 • Published Jul 3 • 110

upvoted an article 3 months ago

CodeAgents + Structure: A Better Way to Execute Actions

Article • and 1 other • May 28 • 71

upvoted 3 papers 3 months ago

Reasoning Model is Stubborn: Diagnosing Instruction Overriding in Reasoning Models

Paper • 2505.17225 • Published May 22 • 65

Teaching with Lies: Curriculum DPO on Synthetic Negatives for Hallucination Detection

Paper • 2505.17558 • Published May 23 • 15

Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published May 15 • 120

upvoted a paper 4 months ago

ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering

Paper • 2504.05506 • Published Apr 7 • 24

upvoted 3 papers 5 months ago

upvoted a collection 6 months ago

EgoLife

Collection

CVPR 2025 - EgoLife: Towards Egocentric Life Assistant. Homepage: https://egolife-ai.github.io/ • 10 items • Updated Mar 7 • 19

upvoted 2 papers 6 months ago

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published Mar 3 • 82

MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models

Paper • 2502.14302 • Published Feb 20 • 9

upvoted a paper 9 months ago

Critical Tokens Matter: Token-Level Contrastive Estimation Enhence LLM's Reasoning Capability

Paper • 2411.19943 • Published Nov 29, 2024 • 64

upvoted a paper 10 months ago

SFR-RAG: Towards Contextually Faithful LLMs

Paper • 2409.09916 • Published Sep 16, 2024 • 1
