Maozhou Ge

Gmc2

GHGmc2

AI & ML interests

None yet

Organizations

None yet

Recent Activity

liked a model about 7 hours ago

deepseek-ai/DeepSeek-V3.1-Base

685B • Updated about 19 hours ago • 556

upvoted an article 10 days ago

From GRPO to DAPO and GSPO: What, Why, and How

Article • 11 days ago • 11

upvoted a paper 13 days ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published 27 days ago • 289

upvoted a collection 13 days ago

Qwen3

Collection • 84 items • Updated 14 days ago • 1.11k

liked a model 14 days ago

openai/gpt-oss-120b

Text Generation • 120B • Updated 6 days ago • 853k • 3.49k

liked a model 23 days ago

physical-intelligence/fast

Robotics • Updated Jan 16 • 131

New activity in internlm/POLAR-7B 27 days ago

Any plan to open source the dataset?

#8 opened 27 days ago by Gmc2

upvoted a paper about 1 month ago

Pre-Trained Policy Discriminators are General Reward Models

Paper • 2507.05197 • Published Jul 7 • 39

liked a Space about 1 month ago

Pipeline Parallelism Schedule Visualizer

📊

Visualize pipeline parallelism schedules

upvoted an article about 1 month ago

Mixture of Depth is Vibe

Article • Apr 22, 2024 • 48

upvoted an article 2 months ago

Efficient LLM Pretraining: Packed Sequences and Masked Attention

Article • Oct 7, 2024 • 47

liked a model 3 months ago

deepseek-ai/DeepSeek-R1-0528

Text Generation • 685B • Updated May 29 • 432k • 2.38k

upvoted a paper 3 months ago

Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures

Paper • 2505.09343 • Published May 14 • 68

upvoted an article 4 months ago

Vision Language Models Explained

Article • and 1 other • Apr 11, 2024 • 437

liked a dataset 4 months ago

hiyouga/geometry3k

Viewer • Updated Apr 14 • 3k • 17.2k • 39

liked a dataset 5 months ago

Dahoas/full-hh-rlhf

Viewer • Updated Feb 23, 2023 • 125k • 634 • 83

liked 2 models 5 months ago

Qwen/Qwen2.5-VL-32B-Instruct

Image-Text-to-Text • 33B • Updated Apr 14 • 444k • 425

deepseek-ai/DeepSeek-V3-0324

Text Generation • 685B • Updated Mar 27 • 404k • 3.04k

upvoted a collection 5 months ago

🌾Oat-Zero: Understanding R1-Zero-Like Training

Collection • 5 items • Updated Apr 10 • 7

upvoted a paper 5 months ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 200
