A Survey on Vision-Language-Action Models for Autonomous Driving — arXiv:2506.24044, published Jun 30
Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data — arXiv:2507.07095, published Jul 9
AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models — arXiv:2506.19851, published Jun 24
DualTHOR: A Dual-Arm Humanoid Simulation Platform for Contingency-Aware Planning — arXiv:2506.16012, published Jun 19
GMT: General Motion Tracking for Humanoid Whole-Body Control — arXiv:2506.14770, published Jun 17
ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies — arXiv:2506.14315, published Jun 17
EmoNet-Voice: A Fine-Grained, Expert-Verified Benchmark for Speech Emotion Detection — arXiv:2506.09827, published Jun 11
Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition — arXiv:2506.17201, published Jun 20
DreamCube: 3D Panorama Generation via Multi-plane Synchronization — arXiv:2506.17206, published Jun 20
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics — arXiv:2506.04308, published Jun 4
MotionSight: Boosting Fine-Grained Motion Understanding in Multimodal LLMs — arXiv:2506.01674, published Jun 2