---
base_model:
- Wan-AI/Wan2.1-T2V-14B
datasets:
- BestWishYsh/OpenS2V-Eval
- BestWishYsh/OpenS2V-5M
language:
- en
license: apache-2.0
pipeline_tag: text-to-video
library_name: diffusers
---

# OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation

If you like our project, please give us a star ⭐ on GitHub for the latest updates.

## ✨ Summary
- **New S2V Benchmark.** We introduce OpenS2V-Eval for comprehensive evaluation of S2V models and propose three new automatic metrics aligned with human perception.
- **New Insights for S2V Model Selection.** Our evaluations with OpenS2V-Eval provide crucial insights into the strengths and weaknesses of various subject-to-video generation models.
- **Million-Scale S2V Dataset.** We create OpenS2V-5M, a dataset comprising 5.1M high-quality regular samples and 0.35M Nexus Data; the latter is designed to address the three core challenges of subject-to-video generation.
## 💡 Description
- Repository: Code, Page, Dataset, Benchmark
- Paper: https://huggingface.co/papers/2505.20292
- Point of Contact: Shenghai Yuan
## ✏️ Citation

If you find our paper and code useful in your research, please consider giving us a star ⭐ and citing our work.
```bibtex
@article{yuan2025opens2v,
  title={OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation},
  author={Yuan, Shenghai and He, Xianyi and Deng, Yufan and Ye, Yang and Huang, Jinfa and Lin, Bin and Luo, Jiebo and Yuan, Li},
  journal={arXiv preprint arXiv:2505.20292},
  year={2025}
}
```