ShareGPT4Video dataset

For the text-to-video task, we sample 100 video captions from the ShareGPT4Video dataset and feed them to the diffusion model as prompts for video generation.

Filtering the dataset

Download the dataset with captions and video paths.

wget https://huggingface.co/datasets/ShareGPT4Video/ShareGPT4Video/resolve/main/sharegpt4video_40k.jsonl

Sample 100 video-caption pairs. The sampled subset is saved to sharegpt4video_100.json.

python sample.py
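
The contents of sample.py are not shown here, so the following is only a minimal sketch of what the sampling step might look like, not the actual script. It assumes the downloaded file holds one JSON object per line and treats each record as opaque, without relying on any particular field names. The function name sample_captions, the fixed seed, and the use of sampling without replacement are illustrative assumptions.

```python
import json
import random
from pathlib import Path


def sample_captions(src, dst, k=100, seed=0):
    """Draw k records uniformly at random, without replacement, from a JSONL file.

    Hypothetical helper: the name, default seed, and output layout are assumptions.
    """
    # Read one JSON object per non-empty line.
    with open(src, encoding="utf-8") as f:
        records = [json.loads(line) for line in f if line.strip()]

    if k > len(records):
        raise ValueError(f"requested {k} samples but only {len(records)} records found")

    # A seeded generator keeps the subset reproducible across runs.
    rng = random.Random(seed)
    subset = rng.sample(records, k)

    # Write the subset as a single JSON array.
    with open(dst, "w", encoding="utf-8") as f:
        json.dump(subset, f, ensure_ascii=False, indent=2)
    return subset


if __name__ == "__main__":
    src = Path("sharegpt4video_40k.jsonl")
    if src.exists():
        picked = sample_captions(src, "sharegpt4video_100.json")
        print(f"wrote {len(picked)} records")
```

Fixing the seed means the same 100 captions are selected on every run, which keeps generated videos comparable between experiments.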