Base model:
- stabilityai/stable-video-diffusion-img2vid-xt-1-1
Task type: image-to-video
Datasets:
- TaiMingLu/Genex-DB-World-Exploration
License: cc-by-4.0
World Exploration Generator 🚀🌍
The World Exploration Generator is a video generation pipeline built on Stable Video Diffusion (SVD). It takes a keyframe and generates a temporally coherent video. This explorer version augments SVD with a custom spatio-temporal conditional UNet (UNetSpatioTemporalConditionModel).
The diffuser generates a forward path along a panoramic input image, exploring the given scene.

📦 Usage
from diffusers import UNetSpatioTemporalConditionModel, StableVideoDiffusionPipeline
import torch
from PIL import Image

model_id = 'genex-world/GenEx-World-Explorer'

# Load the custom spatio-temporal UNet that replaces the stock SVD UNet
unet = UNetSpatioTemporalConditionModel.from_pretrained(
    model_id,
    subfolder='unet',
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)

# Build the SVD pipeline around the custom UNet
pipe = StableVideoDiffusionPipeline.from_pretrained(
    model_id,
    unet=unet,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    local_files_only=True,
).to('cuda')

# Panoramic keyframe, resized to the pipeline's expected 1024x576 resolution
image = Image.open('example.png').resize((1024, 576), Image.BICUBIC).convert('RGB')

generator = torch.manual_seed(-1)

with torch.inference_mode():
    frames = pipe(
        image,
        num_frames=25,
        width=1024,
        height=576,
        decode_chunk_size=8,
        generator=generator,
        motion_bucket_id=127,
        fps=7,
        num_inference_steps=30,
        noise_aug_strength=0.02,
    ).frames[0]
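
The pipeline returns a list of PIL frames (frames above). A minimal follow-up sketch for saving them to disk, assuming diffusers' export_to_video helper and a placeholder output path generated.mp4:

from diffusers.utils import export_to_video

# Write the 25 generated frames to an MP4 clip, matching the fps used above.
# 'generated.mp4' is a placeholder output path.
export_to_video(frames, 'generated.mp4', fps=7)

# To extend the exploration path (an assumption, not documented on this card),
# the last generated frame can be fed back in as the next keyframe:
# next_frames = pipe(frames[-1], num_frames=25, width=1024, height=576,
#                    decode_chunk_size=8, generator=generator).frames[0]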
🔧 Requirements
diffusers>=0.33.1
transformers
numpy
pillow
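These can be installed with pip, for example: pip install 'diffusers>=0.33.1' transformers numpy pillow. Note that the usage snippet above also imports torch (with CUDA support for the .to('cuda') call), which is not listed here.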
✨ Citation
@misc{lu2025genexgeneratingexplorableworld,
      title={GenEx: Generating an Explorable World},
      author={Taiming Lu and Tianmin Shu and Junfei Xiao and Luoxin Ye and Jiahao Wang and Cheng Peng and Chen Wei and Daniel Khashabi and Rama Chellappa and Alan Yuille and Jieneng Chen},
      year={2025},
      eprint={2412.09624},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.09624},
}