This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it

228 files in total

py: 155

sh: 28

md: 12

161 views · uploaded 2024-04-10 17:42:24 · 540 KB ZIP


Archive contents: this project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it (228 files)

dockerfile.base 891B

curope.cpp 2KB

kernels.cu 4KB

.gitignore 157B

hostfile 68B

release.json 1KB

zero3_offload.json 955B

zero3.json 798B

zero2_offload.json 648B

zero2.json 553B

captions.json 173B

LICENSE 1KB

README.md 23KB

Report-v1.0.0.md 13KB

Report-v1.0.0-cn.md 11KB

Train_And_Eval_CausalVideoVAE.md 7KB

EVAL.md 4KB

Contribution_Guidelines.md 3KB

README.md 2KB

VQVAE.md 2KB

Data.md 1KB

README.md 1KB

readme.md 1KB

README.md 973B

placeholder 0B

run_docker.png 86KB

build_docker.png 63KB

modules.py 73KB

modeling_latte.py 61KB

train_t2v_t5_feature.py 38KB

train_t2v.py 37KB

train_t2v_feature.py 36KB

gaussian_diffusion_t2v.py 36KB

pipeline_videogen.py 35KB

train.py 35KB

gaussian_diffusion.py 34KB

modeling_causalvqvae.py 30KB

model.py 30KB

rgt_arch.py 28KB

modeling_vqvae.py 27KB

modeling_causalvae.py 22KB

quantize.py 18KB

losses.py 18KB

vqgan.py 17KB

utils.py 16KB

perceptual_loss.py 16KB

pwcnet.py 15KB

transform.py 15KB

base_model.py 15KB

transport.py 14KB

correlation.py 14KB

matlab_functions.py 14KB

pytorch_i3d.py 13KB

flolpips.py 13KB

arch_util.py 11KB

feat_enc.py 11KB

data_util.py 10KB

feature_datasets.py 9KB

rec_video_vae.py 9KB

sr_model.py 9KB

eval_common_metric.py 9KB

clip.py 8KB

t5.py 8KB

utils.py 8KB

eval_clip_score.py 8KB

respace.py 8KB

raft.py 8KB

AMT-G.py 8KB

transport_sample.py 8KB

sample_t2v.py 8KB

attention.py 8KB

updownsample.py 7KB

gradio_utils.py 7KB

logger.py 7KB

path.py 7KB

options.py 7KB

pretrained_networks.py 7KB

interpolation.py 6KB

transforms.py 6KB

img_util.py 6KB

vgg_arch.py 6KB

file_client.py 6KB

timestep_sampler.py 6KB

pos.py 6KB

rgt_model.py 6KB

rec_imvi_vae.py 5KB

caption_refiner.py 5KB

pos_embed.py 5KB

fvd.py 5KB

sample.py 5KB

run.py 5KB

paired_image_dataset.py 5KB

gradio_web_server.py 5KB

sky_datasets.py 5KB

train_causalvae.py 5KB

lpips.py 5KB

discriminator.py 5KB

misc.py 5KB

t2v_datasets.py 5KB

228 entries in total

# Open-Sora Plan

[![slack badge](https://img.shields.io/badge/Discord-join-blueviolet?logo=discord&amp)](https://discord.gg/vqGmpjkSaz)
[![WeChat badge](https://img.shields.io/badge/微信-加入-green?logo=wechat&amp)](https://github.com/PKU-YuanGroup/Open-Sora-Plan/issues/53#issuecomment-1987226516)
[![Twitter](https://img.shields.io/badge/-Twitter@LinBin46984-black?logo=twitter&logoColor=1D9BF0)](https://x.com/LinBin46984/status/1763476690385424554?s=20)
<br>
[![hf_space](https://img.shields.io/badge/🤗-Open%20In%20Spaces-blue.svg)](https://huggingface.co/spaces/LanguageBind/Open-Sora-Plan-v1.0.0)
[![hf_space](https://img.shields.io/badge/🤗-Open%20In%20Spaces-blue.svg)](https://huggingface.co/spaces/fffiloni/Open-Sora-Plan-v1-0-0)
[![Replicate demo and cloud API](https://replicate.com/camenduru/open-sora-plan-512x512/badge)](https://replicate.com/camenduru/open-sora-plan-512x512)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/camenduru/Open-Sora-Plan-jupyter/blob/main/Open_Sora_Plan_jupyter.ipynb)
<br>
[![License](https://img.shields.io/badge/License-MIT-yellow)](https://github.com/PKU-YuanGroup/Open-Sora-Plan/blob/main/LICENSE)
[![GitHub repo contributors](https://img.shields.io/github/contributors-anon/PKU-YuanGroup/Open-Sora-Plan?style=flat&label=Contributors)](https://github.com/PKU-YuanGroup/Open-Sora-Plan/graphs/contributors)
[![GitHub Commit](https://img.shields.io/github/commit-activity/m/PKU-YuanGroup/Open-Sora-Plan?label=Commit)](https://github.com/PKU-YuanGroup/Open-Sora-Plan/commits/main/)
[![Pr](https://img.shields.io/github/issues-pr-closed-raw/PKU-YuanGroup/Open-Sora-Plan.svg?label=Merged+PRs&color=green)](https://github.com/PKU-YuanGroup/Open-Sora-Plan/pulls)
[![GitHub issues](https://img.shields.io/github/issues/PKU-YuanGroup/Open-Sora-Plan?color=critical&label=Issues)](https://github.com/PKU-YuanGroup/Video-LLaVA/issues?q=is%3Aopen+is%3Aissue)
[![GitHub closed issues](https://img.shields.io/github/issues-closed/PKU-YuanGroup/Open-Sora-Plan?color=success&label=Issues)](https://github.com/PKU-YuanGroup/Video-LLaVA/issues?q=is%3Aissue+is%3Aclosed)
<br>
[![GitHub repo stars](https://img.shields.io/github/stars/PKU-YuanGroup/Open-Sora-Plan?style=flat&logo=github&logoColor=whitesmoke&label=Stars)](https://github.com/PKU-YuanGroup/Open-Sora-Plan/stargazers)
[![GitHub repo forks](https://img.shields.io/github/forks/PKU-YuanGroup/Open-Sora-Plan?style=flat&logo=github&logoColor=whitesmoke&label=Forks)](https://github.com/PKU-YuanGroup/Open-Sora-Plan/network)
[![GitHub repo watchers](https://img.shields.io/github/watchers/PKU-YuanGroup/Open-Sora-Plan?style=flat&logo=github&logoColor=whitesmoke&label=Watchers)](https://github.com/PKU-YuanGroup/Open-Sora-Plan/watchers)
[![GitHub repo size](https://img.shields.io/github/repo-size/PKU-YuanGroup/Open-Sora-Plan?style=flat&logo=github&logoColor=whitesmoke&label=Repo%20Size)](https://github.com/PKU-YuanGroup/Open-Sora-Plan/archive/refs/heads/main.zip)

We are thrilled to present **Open-Sora-Plan v1.0.0**, which significantly enhances video generation quality and text control capabilities. See our [report](docs/Report-v1.0.0.md). We are training for higher resolution (>1024) and longer duration (>10s) videos; here is a preview of the next release. The GIFs shown on GitHub are compressed, which loses some quality.

Thanks to **HUAWEI Ascend NPU Team** for supporting us.

Inference is now supported on domestic AI chips (Huawei Ascend 910; we look forward to more domestic compute chips), and the next step is to support training on domestic compute. For details, see the Ascend branch: [hw branch](https://github.com/PKU-YuanGroup/Open-Sora-Plan/tree/hw).

| 257×512×512 (10s) | 65×1024×1024 (2.7s) | 65×1024×1024 (2.7s) |
| --- | --- | --- |
| <img src="https://github.com/PKU-YuanGroup/Open-Sora-Plan/assets/88202804/37c29fcb-47ba-4c6e-9ce8-612f0eab6634" width=224> | <img src="https://github.com/PKU-YuanGroup/Open-Sora-Plan/assets/88202804/6362c3ad-b1c4-4c36-8737-ad8a1e1dbed4" width=448> | <img src="https://github.com/PKU-YuanGroup/Open-Sora-Plan/assets/88202804/d90dd228-611b-44b7-93f4-fa99e224bd11" width=448> |
| Time-lapse of a coastal landscape transitioning from sunrise to nightfall... | A quiet beach at dawn, the waves gently lapping at the shore and the sky painted in pastel hues.... | Sunset over the sea. |

| 65×512×512 (2.7s) | 65×512×512 (2.7s) | 65×512×512 (2.7s) |
| --- | --- | --- |
| <img src="https://github.com/PKU-YuanGroup/Open-Sora-Plan/assets/88202804/deca421b-dbc5-4d16-a80b-89c1d8b4fce7" width=224> | <img src="https://github.com/PKU-YuanGroup/Open-Sora-Plan/assets/88202804/7cddd996-7c17-4d8e-a47d-e57c0930a91d" width=224> | <img src="https://github.com/PKU-YuanGroup/Open-Sora-Plan/assets/88202804/029ed424-e977-470b-a39d-ebc2d3e61c1c" width=224> |
| A serene underwater scene featuring a sea turtle swimming... | Yellow and black tropical fish dart through the sea. | a dynamic interaction between the ocean and a large rock... |
| <img src="https://github.com/PKU-YuanGroup/Open-Sora-Plan/assets/88202804/900e7293-9c7c-4844-b7e7-c0b0b9f7e055" width=224> | <img src="https://github.com/PKU-YuanGroup/Open-Sora-Plan/assets/88202804/a710d498-5f43-4553-be12-e80f9d5b442e" width=224> | <img src="https://github.com/PKU-YuanGroup/Open-Sora-Plan/assets/88202804/1d350503-98f6-4e88-8802-2dd915357726" width=224> |
| The dynamic movement of tall, wispy grasses swaying in the wind... | Slow pan upward of blazing oak fire in an indoor fireplace. | A serene waterfall cascading down moss-covered rocks... |

## 💪 Goal

This project aims to create a simple and scalable repo to reproduce [Sora](https://openai.com/sora) (OpenAI, but we prefer to call it "ClosedAI"). We hope the open-source community will contribute to this project. Pull requests are welcome!!!

This project hopes to reproduce Sora through the power of the open-source community. It was jointly initiated by the Peking University-Rabbitpre (北大-兔展) AIGC Joint Lab. The current version is still far from the goal and needs continued improvement and rapid iteration. Pull requests are welcome!!!

Project stages:
- Primary
1. Set up the codebase and train an unconditional model on a landscape dataset.
2. Train models that boost resolution and duration.

- Extensions
3. Conduct text2video experiments on a landscape dataset.
4. Train the 1080p model on a video2text dataset.
5. Control the model with more conditions.

<div style="display: flex; justify-content: center;">
<img src="https://github.com/PKU-YuanGroup/Open-Sora-Plan/assets/88202804/6b3095e9-88e8-4481-9b1b-ff9aaa25caf1" width=200>
<img src="https://github.com/PKU-YuanGroup/Open-Sora-Plan/assets/88202804/f0a2ebca-6d25-4f94-be29-bd0a29cd9230" width=600>
</div>

## 📰 News

**[2024.04.09]** 🚀 Excited to share our latest exploration on metamorphic time-lapse video generation: [MagicTime](https://github.com/PKU-YuanGroup/MagicTime), which learns real-world physics knowledge from time-lapse videos. Here is the training dataset (still being updated): [Open-Sora-Dataset](https://github.com/PKU-YuanGroup/Open-Sora-Dataset).

**[2024.04.07]** 🔥🔥🔥 Today, we are thrilled to present Open-Sora-Plan v1.0.0, which significantly enhances video generation quality and text control capabilities. See our [report](docs/Report-v1.0.0.md). Thanks to HUAWEI NPU for supporting us.

**[2024.03.27]** 🚀🚀🚀 We release the report of [VideoCausalVAE](docs/CausalVideoVAE.md), which supports both images and videos. We present our reconstructed video in this demonstration as follows. The text-to-video model is on the way.

**[2024.03.10]** 🚀🚀🚀 This repo supports training a latent size of 225×90×90 (t×h×w), which means we are able to **train 1 minute of 1080P video with 30FPS** (2× interpolated frames and 2× super resolution) under class-condition.

**[2024.03.08]** We
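
As a rough sanity check on the 2024.03.10 entry, the sketch below works through its frame-count arithmetic. The duration, frame rate, interpolation factor, and latent size all come from that entry. The helper name `generated_frames` is hypothetical, and the resulting temporal compression factor is inferred from those numbers rather than quoted from the repository. The spatial side is left out because the entry does not state a spatial compression ratio.

```python
# Frame-count arithmetic for the 2024.03.10 news entry.
# Inputs are taken from the entry; the temporal factor is an inference,
# not a value documented in this excerpt.

def generated_frames(duration_s: float, fps: int, interpolation: int) -> int:
    """Frames the model must produce before frame interpolation is applied.

    `generated_frames` is a hypothetical helper written for this example.
    """
    total_frames = round(duration_s * fps)   # frames in the final video
    return total_frames // interpolation     # frames left once interpolation is undone


latent_t, latent_h, latent_w = 225, 90, 90   # latent size (t×h×w) from the entry

frames = generated_frames(duration_s=60, fps=30, interpolation=2)
temporal_factor = frames / latent_t          # video frames per latent time step

print(frames)           # 900
print(temporal_factor)  # 4.0
```

Under these assumptions, one minute at 30 FPS is 1800 frames, 2× interpolation leaves 900 frames to generate, and 900 / 225 gives 4 video frames per latent time step.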
