# Stable Diffusion Version 2
![t2i](assets/stable-samples/txt2img/768/merged-0006.png)
![t2i](assets/stable-samples/txt2img/768/merged-0002.png)
![t2i](assets/stable-samples/txt2img/768/merged-0005.png)
This repository contains [Stable Diffusion](https://github.com/CompVis/stable-diffusion) models trained from scratch and will be continuously updated with
new checkpoints. The following list provides an overview of all currently available models. More coming soon.
## News
**March 24, 2023**
*Stable UnCLIP 2.1*
- New stable diffusion finetune (_Stable unCLIP 2.1_, [Hugging Face](https://huggingface.co/stabilityai/)) at 768x768 resolution, based on SD2.1-768. This model allows for image variations and mixing operations as described in [*Hierarchical Text-Conditional Image Generation with CLIP Latents*](https://arxiv.org/abs/2204.06125), and, thanks to its modularity, can be combined with other models such as [KARLO](https://github.com/kakaobrain/karlo). Comes in two variants: [*Stable unCLIP-L*](https://huggingface.co/stabilityai/stable-diffusion-2-1-unclip/blob/main/sd21-unclip-l.ckpt) and [*Stable unCLIP-H*](https://huggingface.co/stabilityai/stable-diffusion-2-1-unclip/blob/main/sd21-unclip-h.ckpt), which are conditioned on CLIP ViT-L and ViT-H image embeddings, respectively. Instructions are available [here](doc/UNCLIP.MD).
- A public demo of SD-unCLIP is already available at [clipdrop.co/stable-diffusion-reimagine](https://clipdrop.co/stable-diffusion-reimagine)
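For local experimentation with image variations, the unCLIP checkpoints can be explored via a streamlit demo. The invocation below is a sketch; the script path is an assumption based on the unCLIP instructions linked above (`doc/UNCLIP.MD` documents the exact usage):

```commandline
# Assumed script location for the unCLIP streamlit demo (see doc/UNCLIP.MD);
# verify the path in your checkout before running.
streamlit run scripts/streamlit/stableunclip.py
```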
**December 7, 2022**
*Version 2.1*
- New stable diffusion model (_Stable Diffusion 2.1-v_, [Hugging Face](https://huggingface.co/stabilityai/stable-diffusion-2-1)) at 768x768 resolution and (_Stable Diffusion 2.1-base_, [Hugging Face](https://huggingface.co/stabilityai/stable-diffusion-2-1-base)) at 512x512 resolution. Both use the same number of parameters and the same architecture as 2.0 and were fine-tuned from 2.0 on a less restrictive NSFW filtering of the [LAION-5B](https://laion.ai/blog/laion-5b/) dataset.
By default, the attention operation of the model is evaluated at full precision when `xformers` is not installed. To enable fp16 (which can cause numerical instabilities with the vanilla attention module on the v2.1 model), run your script with `ATTN_PRECISION=fp16 python <thescript.py>`.
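As a concrete sketch, a v2.1 sampling run with fp16 attention might look as follows; the script and flags are those of the reference sampling script described below, and the checkpoint path is a placeholder:

```commandline
# Sketch: run the v2.1-768 reference sampler with fp16 attention enabled.
ATTN_PRECISION=fp16 python scripts/txt2img.py \
  --prompt "a professional photograph of an astronaut riding a horse" \
  --ckpt <path/to/768model.ckpt> \
  --config configs/stable-diffusion/v2-inference-v.yaml \
  --H 768 --W 768
```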
**November 24, 2022**
*Version 2.0*
- New stable diffusion model (_Stable Diffusion 2.0-v_) at 768x768 resolution. Same number of parameters in the U-Net as 1.5, but uses [OpenCLIP-ViT/H](https://github.com/mlfoundations/open_clip) as the text encoder and is trained from scratch. _SD 2.0-v_ is a so-called [v-prediction](https://arxiv.org/abs/2202.00512) model.
- The above model is finetuned from _SD 2.0-base_, which was trained as a standard noise-prediction model on 512x512 images and is also made available.
- Added an [x4 upscaling latent text-guided diffusion model](#image-upscaling-with-stable-diffusion).
- New [depth-guided stable diffusion model](#depth-conditional-stable-diffusion), finetuned from _SD 2.0-base_. The model is conditioned on monocular depth estimates inferred via [MiDaS](https://github.com/isl-org/MiDaS) and can be used for structure-preserving img2img and shape-conditional synthesis.
![d2i](assets/stable-samples/depth2img/depth2img01.png)
- A [text-guided inpainting model](#image-inpainting-with-stable-diffusion), finetuned from SD _2.0-base_.
We follow the [original repository](https://github.com/CompVis/stable-diffusion) and provide basic inference scripts to sample from the models.
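For the specialized 2.0 checkpoints above (depth, upscaling, inpainting), the repository also includes Gradio demos. The invocations below are a sketch, assuming the script and config names shipped in `scripts/gradio` and `configs/stable-diffusion`; substitute your local checkpoint paths:

```commandline
# Assumed demo entry points for the depth, upscaling, and inpainting checkpoints.
python scripts/gradio/depth2img.py configs/stable-diffusion/v2-midas-inference.yaml <path-to-depth-ckpt>
python scripts/gradio/superresolution.py configs/stable-diffusion/x4-upscaling.yaml <path-to-upscaler-ckpt>
python scripts/gradio/inpainting.py configs/stable-diffusion/v2-inpainting-inference.yaml <path-to-inpainting-ckpt>
```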
________________
*The original Stable Diffusion model was created in a collaboration with [CompVis](https://github.com/CompVis) and [RunwayML](https://runwayml.com/) and builds upon the work:*
[**High-Resolution Image Synthesis with Latent Diffusion Models**](https://ommer-lab.com/research/latent-diffusion-models/)<br/>
[Robin Rombach](https://github.com/rromb)\*,
[Andreas Blattmann](https://github.com/ablattmann)\*,
[Dominik Lorenz](https://github.com/qp-qp),
[Patrick Esser](https://github.com/pesser),
[Björn Ommer](https://hci.iwr.uni-heidelberg.de/Staff/bommer)<br/>
_[CVPR '22 Oral](https://openaccess.thecvf.com/content/CVPR2022/html/Rombach_High-Resolution_Image_Synthesis_With_Latent_Diffusion_Models_CVPR_2022_paper.html) |
[GitHub](https://github.com/CompVis/latent-diffusion) | [arXiv](https://arxiv.org/abs/2112.10752) | [Project page](https://ommer-lab.com/research/latent-diffusion-models/)_
and [many others](#shout-outs).
Stable Diffusion is a latent text-to-image diffusion model.
________________________________
## Requirements
You can update an existing [latent diffusion](https://github.com/CompVis/latent-diffusion) environment by running
```
conda install pytorch==1.12.1 torchvision==0.13.1 -c pytorch
pip install transformers==4.19.2 diffusers invisible-watermark
pip install -e .
```
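A quick, optional sanity check confirms that the pinned PyTorch build can see your GPU:

```
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```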
#### xformers efficient attention
For more efficiency and speed on GPUs,
we highly recommend installing the [xformers](https://github.com/facebookresearch/xformers)
library.
Tested on A100 with CUDA 11.4.
Installation needs a somewhat recent version of nvcc and gcc/g++; obtain those, e.g., via
```commandline
export CUDA_HOME=/usr/local/cuda-11.4
conda install -c nvidia/label/cuda-11.4.0 cuda-nvcc
conda install -c conda-forge gcc
conda install -c conda-forge gxx_linux-64==9.5.0
```
Then, run the following (compilation takes up to 30 min).
```commandline
cd ..
git clone https://github.com/facebookresearch/xformers.git
cd xformers
git submodule update --init --recursive
pip install -r requirements.txt
pip install -e .
cd ../stablediffusion
```
Upon successful installation, the code will automatically default to [memory efficient attention](https://github.com/facebookresearch/xformers)
for the self- and cross-attention layers in the U-Net and autoencoder.
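A simple import check verifies that `xformers` is visible to Python:

```commandline
python -c "import xformers; print(xformers.__version__)"
```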
## General Disclaimer
Stable Diffusion models are general text-to-image diffusion models and therefore mirror biases and (mis-)conceptions that are present
in their training data. Although efforts were made to reduce the inclusion of explicit pornographic material, **we do not recommend using the provided weights for services or products without additional safety mechanisms and considerations.
The weights are research artifacts and should be treated as such.**
Details on the training procedure and data, as well as the intended use of the model, can be found in the corresponding [model card](https://huggingface.co/stabilityai/stable-diffusion-2).
The weights are available via [the StabilityAI organization at Hugging Face](https://huggingface.co/StabilityAI) under the [CreativeML Open RAIL++-M License](LICENSE-MODEL).
## Stable Diffusion v2
Stable Diffusion v2 refers to a specific configuration of the model
architecture that uses a downsampling-factor 8 autoencoder with an 865M UNet
and OpenCLIP ViT-H/14 text encoder for the diffusion model. The _SD 2-v_ model produces 768x768 px outputs.
Evaluations with different classifier-free guidance scales (1.5, 2.0, 3.0, 4.0,
5.0, 6.0, 7.0, 8.0) and 50 DDIM sampling steps show the relative improvements of the checkpoints:
![sd evaluation results](assets/model-variants.jpg)
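These settings can be reproduced with the reference sampling script described below; a minimal sketch of such a sweep, assuming the script's `--scale` and `--steps` flags, with the checkpoint path as a placeholder:

```commandline
# Sketch: sample at several guidance scales with 50 DDIM steps each.
for s in 1.5 2.0 3.0 4.0 5.0 6.0 7.0 8.0; do
  python scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" \
    --ckpt <path/to/768model.ckpt> --config configs/stable-diffusion/v2-inference-v.yaml \
    --H 768 --W 768 --steps 50 --scale "$s"
done
```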
### Text-to-Image
![txt2img-stable2](assets/stable-samples/txt2img/merged-0003.png)
![txt2img-stable2](assets/stable-samples/txt2img/merged-0001.png)
Stable Diffusion 2 is a latent diffusion model conditioned on the penultimate text embeddings of a CLIP ViT-H/14 text encoder.
We provide a [reference script for sampling](#reference-sampling-script).
#### Reference Sampling Script
This script incorporates [invisible watermarking](https://github.com/ShieldMnt/invisible-watermark) of the outputs to help viewers [identify the images as machine-generated](scripts/tests/test_watermark.py).
We provide the configs for the _SD2-v_ (768px) and _SD2-base_ (512px) models.
First, download the weights for [_SD2.1-v_](https://huggingface.co/stabilityai/stable-diffusion-2-1) and [_SD2.1-base_](https://huggingface.co/stabilityai/stable-diffusion-2-1-base).
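Then, to sample from the _SD2.1-v_ model, you can run, e.g. (flag names per `scripts/txt2img.py`; the checkpoint path is a placeholder):

```
python scripts/txt2img.py --prompt "a professional photograph of an astronaut riding a horse" --ckpt <path/to/768model.ckpt> --config configs/stable-diffusion/v2-inference-v.yaml --H 768 --W 768
```

For the 512px _SD2.1-base_ checkpoint, pair it with the corresponding 512px inference config (`configs/stable-diffusion/v2-inference.yaml`) and drop the `--H/--W` overrides.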