Stable Diffusion is a latent diffusion model that generates images from text. It was created by researchers and engineers from CompVis, Stability AI, and LAION. Stable Diffusion v2 is a specific version of the model architecture: it uses a downsampling-factor-8 autoencoder with an 865M-parameter UNet and an OpenCLIP ViT-H/14 text encoder for the diffusion model. The SD 2-v variant produces 768x768 px images, and generation is conditioned on the penultimate text embeddings of that OpenCLIP ViT-H/14 text encoder.
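As a rough sketch of how this is typically run, the 2-v checkpoint can be loaded through the Hugging Face diffusers library; the repository id "stabilityai/stable-diffusion-2-1", the prompt, and the CUDA device below are assumptions for illustration, not part of the description above.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load an SD 2-v style checkpoint (repo id is an assumption for illustration)
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

# The 2-v checkpoints are trained for 768x768 px outputs
image = pipe(
    "a photograph of an astronaut riding a horse",
    height=768,
    width=768,
).images[0]
image.save("astronaut.png")
```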
Stable Diffusion 3 Large Text-to-Image (SD3 Large) is the latest and most advanced addition to the Stable Diffusion family of text-to-image models. Its 8 billion parameters allow it to handle intricate tasks such as prompt understanding and typography and to generate highly detailed images. However, because of its larger size, SD3 Large may require more powerful hardware and additional computational resources to run smoothly, even though it is optimized for performance. A short usage sketch follows the feature list below.
Detailed descriptions: You can provide detailed descriptions including objects, characters, settings, lighting, and even artistic styles. Stable Diffusion 3 can translate these descriptions into high-quality images.
Complex prompts: It can handle intricate prompts with multiple subjects and even account for slight variations in spelling or phrasing.
Photorealism: The model excels at generating images that are incredibly close to real photographs, overcoming artifacts often seen in hands and faces in previous versions.
Typography: It can render text within the generated images more accurately than previous models.
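Here is the sketch mentioned above: a minimal example of calling an SD3-class checkpoint through the diffusers StableDiffusion3Pipeline. The repository id, sampler settings, and prompt are assumptions for illustration, and gated checkpoints may require accepting a license before download.

```python
import torch
from diffusers import StableDiffusion3Pipeline

# Load an SD3-family checkpoint (repo id is a placeholder assumption)
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large", torch_dtype=torch.bfloat16
).to("cuda")

# SD3's stronger prompt understanding also covers legible typography
image = pipe(
    'a storefront sign that reads "OPEN", photorealistic, evening light',
    num_inference_steps=28,
    guidance_scale=4.5,
).images[0]
image.save("sd3_sign.png")
```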
Take a picture or GIF and replace the face in it with a face of your choice. Only a single image of the desired face is needed: no dataset, no training.
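A minimal sketch of this single-image face-swap workflow using the insightface library is shown below; the "inswapper_128.onnx" model file (downloaded separately), the image paths, and the choice of the first detected face are all assumptions for illustration, not the tool's own pipeline.

```python
import cv2
import insightface
from insightface.app import FaceAnalysis

# Detect faces in the reference image and the target picture
app = FaceAnalysis(name="buffalo_l")
app.prepare(ctx_id=0, det_size=(640, 640))

source = cv2.imread("desired_face.jpg")   # single image of the face to insert
target = cv2.imread("target_photo.jpg")   # picture whose face will be replaced
source_face = app.get(source)[0]
target_face = app.get(target)[0]

# Load the face-swapping model and paste the source face onto the target
swapper = insightface.model_zoo.get_model("inswapper_128.onnx", download=False)
result = swapper.get(target, target_face, source_face, paste_back=True)
cv2.imwrite("swapped.jpg", result)
```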
This model can generate photorealistic images from any text input and has the additional capability of inpainting pictures using a mask.
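A hedged sketch of masked inpainting with diffusers follows; the repository id "runwayml/stable-diffusion-inpainting", the image paths, and the prompt are assumptions for illustration.

```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from diffusers.utils import load_image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

init_image = load_image("photo.png")   # original picture
mask_image = load_image("mask.png")    # white pixels mark the region to repaint

image = pipe(
    prompt="a vase of flowers on the table",
    image=init_image,
    mask_image=mask_image,
).images[0]
image.save("inpainted.png")
```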
The SDXL model is the official upgrade to the v1.5 model and is released as open-source software.
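For completeness, a rough sketch of loading SDXL with diffusers is given below; the repository id "stabilityai/stable-diffusion-xl-base-1.0" and the fp16 variant are assumptions based on the commonly published base checkpoint.

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")

image = pipe("a close-up portrait, studio lighting, 85mm lens").images[0]
image.save("sdxl_portrait.png")
```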