Stable Diffusion is a latent diffusion model that generates images from text. It was created by a team of researchers and engineers from CompVis, Stability AI, and LAION. Stable Diffusion v2 is a specific version of the model architecture: it uses a downsampling-factor-8 autoencoder with an 865M-parameter UNet, and it conditions the generation process on the penultimate text embeddings of an OpenCLIP ViT-H/14 text encoder. The SD 2-v variant of the model produces 768x768 px images.
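For local experimentation, the SD 2-v checkpoint can be run with Hugging Face's diffusers library. This is a minimal sketch, assuming diffusers and torch are installed and a CUDA GPU is available ("stabilityai/stable-diffusion-2-1" is the public 768px SD 2-v checkpoint on the Hugging Face Hub):

import torch
from diffusers import StableDiffusionPipeline

# Load the 768px SD 2-v checkpoint; fp16 halves memory use on GPU
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # assumes a CUDA GPU is available

# The prompt is encoded by the OpenCLIP ViT-H/14 text encoder and
# conditions the UNet as it denoises in the autoencoder's latent space
image = pipe("a photograph of an astronaut riding a horse").images[0]
image.save("astronaut.png")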
If you're looking for an API, you can call this model from your preferred programming language; the example below uses Python.
import requests
import base64

# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url)
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/stable-diffusion-3-large-txt2img"

# Request payload
data = {
    "prompt": "A whimsical and high-resolution highly realistic image of a panda in a vintage cosmonaut suit. The panda is holding a sign that reads 'I love flying to the moon!' in playful lettering. The panda's helmet has a small propeller on top and an Indian flag patch, adding to the cosmic vibe. The background features a retro-styled spaceship with rockets and stars, giving the impression of a thrilling journey through space",
    "mode": "text-to-image",
    "aspect_ratio": "1:1",
    "output_format": "jpeg",
    "base64": False,
    "negative_prompt": "ugly, blurry, bad anatomy, blurred, watermark, grainy, signature, cut off, draft"
}

headers = {'x-api-key': api_key}

response = requests.post(url, json=data, headers=headers)
print(response.content)  # The response body is the generated image
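Because "base64" is set to False, the response body is raw JPEG bytes, so instead of printing them you would typically write them to disk. A minimal follow-up sketch (the filename is arbitrary):

# Persist the generated image; surface API errors instead of writing garbage
if response.status_code == 200:
    with open("panda_cosmonaut.jpeg", "wb") as f:
        f.write(response.content)
else:
    print(response.status_code, response.text)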
Request parameters:
- prompt: The prompt to render.
- mode: Type of mode; the example above uses "text-to-image".
- aspect_ratio: Aspect ratio of the output image, e.g. "1:1".
- output_format: Output format, e.g. "jpeg".
- base64: Whether to return the output image as a base64-encoded string instead of raw bytes.
- negative_prompt: Prompts to exclude, e.g. 'bad anatomy, bad hands'.
To keep track of your credit usage, you can inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account; monitor this value to avoid disruptions in your API usage.
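Continuing from the request above, the header can be read straight off the requests response (guarding against a missing header is a defensive assumption):

# requests exposes response headers as a case-insensitive dict
remaining = response.headers.get("x-remaining-credits")
if remaining is not None:
    print(f"Credits remaining: {remaining}")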
Stable Diffusion 3 Large Text-to-Image (SD3 Large) is the latest and most advanced addition to the Stable Diffusion family of text-to-image models. Its 8 billion parameters enable it to tackle intricate tasks such as text understanding and typography, and to generate highly detailed images. While optimized for performance, SD3 Large may require more powerful hardware and additional computational resources than smaller models to run smoothly, owing to its size.
- Detailed descriptions: You can provide detailed descriptions including objects, characters, settings, lighting, and even artistic styles; Stable Diffusion 3 translates these descriptions into high-quality images.
- Complex prompts: It can handle intricate prompts with multiple subjects and even account for slight variations in spelling or phrasing.
- Photorealism: The model excels at generating images that are remarkably close to real photographs, overcoming the artifacts often seen in hands and faces in previous versions.
- Typography: It can render text within generated images more accurately than previous models, as the sketch below illustrates.
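As a quick illustration of the typography capability, the same endpoint can be asked to render legible text inside the image. This is a minimal sketch reusing the url and headers from the example above; the prompt and filename are illustrative:

# Ask SD3 Large to render readable lettering in the output
typography_payload = {
    "prompt": "A minimalist poster with the words 'STABLE DIFFUSION 3' in bold art-deco lettering",
    "mode": "text-to-image",
    "aspect_ratio": "1:1",
    "output_format": "jpeg",
    "base64": False,
}
poster = requests.post(url, json=typography_payload, headers=headers)
with open("poster.jpeg", "wb") as f:
    f.write(poster.content)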
Related models:
- Take a picture/gif and replace the face in it with a face of your choice. You only need one image of the desired face; no dataset, no training.
- This model is capable of generating photo-realistic images given any text input, with the extra capability of inpainting pictures using a mask.
- The SDXL model is the official upgrade to the v1.5 model and is released as open-source software.