Stable Diffusion 3 Medium Text to Image
Stable Diffusion is a latent diffusion model that generates images from text. It was created by researchers and engineers from CompVis, Stability AI, and LAION. Stable Diffusion v2 is one version of that architecture: it pairs a downsampling-factor-8 autoencoder with an 865M-parameter UNet and conditions generation on the penultimate-layer text embeddings of an OpenCLIP ViT-H/14 text encoder. The SD 2-v variant produces 768x768 px images.
API
The model is available through an HTTP API that can be called from any programming language; the example below uses Python.
import requests
import base64

# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url)
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to convert a list of image URLs to base64
def image_urls_to_base64(image_urls):
    return [image_url_to_base64(url) for url in image_urls]

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/stable-diffusion-3-medium-txt2img"

# Request payload
data = {
    "prompt": "A whimsical and high-resolution highly realistic image of a panda in a vintage cosmonaut suit. The panda is holding a sign that reads 'I love flying to the moon!' in playful lettering. The panda's helmet has a small propeller on top and a Indian flag patch, adding to the cosmic vibe. The background features a retro-styled spaceship with rockets and stars, giving the impression of a thrilling journey through space",
    "negative_prompt": "bad quality, poor quality, doll, disfigured, jpg, toy, bad anatomy, missing limbs, missing fingers, 3d, cgi",
    "samples": 1,
    "scheduler": "DPM++ 2M",
    "num_inference_steps": 25,
    "guidance_scale": 5,
    "denoise": 1,
    "seed": 468685,
    "img_width": 1024,
    "img_height": 1024,
    "modelsamplingsd3_shift": 3,
    "conditioningsettimesteprange_start": 0.1,
    "conditioningsettimesteprange_stop": 1,
    "base64": False
}

headers = {'x-api-key': api_key}

response = requests.post(url, json=data, headers=headers)
print(response.content)  # The response is the generated image
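Printing the raw bytes is rarely what you want, so the following is a minimal sketch of writing the response body to a file instead. The helper name save_response_image and the output filename are hypothetical, and the sketch assumes that a 200 status means the body holds the image bytes (the case where "base64" is False); the exact shape of error bodies is not documented here, so they are only echoed back as text.

```python
from pathlib import Path


def save_response_image(status_code, content, path):
    """Write image bytes to `path` and return the number of bytes written.

    Hypothetical helper: assumes a 200 status means `content` is the raw
    image, which matches the request above when "base64" is False.
    """
    if status_code != 200:
        # The error body format is an assumption; decode defensively.
        detail = content.decode("utf-8", errors="replace")
        raise RuntimeError(f"request failed with status {status_code}: {detail}")
    if not content:
        raise ValueError("empty response body; nothing to save")
    return Path(path).write_bytes(content)
```

With the request above this could be called as save_response_image(response.status_code, response.content, "panda.png"); the file extension is a guess, since the output image format is not stated here.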
Attributes
prompt: Prompt to render.
negative_prompt: Prompts to exclude, e.g. 'bad anatomy, bad hands, missing fingers'.
samples: Number of samples to generate. min: 1, max: 4
scheduler: Type of scheduler, for example "DPM++ 2M" as in the request above.
num_inference_steps: Number of denoising steps. min: 10, max: 100
guidance_scale: Scale for classifier-free guidance. min: 1, max: 25
denoise: How much to transform the reference image. min: 0.1, max: 1
seed: Seed for image generation. min: -1, max: 999999999999999
img_width: Image width, between 512 and 2048 in multiples of 8.
img_height: Image height, between 512 and 2048 in multiples of 8.
modelsamplingsd3_shift: Model Sampling SD3 Shift. min: 1, max: 10
conditioningsettimesteprange_start: Conditioning set timestep range start. min: 0.1, max: 1
conditioningsettimesteprange_stop: Conditioning set timestep range stop. min: 0.1, max: 1
base64: Whether to return the output image base64-encoded.
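The ranges listed above can be checked on the client before a request is sent. The sketch below is one way to do that; validate_payload is a hypothetical helper, not part of the API, and it only covers the numeric limits stated in this section, using the key names from the request payload. It does not check the scheduler, since the allowed values are not given.

```python
def validate_payload(data):
    """Return a list of problems found in `data`; empty if none.

    Hypothetical client-side check based only on the limits documented
    above. Keys that are absent from `data` are skipped.
    """
    ranges = {
        "samples": (1, 4),
        "num_inference_steps": (10, 100),
        "guidance_scale": (1, 25),
        "denoise": (0.1, 1),
        "seed": (-1, 999999999999999),
        "modelsamplingsd3_shift": (1, 10),
        "conditioningsettimesteprange_start": (0.1, 1),
        "conditioningsettimesteprange_stop": (0.1, 1),
    }
    problems = []
    for name, (low, high) in ranges.items():
        if name in data and not low <= data[name] <= high:
            problems.append(f"{name}={data[name]} is outside [{low}, {high}]")
    # Width and height share one rule: 512 to 2048, in multiples of 8.
    for name in ("img_width", "img_height"):
        if name in data:
            value = data[name]
            if not 512 <= value <= 2048 or value % 8 != 0:
                problems.append(
                    f"{name}={value} must be 512 to 2048 in multiples of 8"
                )
    return problems
```

Calling validate_payload(data) on the example payload returns an empty list, so the request can proceed; any returned strings describe which value to fix first.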
To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account; monitor it to avoid disruptions in your API usage.
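As a rough sketch of that check, the helper below reads x-remaining-credits from a headers mapping. The function name remaining_credits is hypothetical, and the assumption that the header value is a plain number is not confirmed here, so anything unparseable is reported as None rather than guessed at.

```python
def remaining_credits(headers):
    """Return the x-remaining-credits value as a float, or None.

    Hypothetical helper. Header names are matched case-insensitively so
    that a plain dict works as well as a requests response's headers.
    Assumes the value is numeric; returns None if it is missing or not.
    """
    for name, value in headers.items():
        if name.lower() == "x-remaining-credits":
            try:
                return float(value)
            except (TypeError, ValueError):
                return None
    return None
```

After a call made with the requests library, this could be used as remaining_credits(response.headers), for instance to log a warning when the value drops below a threshold of your choosing.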
Stable Diffusion 3 Medium Text-to-Image
Stable Diffusion 3 Medium Text-to-Image (SD3 Medium) is the latest and most advanced addition to the Stable Diffusion family of text-to-image models. It is designed to be resource-efficient, making it a good choice for users with limited computational resources: thanks to its smaller size, it runs efficiently on consumer-grade hardware such as PCs and laptops, as well as on enterprise-tier GPUs.
Stable Diffusion 3 Medium Text-to-Image Capabilities
SD3 Medium generates photorealistic images, handles intricate prompts with multiple subjects even when they contain a typo or two, and renders typography within images with high precision.