Kandinsky 2.2

Kandinsky inherits best practices from DALL-E 2 and latent diffusion, while introducing some new ideas.


API

To call the API, choose the example for your preferred programming language.

POST
    import requests
    import base64


    # Use this function to convert an image file from the filesystem to base64
    def image_file_to_base64(image_path):
        with open(image_path, 'rb') as f:
            image_data = f.read()
        return base64.b64encode(image_data).decode('utf-8')


    # Use this function to fetch an image from a URL and convert it to base64
    def image_url_to_base64(image_url):
        response = requests.get(image_url)
        image_data = response.content
        return base64.b64encode(image_data).decode('utf-8')


    api_key = "YOUR_API_KEY"
    url = "https://api.segmind.com/v1/kandinsky2.2-txt2img"

    # Request payload
    data = {
        "prompt": "masterpiece, best quality, portrait of an old man, 50mm, solo, natural skin texture, realistic eye and face details, dark, deep shadow, darkness, moonlight, award winning photo, extremely detailed, fine detail, highly detailed, extremely detailed eyes and face, piercing red eyes, detailed clothes, skinny, gothic, native american clothing, analog film, stock photograph,",
        "negative_prompt": "lowres, text, error, cropped, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, out of frame, extra fingers, mutated hands",
        "samples": 1,
        "num_inference_steps": 25,
        "img_width": 512,
        "img_height": 768,
        "prior_steps": 25,
        "seed": 9863172,
        "base64": False
    }

    headers = {'x-api-key': api_key}

    response = requests.post(url, json=data, headers=headers)
    print(response.content)  # The response is the generated image
RESPONSE
image/jpeg
HTTP Response Codes
200 - OK: Image generated
401 - Unauthorized: User authentication failed
404 - Not Found: The requested URL does not exist
405 - Method Not Allowed: The requested HTTP method is not allowed
406 - Not Acceptable: Not enough credits
500 - Server Error: Server had some issue with processing
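
The status codes above can be turned into explicit error handling before the image is written to disk. The following is a minimal sketch rather than part of any official client: the STATUS_MESSAGES table simply copies the codes listed above, and check_status and save_image are hypothetical helper names. It assumes the request was sent with "base64": False, so that the response body is the raw image/jpeg data.

```python
from pathlib import Path

# Status codes and meanings copied from the list above.
STATUS_MESSAGES = {
    200: "Image generated",
    401: "User authentication failed",
    404: "The requested URL does not exist",
    405: "The requested HTTP method is not allowed",
    406: "Not enough credits",
    500: "Server had some issue with processing",
}


def check_status(status_code: int) -> None:
    """Raise RuntimeError for any status code other than 200."""
    if status_code == 200:
        return
    reason = STATUS_MESSAGES.get(status_code, "Unexpected status code")
    raise RuntimeError(f"Request failed ({status_code}): {reason}")


def save_image(status_code: int, content: bytes, path: str) -> Path:
    """Check the status code, then write the response body to `path`."""
    check_status(status_code)
    out = Path(path)
    out.write_bytes(content)
    return out
```

With the request example above, this could be called as save_image(response.status_code, response.content, "output.jpg").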

Attributes


prompt: str (required)

Prompt to render


negative_prompt: str (default: None)

Things to exclude from the image, e.g. 'bad anatomy, bad hands, missing fingers'


samples: int (default: 1). Affects pricing.

Number of samples to generate.

min: 1, max: 4


num_inference_steps: int (default: 20). Affects pricing.

Number of denoising steps.

min: 20, max: 100


img_width: enum:int (default: 768). Affects pricing.

Image width, in pixels.

Allowed values:


img_height: enum:int (default: 768). Affects pricing.

Image height, in pixels.

Allowed values:


prior_steps: int (default: 25)

Number of denoising steps for the prior model.

min: 1, max: 100


seed: int (default: -1)

Seed for image generation.


base64: boolean (default: 1)

Whether to return the output image base64-encoded.
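
The documented ranges can be checked on the client before a request is sent, which avoids spending a call on a payload the API would reject. This is a minimal sketch under stated assumptions: validate_payload and LIMITS are hypothetical names, the limits are copied from the attribute list above, and treating an empty prompt as invalid is an assumption (the list only marks prompt as required). The allowed values for img_width and img_height are not listed above, so they are not checked.

```python
# Limits copied from the attribute list above: name -> (min, max).
LIMITS = {
    "samples": (1, 4),
    "num_inference_steps": (20, 100),
    "prior_steps": (1, 100),
}


def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems found in `payload` (empty if none)."""
    problems = []
    prompt = payload.get("prompt")
    # Assumption: a required prompt must also be a non-empty string.
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")
    for name, (low, high) in LIMITS.items():
        if name not in payload:
            continue  # omitted fields fall back to the documented default
        value = payload[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            problems.append(f"{name} must be an integer")
        elif not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")
    return problems
```

A caller might run problems = validate_payload(data) and only post the request when the returned list is empty.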

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
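
Reading the credit balance from a response could look like the sketch below. The header name x-remaining-credits comes from the paragraph above; the helper name remaining_credits is hypothetical, and parsing the value as a number is an assumption, since the format of the header value is not specified here.

```python
from typing import Mapping, Optional


def remaining_credits(headers: Mapping[str, str]) -> Optional[float]:
    """Return the x-remaining-credits value, or None if absent or not numeric."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lowercase.
        if name.lower() == "x-remaining-credits":
            try:
                return float(value)  # assumption: the value is numeric
            except ValueError:
                return None
    return None
```

With the request example above, remaining_credits(response.headers) would return the balance after that call, which can then be logged or compared against a threshold of your choosing.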

Kandinsky 2.2

Kandinsky 2.2 is a major update to its predecessor, Kandinsky 2.1. It adds the CLIP-ViT-G image encoder and ControlNet support, with the aim of improving both image aesthetics and text comprehension.

The CLIP-ViT-G image encoder improves the aesthetic quality of generated images and the model's understanding of text prompts. ControlNet support complements it by giving users finer control over the image generation process.

Advantages

  1. Enhanced Image Aesthetics: The CLIP-ViT-G encoder ensures the generation of visually richer and more captivating images.

  2. Superior Text Understanding: With the new encoder, the model boasts an improved comprehension of text, bridging the gap between textual prompts and visual outputs.

  3. Precision Control: The ControlNet support empowers users to guide the image generation process, ensuring outputs that align with their vision.

  4. Optimized Performance: Together, CLIP-ViT-G and ControlNet improve the model's overall performance relative to Kandinsky 2.1.

Use Cases

  1. Digital Art Creation: Artists can harness Kandinsky 2.2 to craft digital artworks that resonate with depth and detail.

  2. Content Generation: Ideal for content creators seeking to generate visuals based on textual prompts or narratives.

  3. Interactive Design: Designers can iteratively shape their designs, making real-time adjustments guided by text.

  4. Educational Tools: The model can be integrated into learning platforms, allowing students to explore the interplay between text and visuals.

  5. Gaming and AR: Enhance user immersion in games or AR experiences by generating visuals based on in-game narratives or user prompts.

Kandinsky 2.2 License

Kandinsky 2.2 is released under a permissive license, so individual creators, businesses, and developers can use the model for commercial purposes without the constraints of more restrictive licenses.