LTX Video
LTX-Video is the first DiT-based video generation model capable of generating high-quality videos in real time. It produces 24 FPS videos at a 768x512 resolution faster than they can be watched.
MiniMax AI (Hailuo)
With Video-01 by MiniMax, create high-definition videos at 720p resolution and 25fps, featuring cinematic camera movement effects based on text descriptions.
Hunyuan Video
Hunyuan AI Video is a new, state-of-the-art AI video generator that creates high-quality videos from text descriptions. With 13B parameters, it is the most powerful open-source video generation model available.
Luma Photon Flash Text to Image
Luma Photon Flash is a powerful and fast text-to-image model offering high-quality visuals with unmatched speed and precision. Ideal for creatives, it excels in instruction-following, composition, and aesthetic quality, transforming ideas into stunning images.
Luma Photon Text to Image
Luma Photon is a powerful AI-driven text-to-image model offering high-quality visuals with unmatched speed and precision. Ideal for creatives, it excels in instruction-following, composition, and aesthetic quality, transforming ideas into stunning images
AI Product Photography
Elevate your product imagery with our AI-powered photography model. Create stunning, professional-quality photos that boost engagement and sales. Perfect for e-commerce and digital marketing.
Flux Fill Pro
Professional inpainting and outpainting model with state-of-the-art performance. Edit or extend images with natural, seamless results.
Flux Depth Pro
Professional depth-aware image generation. Edit images while preserving spatial relationships.
Flux Canny Pro
Professional edge-guided image generation. Control structure and composition using Canny edge detection
Flux Depth Dev
Open-weight depth-aware image generation. Edit images while preserving spatial relationships.
Flux Canny Dev
Open-weight edge-guided image generation. Control structure and composition using Canny edge detection.
Flux Fill Dev
Open-weight, guidance-distilled inpainting model for editing and extending images.
Flux Redux Schnell
Fast, efficient image variation model for rapid iteration and experimentation.
Flux Redux Dev
Open-weight image variation model. Create new versions while preserving key elements of your original.
Flux-1.1 Pro Ultra
Create stunning visuals effortlessly with Flux 1.1 Pro Ultra. Experience unparalleled image quality and speed.
Mochi 1
Mochi 1 is a cutting-edge, open-source AI model that transforms text prompts into stunning, high-fidelity videos. Create captivating videos from simple text prompts with unparalleled quality and realism. Experience high-fidelity motion, strong prompt adherence, and limitless creative possibilities
Recraft V3
Recraft V3, the latest iteration of Recraft AI, offers a significant advancement in AI-driven image generation. This state-of-the-art model is designed to produce high-quality, detailed vector graphics, catering to the needs of designers, artists, and content creators alike.
Recraft V3 Svg
Recraft V3 SVG generates high-quality, customizable vector graphics with precision and ease. Perfect for logos, infographics, illustrations, and more.
Stable Diffusion 3.5 Turbo Text to Image
Stable Diffusion 3.5 Turbo offers exceptional customizability, efficient performance on consumer hardware, and diverse image outputs that accurately represent different skin tones and features, all while maintaining high-quality results and strong prompt adherence.
Stable Diffusion 3.5 Large Text to Image
Stable Diffusion 3.5 Large offers exceptional customizability, efficient performance on consumer hardware, and diverse image outputs that accurately represent different skin tones and features, all while maintaining high-quality results and strong prompt adherence.
Faceswap V3
Face Swap V3 is a cutting-edge tool that empowers you to seamlessly swap faces in images. With customizable features and advanced technology, you can achieve professional-quality results.
Video Audio Merge
Effortlessly merge audio and video with our intuitive Video Audio Merge model. Create stunning multimedia content with precise timing, fade effects, and customizable audio options. Perfect for content creators, filmmakers, and marketers.
Runway Gen Alpha Turbo Image to Video
Runway Gen-3 Alpha Turbo is a cutting-edge AI tool that transforms static images into dynamic videos with exceptional fidelity and motion.
Kling AI Image to Video
Kling AI Image-to-Video is a powerful AI tool that transforms static images into captivating, animated videos. Create high-quality content effortlessly with Kling AI's advanced capabilities.
Kling AI Text to Video
Kling AI Text-to-Video is a cutting-edge AI tool that transforms text into stunning, lifelike videos. Create professional-quality content effortlessly with Kling AI's advanced capabilities.
Meta MusicGen Medium
MusicGen: Transform text into music with AI. Create unique, high-quality audio from simple descriptions. Experience the future of music generation with this innovative AI model.
Video Captioner
With Video Captioner, create accurate, customizable subtitles for your videos effortlessly.
Face Detailer
Restore characters' faces to their original glory with Face Detailer. Enhance facial details, eliminate distortion, and upscale images for stunning results.
flux-1.1-pro
Flux 1.1 Pro is a cutting-edge image generation tool offering exceptional speed, quality, and customization. Ideal for digital artists, designers, and content creators.
MyShell Text To Speech
MyShell's Voice Cloning and Text to Speech - Transform your audio content with realistic, personalized voices. Experience high-quality, efficient, and cost-effective audio synthesis.
Video Stitch
Revolutionize your video editing with the Video Stitch Model. Seamlessly stitch clips, add captivating audio, and create professional-looking videos in minutes.
Simple Vector Flux Lora
Flux is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions
Ideogram Text To Image
Ideogram Text to Image: Turn your ideas into stunning visuals instantly with this powerful AI tool. Create captivating designs, realistic images, and more. Perfect for artists, designers, and anyone seeking creative inspiration.
Openvoice
OpenVoice is a versatile voice cloning model that supports multiple languages and offers precise tone replication, flexible style control, and zero-shot cross-lingual capabilities
Cog videoX Image To Video
CogVideoX image-to-video is a cutting-edge AI model that converts static images into dynamic, high-quality videos. Perfect for content creation, animation, and education, it offers high-resolution output, efficient inference, and versatile precision. Transform your images into engaging videos with CogVideoX
Expression Editor
Expression Editor uses reference images to accurately generate new images with desired expressions. Perfect for digital art, memes, and marketing.
Esrgan Video Upscaler
ESRGAN Video Upscaler: Experience sharper, clearer 4k videos with ESRGAN. This AI-powered video upscaler boosts resolution and reduces artifacts, making your video content look its best.
Consistent Character With Pose
Create images of a given character in different poses
Consistent Character AI Neolemon V3
Create consistent characters in any pose with AI
Luma Text-to-Video
Luma Video (Text to Video) is an advanced AI model that turns text prompts into captivating videos. Designed for creators and marketers, it offers high-resolution outputs, rapid processing, and cinematic quality, making video production accessible and efficient.
Luma Image-to-Video
With Luma's Dream Machine, transform your static images into dynamic videos. It offers high-fidelity video generation, rapid processing, and cinematic quality, enabling users to enhance their content creation process effortlessly.
Flux Pulid
Flux PuLID: Customize AI-generated images with your unique identity. Seamlessly integrate faces into text-to-image models for realistic and customizable results. High fidelity, tuning-free customization, and versatile editing options.
Flux Ipadapter
Flux IP Adapter is a cutting-edge AI model that lets you create stunning, customized images. With its advanced style adaptation capabilities, Flux IP Adapter lets you seamlessly blend different artistic styles into your creations.
Flux Inpaint
Flux Inpainting is a powerful image editing tool designed to effortlessly edit and enhance your images. It's perfect for tasks like removing unwanted objects, restoring damaged photos, and creating artistic effects.
Flux Controlnets
Flux ControlNets is a collection of models that gives you precise control over image generation. By integrating ControlNet with Flux.1, these models enable you to create highly detailed and customized images with unprecedented accuracy.
OpenAI o1-mini
o1-mini by OpenAI provides high-performance reasoning and coding capabilities. Ideal for developers and businesses seeking advanced AI without the high costs.
OpenAI o1-preview
o1-preview by OpenAI is a powerful AI model that can tackle complex problems with exceptional accuracy and efficiency. Ideal for researchers, developers, and scientists seeking advanced AI capabilities.
Text Overlay
Elevate your visuals with the Text Overlay model. Easily add customized text to any image, perfect for social media, marketing, and blogs. Enjoy precise positioning, advanced styling, and seamless integration.
Cog Video X 5B
CogVideo is a groundbreaking AI model that turns text into high-quality videos. Create realistic scenes, animations, and more with ease. Ideal for content creators, educators, and businesses.
Fast Flux.1 Schnell
Fast Flux.1 Schnell by Segmind is an optimized text-to-image model designed for developers needing faster image generation. It offers high efficiency without compromising quality. Perfect for startups and engineers seeking quick, resource-efficient AI models.
Consistent Character AI Neolemon V2
Create consistent characters in any pose with AI
Flux Realism Lora with Upscale
Flux Realism Lora with Upscale, developed by XLabs AI, is a cutting-edge model designed to generate realistic images from textual descriptions.
Sam V2 Image
SAM v2, the next-gen segmentation model from Meta AI, revolutionizes computer vision. Building on SAM's success, it excels at accurately segmenting objects in images, offering robust and efficient solutions for various visual contexts.
Sam V2 Video
SAM v2 Video by Meta AI allows promptable segmentation of objects in videos.
Consistent Character V1
Create images of a given character in different poses
Flux.1 Image To Image
Flux Image-To-Image model by Black Forest Labs is an advanced deep learning tool designed for transforming images based on specific textual prompts.
Easy Animate
Easy Animate is a state-of-the-art image to animation model to convert static images into dynamic animations with remarkable accuracy and fluidity.
Flux.1 Dev
Flux Dev is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions
Flux.1 Schnell
Flux Schnell is a state-of-the-art text-to-image generation model engineered for speed and efficiency.
Flux .1 Pro
Flux Pro is a state-of-the-art image generation model with top-of-the-line prompt following, visual quality, image detail, and output diversity.
Text Embedding 3 Small
Text-embedding-3-small is a compact and efficient model developed for generating high-quality text embeddings. These embeddings are numerical representations of text data, enabling a variety of natural language processing (NLP) tasks such as semantic search, clustering, and text classification
Text Embedding 3 Large
Text-embedding-3-large is a robust language model by OpenAI designed for generating high-dimensional text embeddings for a wide range of natural language processing (NLP) tasks including semantic search, text clustering, and classification.
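As a rough illustration of how text embeddings support semantic search, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors and the helper names (cosine_similarity, rank_by_similarity) are made up for this example; real embeddings from these models have far more dimensions and would come from an API call that is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (assumption for illustration).
docs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "stocks": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

ranking = rank_by_similarity(query, docs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Documents whose vectors point in nearly the same direction as the query score close to 1 and appear first; unrelated documents score near 0.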
Realdream Pony V9
Real Dream Pony V9 is an advanced image generation model based on the Stable Diffusion XL (SDXL) architecture, excelling in photorealism.
AI Product Photo Editor
AI Product Photo Editor leverages advanced image-based ML techniques to generate high-quality product visuals using text prompts, product images, and background images.
RealDream Lightning
RealDream is a sophisticated image generation model built on the SDXL Lightning architecture. It creates highly realistic images from text prompts and is particularly good at generating human portraits from descriptive text.
Llama 3.1 405b
Meta developed and released the Meta Llama 3.1 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.
Llama 3.1 70b
Meta developed and released the Meta Llama 3.1 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.
Llama 3.1 8b
Meta developed and released the Meta Llama 3.1 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B, 70B, and 405B sizes. The Llama 3.1 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.
Stable Diffusion 3 Medium Image to Image
Stable Diffusion 3 Medium image-to-image is a cutting-edge AI tool that uses advanced image-to-image technology to transform one image into another.
SD3 Medium Tile Controlnet
SD3 Medium Tile ControlNet is a large generative image model designed for generating detailed images based on textual prompts and tile-based input images.
SD3 Medium Canny Controlnet
Stable Diffusion 3 (SD3) Medium Canny ControlNet uses Canny edge detection to provide fine-grained control over the generated outputs.
SD3 Medium Pose Controlnet
Stable Diffusion 3 (SD3) Pose ControlNet is a large generative image model tailored for generating images based on text prompts while using pose information as guidance.
Motion Control SVD
Motion Control SVD is an innovative deep learning framework that breathes life into static images. By intelligently managing both camera and object motion, it empowers creators to achieve precise animation effects.
Live Portrait video to video
Experience the magic of Live Portrait’s Video-to-Video Model! Transform your static images into dynamic videos seamlessly.
Image Superimpose V2
Superimpose V2 elevates image editing! Seamlessly layer images with background removal, precise positioning, and flexible resizing options. Explore 14 blending modes to create stunning effects
Video Faceswap
Video Faceswap is a powerful tool for creators, filmmakers, and meme enthusiasts. With this innovative technology, you can effortlessly replace faces in videos
Aura Flow
Largest completely open sourced flow-based generation model that is capable of text-to-image generation
Live Portrait
Live Portrait animates static images using a reference driving video through an implicit-keypoint-based framework, bringing a portrait to life with realistic expressions and movements. It identifies key points on the face (such as the eyes, nose, and mouth) and manipulates them to create expressions and movements.
Dubbing
ElevenLabs Dubbing uses AI to translate your audio into multiple languages. Easily create multilingual versions of your content without studios or voice actors for each language
Claude 3 Haiku
Claude 3 Haiku, the fastest and most cost-effective LLM from Anthropic, delivers instant responses and image analysis. Build interactive AI experiences that mimic human conversation. Perfect for various applications, from research to enterprise.
Claude 3 Opus
Claude 3 Opus is an LLM pushing the limits of language understanding. It excels at complex tasks, generates human-quality text, and remembers vast amounts of information.
Gemini PRO
Gemini 1.5 Pro represents a significant leap in large language model technology, offering exceptional understanding and performance across different modalities and contexts.
Gemini Flash
Gemini 1.5 Flash is a game-changer for developers and enterprises seeking a speedy and cost-effective large language model with exceptional long-context understanding.
Claude 3.5 Sonnet
Claude 3.5 Sonnet represents a significant advancement in AI language models, combining speed, accuracy, and visual reasoning capabilities. It excels at understanding and completing requests thoughtfully, and does so much faster than previous versions. Additionally, it boasts a stronger vision model, allowing it to analyze visual data like charts and images with exceptional accuracy.
Kolors
Kolors is a cutting-edge text-to-image model that bridges language and visual art. Transform your textual ideas into photorealistic images with semantic precision.
Playground V2.5
Playground V2.5 is a diffusion-based text-to-image generative model, designed to create highly aesthetic images based on textual prompts.
Image Superimpose
The Superimpose model lets you create captivating visuals by seamlessly overlaying one image on top of another. It streamlines your image layering process, allowing you to bring your creative vision to life effortlessly.
SDXL Img2Img
SDXL Img2Img is used for text-guided image-to-image translation. This model uses the weights from Stable Diffusion to generate new images from an input image using StableDiffusionImg2ImgPipeline from diffusers
SDXL Controlnet
SDXL ControlNet gives unprecedented control over text-to-image generation. SDXL ControlNet models introduce the concept of conditioning inputs, which provide additional information to guide the image generation process.
Story Diffusion
Story Diffusion turns your written narratives into stunning image sequences.
Elevenlabs Sound Generation
Eleven Labs' Sound Generation API provides a robust development tool for programmatically generating audio content using artificial intelligence. This API empowers developers and creators to integrate sound generation functionalities into their applications and workflows.
Elevenlabs Speech To Speech
Eleven Labs Speech-to-Speech offers AI-powered voice conversion for content creators, media professionals, and anyone seeking to modify or translate audio speech.
Elevenlabs Text To Speech
Eleven Labs Text-to-Speech (TTS) harnesses the power of deep learning to create realistic and engaging synthetic speech from written text.
Omni Zero
Omni-Zero: A diffusion pipeline for zero-shot stylized portrait creation.
LLAVA 1.6 7B
LLaVA translates images into text descriptions and captions.
LLaVA 13B
LLaVA 13B is a Vision-language model which allows both image and text as inputs.
Tooncrafter
Create videos from illustrated input images
V Express
V-Express lets you create portrait videos from single images.
SadTalker
Audio-based Lip Synchronization for Talking Head Video
Hallo
Hallo lets you create portrait videos from single images.
Relighting
Prompts to auto-magically relight your images.
Automatic Mask Generator
Automatic Mask Generator is a powerful tool that automates the creation of precise masks for inpainting
Magic Eraser
LaMa Object Removal: AI Magic Eraser
Inpaint Mask Maker
Real-Time Open-Vocabulary Object Detection
Background Eraser
Background Eraser helps in flawless background removal with exceptional accuracy.
Clarity Upscaler
High resolution creative image Upscaler and Enhancer. A free Magnific alternative.
Consistent Character
Create images of a given character in different poses
IDM VTON
Best-in-class clothing virtual try on in the wild
Stable Diffusion 3 Medium Text to Image
Stable Diffusion 3 Medium is a text-to-image model from Stability AI built on a Multimodal Diffusion Transformer (MMDiT) architecture with 2 billion parameters. It generates images from text prompts, with improved image quality, typography, and complex prompt understanding.
Fooocus
Fooocus enables high-quality image generation effortlessly, combining the best of Stable Diffusion and Midjourney.
IPAdapter Style Transfer
Style & Composition Transfer with Stable Diffusion IP Adapter
Profile Photo Style Transfer
Turn any image of a face into artwork using Stable Diffusion Controlnet and IPAdapter
illusion-diffusion-hq
Monster Labs QrCode ControlNet on top of SD Realistic Vision v5.1
PuLID
Novel tuning-free ID customization method for text-to-image generation.
Yamer's Realistic SDXL
The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.
GPT 4 turbo
GPT-4 outperforms both previous large language models and as of 2023, most state-of-the-art systems (which often have benchmark-specific training or hand-engineering). On the MMLU benchmark, an English-language suite of multiple-choice questions covering 57 subjects, GPT-4 not only outperforms existing models by a considerable margin in English, but also demonstrates strong performance in other languages. Currently points to gpt-4-turbo-2024-04-09.
GPT 4o
GPT-4o (“o” for “omni”) is our most advanced model. It is multimodal (accepting text or image inputs and outputting text), and it has the same high intelligence as GPT-4 Turbo but is much more efficient—it generates text 2x faster and is 50% cheaper. Additionally, GPT-4o has the best vision and performance across non-English languages of any of our models. GPT-4o is available in the OpenAI API to paying customers.
GPT 4
GPT-4 outperforms both previous large language models and as of 2023, most state-of-the-art systems (which often have benchmark-specific training or hand-engineering). On the MMLU benchmark, an English-language suite of multiple-choice questions covering 57 subjects, GPT-4 not only outperforms existing models by a considerable margin in English, but also demonstrates strong performance in other languages.
Mixtral 8x7b
Mistral MoE 8x7B Instruct v0.1 model with Sparse Mixture of Experts. Fine tuned for instruction following.
Mixtral 8x22b
Mistral MoE 8x22B Instruct v0.1 model with Sparse Mixture of Experts. Fine tuned for instruction following.
PuLID Lightning
Faster version of PuLID, a novel tuning-free face customization method for text-to-image generation
Fashion AI
This model is capable of editing clothing in an image using a premier clothing segmentation algorithm.
face-to-many
Turn a face into 3D, emoji, pixel art, video game, claymation or toy
face-to-sticker
Turn a face into a sticker
Llama 3 8b
Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.
material-transfer
Transfer a material from an image to a subject
Llama 3 70b
Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks.
Faceswap V2
Take a picture/gif and replace the face in it with a face of your choice. You only need one image of the desired face. No dataset, no training
Insta Depth
InstantID aims to generate customized images with various poses or styles from only a single reference ID image while ensuring high fidelity
Background Removal V2
This model removes the background from any image.
NewReality Lightning SDXL
NewReality Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
DreamShaper Lightning SDXL
DreamShaper Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Colossus Lightning SDXL
Colossus Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Samaritan Lightning SDXL
Samaritan Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Realism Lightning SDXL
Realism Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
ProtoVision Lightning SDXL
ProtoVision Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
NightVis Lightning SDXL
NightVis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
WildCard Lightning SDXL
WildCard Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Dynavis Lightning SDXL
Dynavis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Juggernaut Lightning SDXL
Juggernaut Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Realvis Lightning SDXL
Realvis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in a few steps.
Try-On Diffusion
Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on
Background Replace
This model is capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask
Fooocus Inpainting
Fooocus Inpainting is a powerful image generation model that allows you to selectively edit and enhance images.
Fooocus Outpainting
Fooocus Outpainting transforms ordinary images into extraordinary works of art by seamlessly expanding their boundaries.
InstantID
InstantID aims to generate customized images with various poses or styles from only a single reference ID image while ensuring high fidelity
Samaritan 3D XL
Samaritan 3D XL leverages the robust capabilities of the SDXL framework, ensuring high-quality, detailed 3D character renderings.
Stable Video Diffusion
Takes image as input and returns a video.
Segmind-Vega
The Segmind-Vega model is a distilled version of Stable Diffusion XL (SDXL), offering a remarkable 70% reduction in size and an impressive 100% speedup while retaining high-quality text-to-image generation capabilities.
Segmind-VegaRT
Segmind-VegaRT is a distilled consistency adapter for Segmind-Vega that reduces the number of inference steps to between 2 and 8.
IP-adapter Openpose XL
IP Adapter XL Openpose is built on the SDXL framework. This model integrates the IP Adapter and Openpose preprocessor to offer unparalleled control and guidance in creating context-rich images.
IP-adapter Canny XL
IP Adapter XL Canny is built on the SDXL framework. This model integrates the IP Adapter and Canny edge preprocessor to offer unparalleled control and guidance in creating context-rich images.
IP-adapter Depth XL
IP Adapter Depth XL is built on the SDXL framework. This model integrates the IP Adapter and Depth preprocessor to offer unparalleled control and guidance in creating context-rich images.
SDXL Inpaint
This model is capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask
SSD Img2Img
This model uses SSD-1B to generate images by passing a text prompt and an initial image to condition the generation
SDXL-Openpose
This model leverages SDXL to generate the images with ControlNet conditioned on Human Pose Estimation.
SSD-Depth
This model leverages SSD-1B to generate the images with ControlNet conditioned on Depth Estimation
SSD-Canny
This model leverages SSD-1B to generate the images with ControlNet conditioned on Canny Images
SSD-1B
The Segmind Stable Diffusion Model (SSD-1B) is a distilled version of Stable Diffusion XL (SDXL) that is 50% smaller and offers a 60% speedup while maintaining high-quality text-to-image generation capabilities. It has been trained on diverse datasets, including Grit and Midjourney scrape data, to enhance its ability to create a wide range of visual content based on textual prompts.
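To make the size and speed figures concrete, here is a small arithmetic sketch. It assumes that "a 60% speedup" means throughput rises to 1.6 times the baseline, which is one common reading and not something the description states. The function names are hypothetical.

```python
def relative_size(reduction_percent):
    """Fraction of the baseline size left after a percentage reduction."""
    return 1.0 - reduction_percent / 100.0


def relative_latency(speedup_percent):
    """Latency as a fraction of baseline.

    Assumption: a p% speedup multiplies throughput by (1 + p/100),
    so time per image is divided by the same factor.
    """
    return 1.0 / (1.0 + speedup_percent / 100.0)


size_fraction = relative_size(50)        # SSD-1B: 50% smaller than SDXL
latency_fraction = relative_latency(60)  # SSD-1B: 60% speedup over SDXL

print(f"size: {size_fraction:.3f} of baseline")
print(f"latency: {latency_fraction:.3f} of baseline")
```

Under this reading, a 60% speedup brings generation time down to roughly five-eighths of the baseline, and a 100% speedup would halve it.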
Copax Timeless SDXL
The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.
Zavychroma SDXL
The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.
Realvis SDXL
The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.
Dreamshaper SDXL
The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.
Stable Diffusion 2.1
Stable Diffusion is a type of latent diffusion model that can generate images from text. It was created by a team of researchers and engineers from CompVis, Stability AI, and LAION. Stable Diffusion v2 is a specific version of the model architecture. It utilizes a downsampling-factor 8 autoencoder with an 865M UNet and OpenCLIP ViT-H/14 text encoder for the diffusion model. When using the SD 2-v model, it produces 768x768 px images. It uses the penultimate text embeddings from a CLIP ViT-H/14 text encoder to condition the generation process.
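The downsampling factor mentioned above determines the size of the latent grid that the diffusion model works on. A minimal sketch, assuming the factor applies equally to height and width; latent_size is a hypothetical helper and not part of any library.

```python
def latent_size(height, width, downsampling_factor=8):
    """Spatial size of the latent grid for a given image size."""
    if height % downsampling_factor or width % downsampling_factor:
        raise ValueError("image dimensions must be multiples of the factor")
    return height // downsampling_factor, width // downsampling_factor


print(latent_size(768, 768))  # (96, 96)
```

With a factor of 8, a 768x768 image corresponds to a 96x96 latent grid, so the UNet processes 64 times fewer spatial positions than it would in pixel space.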
Stable Diffusion XL 0.9
The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software.
Word2img
Create beautifully designed words using Segmind’s word to image for your marketing purposes
Segmind Tiny-SD
Convert Text into Images with the latest distilled stable diffusion model
Stable Diffusion Inpainting
Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask
Stable Diffusion img2img
This model uses the diffusion-denoising mechanism first proposed by SDEdit, applying Stable Diffusion to text-guided image-to-image translation. It uses the weights from Stable Diffusion to generate new images from an input image using StableDiffusionImg2ImgPipeline from diffusers.
Segmind Small-SD
Create realistic portrait images using the fine-tuned Segmind Tiny SD model. With the Segmind Tiny SD (Portrait) serverless APIs, Segmind offers the fastest deployment for Tiny-Stable-Diffusion inference.
Stable Diffusion XL 1.0
The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software
Scifi
A versatile photorealistic model that blends various models to produce amazingly realistic images.
Samaritan
A versatile photorealistic model that blends various models to produce amazingly realistic images.
RPG
This model corresponds to the Stable Diffusion RPG checkpoint for detailed images at the cost of a super detailed prompt
Reliberate
This model corresponds to the Stable Diffusion Reliberate checkpoint for detailed images at the cost of a super detailed prompt
Realistic Vision
This model corresponds to the Stable Diffusion Realistic Vision checkpoint for detailed images at the cost of a super detailed prompt
RCNZ - Cartoon
A versatile photorealistic model that blends various models to produce amazingly realistic images.
Paragon
This model corresponds to the Stable Diffusion Paragon checkpoint for detailed images at the cost of a super detailed prompt
SD Outpainting
Stable Diffusion Outpainting can extend any image in any direction
Manmarumix
A versatile photorealistic model that blends various models to produce amazingly realistic images.
Majicmix
A versatile photorealistic model that blends various models to produce amazingly realistic images.
Juggernaut Final
A versatile photorealistic model that blends various models to produce amazingly realistic images.
Fruit Fusion
A versatile photorealistic model that blends various models to produce amazingly realistic images.
Flat 2d
A versatile photorealistic model that blends various models to produce amazingly realistic images.
Fantassified Icons
A versatile photorealistic model that blends various models to produce amazingly realistic images.
Epic Realism
This model corresponds to the Stable Diffusion Epic Realism checkpoint for detailed images at the cost of a super detailed prompt
Edge of Realism
This model corresponds to the Stable Diffusion Edge of Realism checkpoint for detailed images at the cost of a super detailed prompt
DvArch
A versatile photorealistic model that blends various models to produce amazingly realistic images.
Dream Shaper
Dreamshaper excels in delivering high-quality, detailed images. It is fine-tuned to understand and interpret a diverse range of artistic styles and subjects.
Deep Spaced Diffusion
A versatile photorealistic model that blends various models to produce amazingly realistic space-themed images.
Cyber Realistic
A versatile photorealistic model that blends various models to produce amazingly realistic images.
Cute Rich Style
A versatile photorealistic model that blends various models to produce amazingly realistic images.
Colorful
This model corresponds to the Stable Diffusion Colorful checkpoint for detailed images at the cost of a super detailed prompt
All in one pixe
A versatile photorealistic model that blends various models to produce amazingly realistic images.
526mix
A versatile photorealistic model that blends various models to produce amazingly realistic images.
QR Generator
Create beautiful and creative QR codes for your marketing campaigns.
Segmind Tiny-SD (Portrait)
Convert text to images with Small-SD, the distilled Stable Diffusion model by Segmind. With the Segmind Small SD serverless APIs, Segmind offers the fastest deployment for Small-Stable-Diffusion inference.
Kandinsky 2.1
Kandinsky inherits best practices from Dall-E 2 and Latent diffusion, while introducing some new ideas.
ControlNet Soft Edge
This model corresponds to the ControlNet conditioned on Soft Edge.
ControlNet Scribble
This model corresponds to the ControlNet conditioned on Scribble images.
ControlNet Depth
This model corresponds to the ControlNet conditioned on Depth estimation.
ControlNet Canny
This model corresponds to the ControlNet conditioned on Canny edges.
Codeformer
CodeFormer is a robust face restoration algorithm for old photos or AI-generated faces.
Segment Anything Model
The Segment Anything Model (SAM) produces high quality object masks from input prompts such as points or boxes, and it can be used to generate masks for all objects in an image.
Faceswap
Take a picture/gif and replace the face in it with a face of your choice. You only need one image of the desired face. No dataset, no training
Revanimated
This model corresponds to the Stable Diffusion Revanimated checkpoint for detailed images at the cost of a super detailed prompt
Background Removal
This model removes the background from any image.
ESRGAN
AI-powered image super-resolution, upscaling, and enhancement that produces high-quality results.
ControlNet Openpose
This model corresponds to the ControlNet conditioned on Human Pose Estimation.