To call the model via API, choose your preferred programming language. The example below uses Python.
import requests

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/veo-2-image2video"

# Prepare data and files
data = {}
files = {}

# Optional: set 'seed' for reproducible variations; leave it unset to use a random seed
# data['seed'] = 12345

# For parameter "image", you can send a raw file or a URI:
# files['image'] = open('IMAGE_PATH', 'rb')  # To send a file
data['image'] = 'https://segmind-resources.s3.amazonaws.com/input/65a44af9-de76-474e-898f-57e83b0ff3b3-a-photograph-of-a-giant-panda-swimming-i_k_2X05UAQYWIOoZJLuVpyg_eFEiq19KSYO9FzXR4tdibQ.jpeg' # To send a URI
data['prompt'] = "A photograph of a giant panda swimming in a crystal-clear outdoor pool. The panda is gracefully paddling with its black and white fur glistening in the sunlight, its playful expression clearly visible. The pool is surrounded by lush green foliage and colorful flowers, with a wooden deck leading to a grassy lawn. Soft, natural light bathes the scene, highlighting the water's clarity and the panda's movements."
data['duration'] = 5
data['aspect_ratio'] = "16:9"

headers = {'x-api-key': api_key}

# Send multipart form data if a file is attached; otherwise send JSON
if files:
    response = requests.post(url, data=data, files=files, headers=headers)
else:
    response = requests.post(url, json=data, headers=headers)

print(response.content)  # The response body is the generated video
seed — Seed for new variations. Leave empty to use a random number.
image — Input image that acts as the starting frame for the generated video. Best results with 16:9 or 9:16 and 1280x720 or 720x1280, depending on the aspect ratio you choose.
prompt — Prompt for video generation.
duration — Number of seconds of video to be generated. Allowed values:
aspect_ratio — Aspect ratio for the video. Allowed values:
To keep track of your credit usage, you can inspect the response headers of each API call. The x-remaining-credits property will indicate the number of remaining credits in your account. Ensure you monitor this value to avoid any disruptions in your API usage.
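As a minimal sketch, a small helper can pull the x-remaining-credits value out of any response's headers mapping (the helper name is illustrative; HTTP header names are matched case-insensitively):

```python
def remaining_credits(headers):
    """Return remaining Segmind credits from response headers, or None if absent.

    Accepts any mapping of header names to values (e.g. response.headers).
    Lookup is case-insensitive, per HTTP header semantics.
    """
    for key, value in headers.items():
        if key.lower() == 'x-remaining-credits':
            return int(value)
    return None

# Example with a plain dict standing in for response.headers:
print(remaining_credits({'X-Remaining-Credits': '42'}))       # prints 42
print(remaining_credits({'Content-Type': 'application/json'}))  # prints None
```

With the `response` object from the request above, this would be called as `remaining_credits(response.headers)`.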
Google Veo 2, developed by Google DeepMind, is an advanced AI-powered video generation model that transforms static images into dynamic, high-quality videos. Launched as an upgrade to its predecessor, Veo, this model leverages cutting-edge AI to deliver realistic motion and cinematic visuals, making it a powerful tool for developers and creators looking to streamline video production. Now accessible through Segmind, it is poised to redefine creative workflows.
Veo 2 excels at converting images into videos with impressive realism, supporting resolutions up to 4K and durations exceeding two minutes (claimed by Google)—though current early access limits outputs to 720p and 8 seconds. It boasts advanced control over camera angles, lens types, and cinematic effects, allowing users to specify details like "low-angle tracking shot" or "18mm lens." The model’s enhanced understanding of real-world physics ensures natural movement, such as fluid dynamics or human expressions, making it ideal for lifelike video content.
In head-to-head comparisons on MovieGenBench, a dataset by Meta featuring 1,003 prompts, Veo 2 outperformed competitors like OpenAI’s Sora Turbo and Meta’s MovieGen. Human raters favored Veo 2 for overall preference and prompt adherence, with standout scores against Sora Turbo (58.8% preference) and Minimax (55.7% accuracy). Tested at 720p, Veo 2’s 8-second clips demonstrated superior detail and realism compared to shorter 5-second outputs from other models.
Despite these strengths, Veo 2 struggles with maintaining consistency in complex scenes or intricate motions, occasionally producing artifacts like inconsistent textures or errors in human features (e.g., hands). Early access restrictions—capped resolution and duration—also limit its full potential, though future updates may address these. Complex prompts can sometimes overwhelm the model, leading to deviations from the intended output.
Veo 2 is versatile for developers and creators alike. Filmmakers can prototype scenes, marketers can craft engaging ads from product images, and educators can animate static visuals for lessons. Social media creators benefit from its ability to produce polished vlogs or influencer-style videos, while developers can integrate it into apps via Google Veo 2 APIs for automated video generation.
User feedback has been largely positive, with creators praising Veo 2’s realistic physics and prompt fidelity. User reviews highlight its image-to-video feature as a game-changer, though some note its higher cost compared to rivals. Early testers appreciate the natural results, like smooth transitions and lifelike movements, but a few criticize lingering inconsistencies, suggesting it’s not yet flawless. Overall, the creative community sees Veo 2 as a leap forward, eagerly awaiting broader access and refinements.