To call the model through the API, pick the example for your preferred programming language.
import requests
import base64

# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url)
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/live-portrait-video-to-video"

# Request payload
data = {
    "input_video": "https://segmind-sd-models.s3.amazonaws.com/display_images/live-portrait-v2v/live_portrait_v2v_input_vid.mp4",
    "driving_video": "https://segmind-sd-models.s3.amazonaws.com/display_images/live-portrait-v2v/livie_portrait_driving_vid.mp4",
    "dsize": 512,
    "scale": 2.3,
    "driving_audio": False,
    "vx_ratio": 0,
    "vy_ratio": -0.125,
    "input_face_index": 0,
    "drive_face_index": 0,
    "crop_drive_face": False,
    "lip_zero": True,
    "lip_zero_threshold": 0.03,
    "eye_retargeting": False,
    "eyes_retargeting_multiplier": 1,
    "lip_retargeting": False,
    "lip_retargeting_multiplier": 1,
    "stitching": True,
    "relative": True,
    "mismatch_method": "cut",
    "video_frame_load_cap": 120,
    "base64": False
}

headers = {'x-api-key': api_key}

response = requests.post(url, json=data, headers=headers)
print(response.content)  # The response is the generated video
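The request above prints the raw response bytes. As a minimal sketch, assuming a successful call returns the video bytes directly in the response body (inferred from the example with "base64" set to False, not confirmed), the body could be written to a file instead. The helper name save_video and the output file name are hypothetical.

```python
from pathlib import Path


def save_video(content: bytes, path: str) -> int:
    """Write raw response bytes to `path` and return the number of bytes written."""
    if not content:
        raise ValueError("empty response body; nothing to save")
    return Path(path).write_bytes(content)


# Usage with the `response` object from the request above (not executed here):
# response.raise_for_status()   # raises requests.HTTPError on a 4xx/5xx status
# save_video(response.content, "output.mp4")
```

Calling raise_for_status() first avoids saving an error message as if it were a video file.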
input_video: Input video.
driving_video: Driving video.
dsize: Size of the video (min: 64, max: 2048).
scale: Scale of the video (min: 1, max: 4).
driving_audio: Set to 'true' to return the audio of the driving video; if 'false', the audio of the input video is returned.
vx_ratio: Horizontal ratio for transformation (min: -1, max: 1).
vy_ratio: Vertical ratio for transformation (min: -1, max: 1).
input_face_index: Index of the input face (min: 0, max: 5).
drive_face_index: Index of the driving face (min: 0, max: 5).
crop_drive_face: Crop the driving face.
lip_zero: Zero out the lips.
lip_zero_threshold: Threshold for zeroing out the lips (min: 0, max: 5).
eye_retargeting: Enable eye retargeting.
eyes_retargeting_multiplier: Multiplier for eye retargeting (min: 0.01, max: 10).
lip_retargeting: Enable lip retargeting.
lip_retargeting_multiplier: Multiplier for lip retargeting (min: 0.01, max: 10).
stitching: Enable stitching.
relative: Use relative method.
mismatch_method: Method for mismatch handling (the example request above uses "cut").
video_frame_load_cap: The maximum number of frames to load from the driving video. Set to 0 to use all frames.
base64: Base64 encoding of the output.
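The numeric ranges listed above can be checked on the client before a request is sent. The following is a sketch only: the mapping of ranges to payload keys is inferred by matching the order of the descriptions to the order of the keys in the example payload, and the names NUMERIC_RANGES and validate_payload are hypothetical, not part of the Segmind API.

```python
# Ranges inferred from the parameter descriptions (assumption: they map to
# the payload keys in the order the keys appear in the example request).
NUMERIC_RANGES = {
    "dsize": (64, 2048),
    "scale": (1, 4),
    "vx_ratio": (-1, 1),
    "vy_ratio": (-1, 1),
    "input_face_index": (0, 5),
    "drive_face_index": (0, 5),
    "lip_zero_threshold": (0, 5),
    "eyes_retargeting_multiplier": (0.01, 10),
    "lip_retargeting_multiplier": (0.01, 10),
}


def validate_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means no range violations."""
    problems = []
    for key, (low, high) in NUMERIC_RANGES.items():
        if key not in payload:
            continue  # only keys that are present are checked
        value = payload[key]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{key}: expected a number, got {value!r}")
        elif not low <= value <= high:
            problems.append(f"{key}: {value} is outside [{low}, {high}]")
    return problems


print(validate_payload({"dsize": 512, "scale": 2.3, "vy_ratio": -0.125}))
print(validate_payload({"dsize": 32}))
```

Returning a list rather than raising on the first problem lets a caller report every out-of-range value at once.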
To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
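As a sketch of how this value might be read, the helper below looks up x-remaining-credits case-insensitively and tries to parse it as a number. The numeric format of the header value is an assumption, and remaining_credits is a hypothetical helper name.

```python
from typing import Mapping, Optional


def remaining_credits(headers: Mapping[str, str]) -> Optional[float]:
    """Return x-remaining-credits as a float, or None if absent or not numeric."""
    for name, value in headers.items():
        if name.lower() == "x-remaining-credits":
            try:
                return float(value)
            except (TypeError, ValueError):
                return None
    return None


# Usage with a requests response (not executed here):
# credits = remaining_credits(response.headers)
# if credits is not None and credits < 10:   # threshold chosen arbitrarily
#     print("Running low on credits:", credits)
```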
The Live Portrait Video Model is a robust deep learning tool designed to facilitate the generation of realistic video portraits. This model leverages advanced neural network architectures to convert an input video of a subject and a driving video into a seamlessly animated output. It captures subtle facial expressions and movements, ensuring that the resulting video maintains a high level of accuracy and realism.
Input Video: Upload the source video featuring the subject whose portrait you want to animate.
Driving Video: Upload the driving video that contains the desired expressions and movements.
Generate Output: Click on the "Generate" button to create the animated portrait video.
The model offers several parameters that can be fine-tuned to achieve desired outputs:
Input Face Index
Use the Input Face Index when the input video contains multiple faces and you need to specify which face to animate. Identify the index of the face in the input video, then set the Input Face Index parameter to the corresponding index number.
Drive Face Index
Similar to the Input Face Index, use the Drive Face Index when the driving video contains multiple faces. Determine the index of the desired face in the driving video, then set the Drive Face Index parameter to the appropriate index number.
Mismatch Method
Use the Mismatch Method when there are significant differences between the facial features of the input and driving videos. Choose from available options like 'cut', 'blend', etc. Select the method that best handles the discrepancies ensuring smooth animation.
Video Frame Load Cap
Use this option to limit how many frames are loaded from the driving video, which trades processing time against how much of the driving video is used. Set a lower value to speed up processing. Set a higher value, or 0 to use all frames, when the full driving video is needed.
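The parameter list describes video_frame_load_cap as the maximum number of frames loaded from the driving video, with 0 meaning all frames. As a rough worked example, assuming a constant frame rate (the 30 fps figure, the 300-frame length, and the function name are illustrative assumptions), the number of frames actually used and the resulting driving duration can be estimated like this:

```python
def frames_used(total_frames: int, frame_cap: int) -> int:
    """Frames loaded under a cap; a cap of 0 means 'use all frames'."""
    if total_frames < 0 or frame_cap < 0:
        raise ValueError("frame counts must be non-negative")
    return total_frames if frame_cap == 0 else min(total_frames, frame_cap)


fps = 30.0     # assumed frame rate of the driving video
total = 300    # assumed length of the driving video in frames
used = frames_used(total, 120)
print(used, used / fps)  # 120 4.0
```

With the cap of 120 from the example request, only the first 4 seconds of such a 10-second driving video would be loaded.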
Crop Drive Face
Use Crop Drive Face when you need to focus on a particular area of the driving face, removing unnecessary elements. The model will automatically crop the driving face.
Lip Zero
Use the Lip Zero parameter to control the sensitivity of lip movements in the generated video. Adjust the Lip Zero Threshold to the desired sensitivity level. Lower values make small lip movements more prominent.
Eye Retargeting
Enable eye retargeting to achieve more realistic eye movements in the animated portrait. Set the Eyes Retargeting Multiplier to control the intensity of eye movements.
Lip Retargeting
Enable lip retargeting to replicate the driving video's lip movements accurately in the animated portrait. Set the Lip Retargeting Multiplier to adjust the extent of lip movement replication.
Stitching
Use stitching to smooth transitions between frames, particularly when there are noticeable seams or discontinuities. The model will apply stitching techniques to create seamless transitions.
Relative Method
Utilize the Relative Method for finer control over the animation, typically when exact alignment with the driving video is not required. The model will use relative positioning to make nuanced adjustments, allowing for more fluid animation.
For optimal results, adjust these parameters based on the specific requirements of your project and the characteristics of your input and driving videos.
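One way to adjust a few parameters per project while keeping the rest at the values from the example request is to start from a base dictionary and apply overrides. This is a sketch under assumptions: DEFAULTS simply copies the values shown in the example payload (they are not confirmed API defaults), build_payload is a hypothetical helper, and the URLs are placeholders.

```python
# Values copied from the example request; not confirmed API defaults.
DEFAULTS = {
    "dsize": 512, "scale": 2.3, "driving_audio": False,
    "vx_ratio": 0, "vy_ratio": -0.125,
    "input_face_index": 0, "drive_face_index": 0,
    "crop_drive_face": False,
    "lip_zero": True, "lip_zero_threshold": 0.03,
    "eye_retargeting": False, "eyes_retargeting_multiplier": 1,
    "lip_retargeting": False, "lip_retargeting_multiplier": 1,
    "stitching": True, "relative": True,
    "mismatch_method": "cut", "video_frame_load_cap": 120,
    "base64": False,
}


def build_payload(input_video: str, driving_video: str, **overrides) -> dict:
    """Merge overrides into a copy of DEFAULTS, rejecting misspelled keys."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown parameter(s): {sorted(unknown)}")
    payload = dict(DEFAULTS)
    payload.update(overrides)
    payload["input_video"] = input_video
    payload["driving_video"] = driving_video
    return payload


# Example: animate the face with index 1 and enable eye retargeting.
payload = build_payload(
    "https://example.com/input.mp4",     # placeholder URL
    "https://example.com/driving.mp4",   # placeholder URL
    input_face_index=1,
    eye_retargeting=True,
    eyes_retargeting_multiplier=1.5,
)
print(payload["input_face_index"], payload["eye_retargeting"])
```

Rejecting unknown keys catches typos such as eye_retargeting_multiplier (the payload key is eyes_retargeting_multiplier) before a request is made.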
The Live Portrait Video Model is ideal for applications in the fields of animation, entertainment, and communication. It can be used for creating animated portraits for films, games, virtual influencers, educational content, and more. The model ensures high-quality and realistic animations that can enhance user engagement in various multimedia projects.