ElevenLabs Dubbing

ElevenLabs Dubbing uses AI to translate your audio into multiple languages. Easily create multilingual versions of your content without studios or voice actors for each language.

API

To call the model programmatically, send a request to the API from your preferred programming language. The example below uses Python.

POST

import requests

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/dubbing"

# Request payload
data = {
  "source_url": "https://segmind-sd-models.s3.amazonaws.com/display_images/dubbing-op.mp3",
  "source_lang": "auto",
  "target_lang": "hi"
}

headers = {'x-api-key': api_key}

response = requests.post(url, json=data, headers=headers)
print(response.status_code)  # 200 indicates success
print(response.content)  # Raw response body returned by the dubbing endpoint

RESPONSE

image/jpeg

HTTP Response Codes

200 - OK: Dubbing output generated

401 - Unauthorized: User authentication failed

404 - Not Found: The requested URL does not exist

405 - Method Not Allowed: The requested HTTP method is not allowed

406 - Not Acceptable: Not enough credits

500 - Server Error: Server had some issue with processing
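The response codes above can be turned into readable error messages on the client side. The following is a minimal sketch, not part of the official client: the names STATUS_MESSAGES and describe_status are hypothetical, and the descriptions are copied from the list above.

```python
# Descriptions taken from the HTTP response code list above.
# STATUS_MESSAGES and describe_status are illustrative names, not part of any SDK.
STATUS_MESSAGES = {
    200: "OK",
    401: "Unauthorized: user authentication failed",
    404: "Not Found: the requested URL does not exist",
    405: "Method Not Allowed: the requested HTTP method is not allowed",
    406: "Not Acceptable: not enough credits",
    500: "Server Error: server had some issue with processing",
}


def describe_status(status_code):
    """Return a readable message for a status code, with a fallback for unlisted codes."""
    return STATUS_MESSAGES.get(status_code, f"Unexpected status code {status_code}")
```

With the requests example shown earlier, this could be called as describe_status(response.status_code) before reading response.content.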

Attributes

source_url (str, required)

Input Audio URL

source_lang (enum:str, default: auto)

Source Language

target_lang (enum:str, default: hi)

Target Language
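The attributes and their defaults can be captured in a small helper that builds the request payload. This is a sketch under assumptions: build_dubbing_payload is a hypothetical name, and the check that source_url starts with http:// or https:// is an assumed sanity check, not a documented API rule.

```python
def build_dubbing_payload(source_url, source_lang="auto", target_lang="hi"):
    """Build the JSON payload using the documented attribute names and defaults."""
    # Assumption: the input must be a reachable http(s) URL; the page only says "Input Audio URL".
    if not isinstance(source_url, str) or not source_url.startswith(("http://", "https://")):
        raise ValueError("source_url must be an http(s) URL string")
    return {
        "source_url": source_url,
        "source_lang": source_lang,
        "target_lang": target_lang,
    }
```

The returned dictionary can be passed as the json argument of requests.post, in place of the hand-written data dictionary in the example request.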

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
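Reading the x-remaining-credits header can be sketched as follows. The helper names and the threshold of 10 credits are hypothetical, and the code assumes the header value is numeric, which this page does not state explicitly.

```python
def remaining_credits(headers):
    """Return the x-remaining-credits value as a float, or None if absent or not numeric."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() == "x-remaining-credits":
            try:
                return float(value)
            except (TypeError, ValueError):
                return None
    return None


def credits_low(headers, threshold=10.0):
    """Return True when the remaining credits are known and below the threshold."""
    credits = remaining_credits(headers)
    return credits is not None and credits < threshold
```

After an API call made with requests, response.headers can be passed directly, for example credits_low(response.headers).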

Dubbing

ElevenLabs Dubbing is an AI model that translates and dubs audio content. It streamlines the process of making your audio multilingual, allowing you to reach a wider audience without needing traditional recording studios or voice actors for each target language.

Using ElevenLabs Dubbing

Audio Input: Upload audio files directly.
Language Selection: The model can automatically identify the source language of your audio, or you can choose it manually from the list of supported languages. The model supports 29 languages, and you can dub your content between any pair of them.
Target Language Selection: Select the language you want your audio translated into. ElevenLabs offers 29 languages at present: Chinese, Korean, Dutch, Turkish, Swedish, Indonesian, Filipino, Japanese, Ukrainian, Greek, Czech, Finnish, Romanian, Russian, Danish, Bulgarian, Malay, Slovak, Croatian, Classic Arabic, Tamil, English, Polish, German, Spanish, French, Italian, Hindi and Portuguese.
AI-powered Dubbing: The model will translate the audio content while attempting to match the speaker's voice characteristics, intonation, and emotional delivery in the target language.
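Dubbing one source file into several target languages amounts to sending one request per target language. The sketch below only builds the payloads. The function name payloads_for_targets is hypothetical, "hi" is the default target code shown on this page, and "es" and "fr" are assumed language codes that this page does not confirm.

```python
def payloads_for_targets(source_url, target_langs, source_lang="auto"):
    """Build one request payload per target language code."""
    return [
        {"source_url": source_url, "source_lang": source_lang, "target_lang": lang}
        for lang in target_langs
    ]


payloads = payloads_for_targets(
    "https://segmind-sd-models.s3.amazonaws.com/display_images/dubbing-op.mp3",
    ["hi", "es", "fr"],  # "es" and "fr" are assumed codes, not confirmed by this page
)
```

Each payload can then be posted separately to the dubbing endpoint with the same x-api-key header used in the example request.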

Benefits of Using ElevenLabs Dubbing

Simplified Workflow: Eliminate the need for traditional dubbing studios and voice actors for each target language. Translate and dub your audio content efficiently within a single platform.
Multilingual Reach: Expand the reach of your audio content by making it accessible to audiences speaking different languages.
Cost-effective Solution: Potentially reduce production costs associated with traditional dubbing methods.
Time-saving: Streamline your audio translation and dubbing process compared to conventional methods.
