ElevenLabs Sound Generation

ElevenLabs' Sound Generation API generates audio content programmatically using artificial intelligence. Developers and creators can use it to integrate sound generation into their applications and workflows.

API

The API is called with an HTTP POST request and can be used from any programming language. The example below uses Python.

POST

import requests

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/sound-generation"

# Request payload
data = {
  "text": "Looking good",
  "duration_seconds": 1,
  "prompt_influence": 0.3
}

headers = {'x-api-key': api_key}

response = requests.post(url, json=data, headers=headers)
print(response.content)  # The response body is the generated audio as raw bytes

RESPONSE

image/jpeg
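
Because the response body is raw bytes, a caller will usually want to write it to a file rather than print it. The following is a minimal sketch, not part of the official example: `extension_for` and `save_audio` are hypothetical helper names, and the file extension is derived from whatever Content-Type header the service actually returns, falling back to `.bin` when the type is unknown.

```python
import mimetypes
from pathlib import Path


def extension_for(content_type: str) -> str:
    """Pick a file extension for a Content-Type header value (hypothetical helper)."""
    # Drop parameters such as "; charset=..." before the lookup.
    mime = content_type.split(";")[0].strip().lower()
    # guess_extension returns None for unknown types, so fall back to ".bin".
    return mimetypes.guess_extension(mime) or ".bin"


def save_audio(content: bytes, content_type: str, stem: str = "sound",
               directory: Path = Path(".")) -> Path:
    """Write the raw response bytes to <directory>/<stem><extension>."""
    path = directory / f"{stem}{extension_for(content_type)}"
    path.write_bytes(content)
    return path
```

With the request from the example above, this could be called as `save_audio(response.content, response.headers.get("Content-Type", ""))`.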

HTTP Response Codes

200 - OK: Sound generated

401 - Unauthorized: User authentication failed

404 - Not Found: The requested URL does not exist

405 - Method Not Allowed: The requested HTTP method is not allowed

406 - Not Acceptable: Not enough credits

500 - Server Error: Server had some issue with processing
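
One way to act on these codes in client code is to map each one to its documented meaning and raise an error for anything other than 200. This is only a sketch: `SoundGenerationError` and `check_status` are hypothetical names introduced for illustration, and the messages are copied from the list above.

```python
# Status codes and meanings as listed in this section.
STATUS_MESSAGES = {
    401: "Unauthorized: user authentication failed",
    404: "Not Found: the requested URL does not exist",
    405: "Method Not Allowed: the requested HTTP method is not allowed",
    406: "Not Acceptable: not enough credits",
    500: "Server Error: server had some issue with processing",
}


class SoundGenerationError(Exception):
    """Hypothetical exception type for non-200 responses."""


def check_status(status_code: int) -> None:
    """Return silently on 200; raise SoundGenerationError otherwise."""
    if status_code == 200:
        return
    message = STATUS_MESSAGES.get(status_code, "Unexpected status code")
    raise SoundGenerationError(f"{status_code} - {message}")
```

After a request, `check_status(response.status_code)` would be called before reading `response.content`.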

Attributes

text: str (required)

The text that will get converted into a sound effect.

duration_seconds: float ( default: 1 ) Affects Pricing

The duration of the generated sound, in seconds. Must be at least 0.5 and at most 22. If set to None, the optimal duration is guessed from the prompt. Defaults to None.

min: 0.5, max: 22

prompt_influence: float ( default: 1 ) Affects Pricing

A higher prompt influence makes the generation follow the prompt more closely, at the cost of less variation between generations. Must be a value between 0 and 1. Defaults to 0.3.

min: 0, max: 1
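
The documented ranges can be checked on the client before a request is sent, which avoids spending a call on a payload the service would reject. The sketch below makes several assumptions: `build_payload` is a hypothetical helper, the 0.3 default for `prompt_influence` is taken from the description above, and leaving `duration_seconds` out of the payload is assumed to be equivalent to setting it to None.

```python
from typing import Optional


def build_payload(text: str,
                  duration_seconds: Optional[float] = None,
                  prompt_influence: float = 0.3) -> dict:
    """Validate the documented attribute ranges and build the request payload."""
    if not text:
        raise ValueError("text is required")
    if not 0 <= prompt_influence <= 1:
        raise ValueError("prompt_influence must be between 0 and 1")

    payload = {"text": text, "prompt_influence": prompt_influence}

    # Assumption: omitting duration_seconds lets the service pick a duration.
    if duration_seconds is not None:
        if not 0.5 <= duration_seconds <= 22:
            raise ValueError("duration_seconds must be between 0.5 and 22")
        payload["duration_seconds"] = duration_seconds
    return payload
```

The result can be passed directly as the `json` argument of `requests.post`.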

To track your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
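
Reading that header might look like the sketch below. `remaining_credits` is a hypothetical helper, and it assumes the header value is a plain number; if the header is missing or not numeric, it returns None rather than guessing.

```python
from typing import Mapping, Optional


def remaining_credits(headers: Mapping[str, str]) -> Optional[float]:
    """Return the x-remaining-credits value as a number, or None if unavailable."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lowercase.
        if name.lower() == "x-remaining-credits":
            try:
                return float(value)
            except ValueError:
                return None
    return None
```

With the `requests` library, the helper can be called as `remaining_credits(response.headers)` after each request.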

Core Functionality of ElevenLabs Sound Generation

Text-to-Sound Conversion: Transform textual descriptions of sounds into corresponding audio files. Users can set the duration of the sound and how closely the output follows the prompt.
Custom Audio Synthesis: Generate unique audio samples based on user-defined parameters, enabling the creation of novel and specific sound effects.
Multilingual Support: Generate sound effects from text descriptions in various languages, expanding the reach and creative potential of audio projects.
Seamless Integration: Integrate the API into existing development environments for efficient audio generation within applications and games.

Benefits of Using the ElevenLabs Sound Generation API

Enhanced Content Creation: Streamline sound effect generation within game development, video production, and other creative processes.
Efficient Workflow Integration: Integrate audio creation directly into development workflows, eliminating the need for separate sound design tools.
Scalable Audio Production: Generate large volumes of sound effects on-demand, facilitating efficient content creation.
Custom Audio Exploration: Experiment with user-defined parameters to explore new and unique sound design possibilities.
Multilingual Content Development: Create sound effects for a global audience by leveraging multilingual text descriptions.
