To call the API, pick the example for your preferred programming language.
import base64

import requests


# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')


# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url)
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')


api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/sound-generation"

# Request payload
data = {
    "text": "Looking good",
    "duration_seconds": 1,
    "prompt_influence": 0.3
}

headers = {'x-api-key': api_key}

response = requests.post(url, json=data, headers=headers)
print(response.content)  # The response body is the generated sound
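Printing the raw bytes is rarely useful, so a common next step is to write the response body to a file. The sketch below assumes that a successful call returns status 200 with the audio bytes as the body, and that a failed call returns an error message instead. The helper name `save_audio_response` and the output filename are hypothetical, and the audio container format is not stated above, so choose the file extension after checking the response's `Content-Type` header.

```python
from pathlib import Path


def save_audio_response(response, output_path):
    """Write the body of a successful response to disk and return the path.

    `response` is expected to behave like a `requests.Response`:
    it needs `status_code`, `text` and `content` attributes.
    """
    if response.status_code != 200:
        # Assumption: on failure the body holds an error message, not audio.
        raise RuntimeError(
            f"request failed with status {response.status_code}: "
            f"{response.text[:200]}"
        )
    if not response.content:
        raise ValueError("response body is empty; nothing to save")
    path = Path(output_path)
    path.write_bytes(response.content)  # raw bytes, written unchanged
    return path
```

With the `response` object from the example above, usage would look like `save_audio_response(response, "sound_effect.mp3")`, where the `.mp3` extension is a guess rather than a documented format.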
text: The text that will be converted into a sound effect.
duration_seconds: The duration of the generated sound, in seconds. Must be at least 0.5 and at most 22. If set to None, the optimal duration is guessed from the prompt. Defaults to None.
min: 0.5, max: 22
prompt_influence: A higher prompt influence makes the generation follow the prompt more closely while also making generations less variable. Must be a value between 0 and 1. Defaults to 0.3.
min: 0, max: 1
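The documented ranges can be checked on the client before a request is sent, which avoids spending a call on a payload the server would reject. The following is a minimal sketch: `build_payload` is a hypothetical helper, not part of the API, and it assumes that leaving `duration_seconds` out of the payload is equivalent to the documented default of None.

```python
def build_payload(text, duration_seconds=None, prompt_influence=0.3):
    """Validate parameters against the documented ranges and build the request body."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    if duration_seconds is not None and not 0.5 <= duration_seconds <= 22:
        raise ValueError("duration_seconds must be between 0.5 and 22")
    if not 0 <= prompt_influence <= 1:
        raise ValueError("prompt_influence must be between 0 and 1")

    payload = {"text": text, "prompt_influence": prompt_influence}
    # Assumption: omitting duration_seconds lets the service pick the duration,
    # matching the documented default of None.
    if duration_seconds is not None:
        payload["duration_seconds"] = duration_seconds
    return payload
```

The returned dictionary can be passed as the `json=` argument of `requests.post`, in place of the hand-written `data` dictionary in the example request.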
To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
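As a rough illustration of reading that header, the sketch below extracts `x-remaining-credits` from a headers mapping. The helper name `remaining_credits` is hypothetical, and the code assumes the header value is numeric. It returns None when the header is missing or cannot be parsed, so the caller decides how to react.

```python
def remaining_credits(headers):
    """Return the x-remaining-credits header as a float, or None if unavailable."""
    value = None
    # Header names are case-insensitive, so compare in lower case. This also
    # works with a plain dict, not only with requests' case-insensitive headers.
    for name, header_value in headers.items():
        if name.lower() == "x-remaining-credits":
            value = header_value
            break
    if value is None:
        return None
    try:
        # Assumption: the service reports credits as a plain number.
        return float(value)
    except (TypeError, ValueError):
        return None
```

After a call made with `requests`, this would be used as `remaining_credits(response.headers)`, for example to log a warning once the balance drops below a threshold of your choosing.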
Eleven Labs' Sound Generation API provides a robust development tool for programmatically generating audio content using artificial intelligence. This API allows developers and creators to integrate sound generation functionalities into their applications and workflows.
Text-to-Sound Conversion: Transform textual descriptions of sounds into corresponding audio files. Users can specify desired sound types, durations, and intensity for precise control.
Custom Audio Synthesis: Generate unique audio samples based on user-defined parameters, enabling the creation of novel and specific sound effects.
Multilingual Support: Generate sound effects from text descriptions in various languages, expanding the reach and creative potential of audio projects.
Seamless Integration: Integrate the API into existing development environments for efficient audio generation within applications and games.
Enhanced Content Creation: Streamline sound effect generation within game development, video production, and other creative processes.
Efficient Workflow Integration: Integrate audio creation directly into development workflows, eliminating the need for separate sound design tools.
Scalable Audio Production: Generate large volumes of sound effects on-demand, facilitating efficient content creation.
Custom Audio Exploration: Experiment with user-defined parameters to explore new and unique sound design possibilities.
Multilingual Content Development: Create sound effects for a global audience by leveraging multilingual text descriptions.