High-performance speech-to-text, emotion detection, and demographic analysis API
Modulate Platform API provides enterprise-grade speech analysis with AI-powered transcription, emotion detection, and demographic insights. Built for scale with async processing and optimized for accuracy.
State-of-the-art ML models for accurate transcription and analysis
Optimized pipeline with 15-second chunking for parallel processing
Flexible pricing - only pay for the features you use
All API requests require authentication headers. Get your credentials from the dashboard.
accountuuid: 12345678-1234-1234-1234-123456789abc
apikey: abcdef12-3456-7890-abcd-ef1234567890
Content-Type: application/json
Get started in under 5 minutes with a simple transcription request:
1. Convert your audio file to base64 before submission.
2. Send a POST request to /api_service with your encoded audio.
3. Poll GET /api_service/job_status/{job_id} until your transcript is ready.
# 1. Submit audio for transcription
curl -X POST "https://cloud-processing-api.modulate.ai/api_service" \
-H "accountuuid: 12345678-1234-1234-1234-123456789abc" \
-H "apikey: abcdef12-3456-7890-abcd-ef1234567890" \
-H "Content-Type: application/json" \
-d '{
"audio_base64": "UklGRiQAAABXQVZFZm10...",
"file_extension": "wav",
"flags": {}
}'
# Response: {"job_id": "abc-123", "status": "processing"}
# 2. Check status (repeat until completed)
curl -X GET "https://cloud-processing-api.modulate.ai/api_service/job_status/abc-123" \
-H "accountuuid: 12345678-1234-1234-1234-123456789abc" \
-H "apikey: abcdef12-3456-7890-abcd-ef1234567890"
POST /api_service
{
"audio_base64": "base64-encoded-audio-data",
"file_extension": "mp3|wav|opus",
"flags": {
"emotion_annotation": true, // Optional: +1 API call
"demographics": true // Optional: +1 API call
},
"filename": "my-audio-file.mp3" // Optional
}
{
"job_id": "18cfd8ae-84f7-4f46-9a33-cecc7db432a9",
"status": "processing",
"api_calls_charged": 3,
"estimated_processing_time": "30-60 seconds",
"message": "Audio uploaded successfully. Use job_id to check status."
}
{
"error": "Invalid file format. Supported: mp3, wav, opus",
"code": "INVALID_FORMAT",
"details": {
"file_extension": "mp4",
"supported_formats": ["mp3", "wav", "opus"]
}
}
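Since a rejected upload still costs a round trip, it can be worth mirroring the server's format check on the client side before encoding. A minimal sketch (the helper name is ours, not part of the API; the supported list comes from the error payload above):

```python
# Client-side pre-flight check mirroring the INVALID_FORMAT error above.
SUPPORTED_FORMATS = {"mp3", "wav", "opus"}

def check_extension(filename: str) -> str:
    """Return the lowercase extension, or raise ValueError before wasting an upload."""
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Invalid file format '{ext}'. Supported: {', '.join(sorted(SUPPORTED_FORMATS))}"
        )
    return ext
```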
GET /api_service/job_status/{job_id}
| Parameter | Type | Description |
|---|---|---|
| job_id | string | Job identifier from submit response |
| include_chunks | boolean | Include detailed chunk results (optional) |
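A small URL builder can keep the optional parameter in one place. This sketch assumes include_chunks is passed as a query string parameter (the table above does not specify how it is sent, so verify against your dashboard docs):

```python
from urllib.parse import urlencode

BASE_URL = "https://cloud-processing-api.modulate.ai"

def job_status_url(job_id: str, include_chunks: bool = False) -> str:
    """Build the job-status URL; include_chunks is assumed to be a query parameter."""
    url = f"{BASE_URL}/api_service/job_status/{job_id}"
    if include_chunks:
        url += "?" + urlencode({"include_chunks": "true"})
    return url
```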
{
"job_id": "18cfd8ae-84f7-4f46-9a33-cecc7db432a9",
"status": "processing",
"progress": {
"total_chunks": 12,
"completed_chunks": 8,
"percentage": 67
},
"estimated_completion": "30 seconds"
}
{
"job_id": "18cfd8ae-84f7-4f46-9a33-cecc7db432a9",
"status": "completed",
"results": {
"full_transcription": "Hello world, this is a test audio file...",
"chunks": [
{
"chunk_id": "chunk_001",
"start_time_seconds": 0,
"duration_seconds": 15,
"transcription": "Hello world, this is a test",
"emotion": "0.95", // If emotion_annotation enabled
"shouting": "0.02", // If emotion_annotation enabled
"age": "adult", // If demographics enabled
"gender": "male" // If demographics enabled
}
]
},
"processing_time_seconds": 23.4,
"api_calls_charged": 3
}
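Because each chunk carries start_time_seconds and duration_seconds, you can turn the completed payload into a simple timestamped transcript. A sketch using only the fields shown above (the formatting is our choice, not an API feature):

```python
def chunk_timeline(results: dict) -> list:
    """Format each chunk from the 'results' object above as '[start-end] text'."""
    lines = []
    for chunk in results.get("chunks", []):
        start = chunk["start_time_seconds"]
        end = start + chunk["duration_seconds"]
        lines.append(f"[{start}s-{end}s] {chunk['transcription']}")
    return lines
```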
| Feature | What it does | Cost |
|---|---|---|
| Transcription | High-accuracy speech-to-text | 1 API call per audio file |
| Emotion annotation | Anger detection & shouting identification | +1 API call if enabled |
| Demographics | Age group & gender classification | +1 API call if enabled |
{
"flags": {
"emotion_annotation": true, // Enables anger + shouting detection
"demographics": true // Enables age + gender classification
}
}
// API call calculation:
// Base transcription: 1 call
// + emotion_annotation: +1 call
// + demographics: +1 call
// Total: 3 API calls
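The calculation above is simple enough to express as a one-line helper, useful for estimating usage before submitting (the function name is ours; the pricing rule is taken directly from the calculation above):

```python
def api_calls_for(flags: dict) -> int:
    """One base call for transcription, plus one per enabled optional feature."""
    return 1 + bool(flags.get("emotion_annotation")) + bool(flags.get("demographics"))
```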
For the best developer experience, use our official Python SDK:
# Download SDK from dashboard or copy the AudioProcessingClient class
from audio_processing_sdk import AudioProcessingClient, AudioFeature
# Initialize client
client = AudioProcessingClient(
account_uuid="12345678-1234-1234-1234-123456789abc",
api_key="abcdef12-3456-7890-abcd-ef1234567890"
)
# Submit audio with features
job_id = client.submit_audio(
"my-audio.mp3",
features=[
AudioFeature.TRANSCRIPTION, # Always included (the 1 base API call)
AudioFeature.EMOTION, # +1 API call
AudioFeature.DEMOGRAPHICS # +1 API call
]
)
# Wait for completion with progress callback
def show_progress(result):
progress = result.progress
print(f"Progress: {progress['percentage']}%")
result = client.wait_for_completion(
job_id,
progress_callback=show_progress
)
# Access results
print(f"Transcription: {result.full_transcription}")
print(f"API calls used: {result.api_calls_consumed}")
for chunk in result.chunks:
print(f"Chunk {chunk.start_time_seconds}s: {chunk.transcription}")
if chunk.emotion:
print(f" Emotion: {chunk.emotion}")
import base64
import requests
import time
# Configuration
BASE_URL = "https://cloud-processing-api.modulate.ai"
ACCOUNT_UUID = "12345678-1234-1234-1234-123456789abc"
API_KEY = "abcdef12-3456-7890-abcd-ef1234567890"
headers = {
"accountuuid": ACCOUNT_UUID,
"apikey": API_KEY,
"Content-Type": "application/json"
}
# Read and encode audio file
with open("audio.mp3", "rb") as f:
audio_data = base64.b64encode(f.read()).decode()
# Submit for processing
payload = {
"audio_base64": audio_data,
"file_extension": "mp3",
"flags": {
"emotion_annotation": True,
"demographics": True
}
}
response = requests.post(f"{BASE_URL}/api_service",
json=payload, headers=headers)
job_data = response.json()
job_id = job_data["job_id"]
print(f"Job submitted: {job_id}")
# Poll for completion
while True:
status_response = requests.get(
f"{BASE_URL}/api_service/job_status/{job_id}",
headers=headers
)
result = status_response.json()
if result["status"] == "completed":
print("Processing complete!")
print(f"Transcription: {result['results']['full_transcription']}")
break
elif result["status"] == "failed":
print("Processing failed!")
break
else:
progress = result.get("progress", {})
print(f"Progress: {progress.get('percentage', 0)}%")
time.sleep(10)
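The fixed 10-second sleep above works, but for short files it wastes time and for long ones it polls more than necessary. One common alternative is exponential backoff. A sketch with an injectable status getter (in production, get_status would wrap the GET request shown above; the capped doubling schedule is our suggestion, not an API requirement):

```python
import time

def wait_for_job(get_status, initial_delay=2.0, max_delay=30.0, sleep=time.sleep):
    """Poll get_status() until the job reaches a terminal state.

    get_status: zero-argument callable returning the job-status dict.
    The delay doubles after each poll, capped at max_delay seconds.
    """
    delay = initial_delay
    while True:
        result = get_status()
        if result["status"] in ("completed", "failed"):
            return result
        sleep(delay)
        delay = min(delay * 2, max_delay)
```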
#!/bin/bash
# Configuration
BASE_URL="https://cloud-processing-api.modulate.ai"
ACCOUNT_UUID="12345678-1234-1234-1234-123456789abc"
API_KEY="abcdef12-3456-7890-abcd-ef1234567890"
# Encode audio file
# Strip newlines so the base64 string stays valid inside the JSON payload
AUDIO_B64=$(base64 < audio.mp3 | tr -d '\n')
# Submit job
RESPONSE=$(curl -s -X POST "$BASE_URL/api_service" \
-H "accountuuid: $ACCOUNT_UUID" \
-H "apikey: $API_KEY" \
-H "Content-Type: application/json" \
-d "{
\"audio_base64\": \"$AUDIO_B64\",
\"file_extension\": \"mp3\",
\"flags\": {
\"emotion_annotation\": true,
\"demographics\": true
}
}")
JOB_ID=$(echo "$RESPONSE" | jq -r '.job_id')
echo "Job ID: $JOB_ID"
# Poll for completion
while true; do
STATUS=$(curl -s -X GET "$BASE_URL/api_service/job_status/$JOB_ID" \
-H "accountuuid: $ACCOUNT_UUID" \
-H "apikey: $API_KEY")
STATUS_VALUE=$(echo "$STATUS" | jq -r '.status')
if [ "$STATUS_VALUE" = "completed" ]; then
echo "Processing complete!"
echo "$STATUS" | jq -r '.results.full_transcription'
break
elif [ "$STATUS_VALUE" = "failed" ]; then
echo "Processing failed!"
break
else
PERCENTAGE=$(echo "$STATUS" | jq -r '.progress.percentage // 0')
echo "Progress: $PERCENTAGE%"
sleep 10
fi
done
const BASE_URL = "https://cloud-processing-api.modulate.ai";
const ACCOUNT_UUID = "12345678-1234-1234-1234-123456789abc";
const API_KEY = "abcdef12-3456-7890-abcd-ef1234567890";
const headers = {
'accountuuid': ACCOUNT_UUID,
'apikey': API_KEY,
'Content-Type': 'application/json'
};
async function processAudio(audioFile) {
// Convert file to base64
const reader = new FileReader();
const audioBase64 = await new Promise((resolve, reject) => {
reader.onload = () => resolve(reader.result.split(',')[1]);
reader.onerror = () => reject(reader.error);
reader.readAsDataURL(audioFile);
});
// Submit job
const submitResponse = await fetch(`${BASE_URL}/api_service`, {
method: 'POST',
headers: headers,
body: JSON.stringify({
audio_base64: audioBase64,
file_extension: audioFile.name.split('.').pop(),
flags: {
emotion_annotation: true,
demographics: true
}
})
});
const submitData = await submitResponse.json();
const jobId = submitData.job_id;
console.log(`Job submitted: ${jobId}`);
// Poll for completion
while (true) {
const statusResponse = await fetch(
`${BASE_URL}/api_service/job_status/${jobId}`,
{ headers }
);
const result = await statusResponse.json();
if (result.status === 'completed') {
console.log('Processing complete!');
console.log('Transcription:', result.results.full_transcription);
return result;
} else if (result.status === 'failed') {
throw new Error('Processing failed');
} else {
const progress = result.progress?.percentage || 0;
console.log(`Progress: ${progress}%`);
await new Promise(resolve => setTimeout(resolve, 10000));
}
}
}
// Usage with file input
document.getElementById('audioFile').addEventListener('change', async (e) => {
const file = e.target.files[0];
if (file) {
try {
const result = await processAudio(file);
console.log('Results:', result);
} catch (error) {
console.error('Error:', error);
}
}
});
| Code | Description | Common Causes |
|---|---|---|
| 200 | Success | Job status retrieved successfully |
| 202 | Accepted | Audio submitted for processing |
| 400 | Bad Request | Invalid file format, malformed JSON |
| 401 | Unauthorized | Invalid account UUID or API key |
| 404 | Not Found | Job ID not found |
| 413 | Payload Too Large | File exceeds 5MB limit |
| 429 | Too Many Requests | Quota exceeded or rate limited |
| 500 | Internal Server Error | Processing error, contact support |
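Two practical consequences of the table above: 429 and 500 are the only codes worth retrying (the 4xx client errors will fail the same way on a retry), and the 5 MB limit is easy to hit because base64 inflates the raw audio by roughly 4/3. A sketch of both checks (we assume the limit applies to the encoded payload; if it applies to the raw file, drop the inflation step):

```python
import math

# Retry only server-side and quota errors; 4xx client errors will not succeed on retry.
RETRYABLE_CODES = {429, 500}

def should_retry(status_code: int) -> bool:
    return status_code in RETRYABLE_CODES

def base64_payload_bytes(raw_bytes: int) -> int:
    """Size of the base64-encoded body: every 3 raw bytes become 4 characters."""
    return math.ceil(raw_bytes / 3) * 4
```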
{
"error": "Human-readable error message",
"code": "ERROR_CODE_CONSTANT",
"details": {
"field": "additional context",
"suggestion": "how to fix the issue"
},
"timestamp": "2025-06-15T16:50:55Z"
}
Visit your dashboard for account management and usage tracking.
For technical support or feature requests, contact our team.