Modulate Platform API

High-performance speech-to-text, emotion detection, and demographic analysis API

v1.0 | REST API | Async Processing

Overview

The Modulate Platform API provides speech analysis for enterprise workloads: AI-powered transcription, emotion detection, and demographic insights. Jobs are processed asynchronously: you submit audio, receive a job ID, and poll for the results.

🎯 High Accuracy

State-of-the-art ML models for accurate transcription and analysis

⚡ Fast Processing

Optimized pipeline with 15-second chunking for parallel processing

💰 Pay-per-Feature

Flexible pricing - only pay for the features you use
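
The 15-second chunking mentioned above gives a rough way to estimate how many chunks a job will report. A minimal sketch, assuming fixed, non-overlapping 15-second windows with a shorter final chunk; this page does not state how the service actually splits audio, and `expected_chunks` is a hypothetical helper, not part of the API.

```python
import math

# Taken from the feature description; zero overlap between chunks is an assumption.
CHUNK_SECONDS = 15


def expected_chunks(duration_seconds: float) -> int:
    """Estimate the chunk count for a clip, counting a partial final chunk as one."""
    if duration_seconds <= 0:
        return 0
    return math.ceil(duration_seconds / CHUNK_SECONDS)
```

Under these assumptions, a 5-minute file (the documented maximum) would be split into 20 chunks.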

Supported Formats

  • Input: MP3, WAV, Opus (up to 5MB, ~5 minutes)
  • Processing: Automatic conversion to optimized Opus chunks
  • Languages: English (multi-language support coming soon)
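
Checking a file against these limits before upload can save a request that would otherwise be rejected. A minimal sketch, assuming "5MB" means 5 × 1024 × 1024 bytes of raw audio measured before base64 encoding; neither detail is stated here, and `audio_problems` is a hypothetical helper. The approximate 5-minute duration limit is not checked.

```python
import os

SUPPORTED_EXTENSIONS = ("mp3", "wav", "opus")
# Assumption: the limit is 5 MiB of raw (pre-base64) audio.
MAX_FILE_BYTES = 5 * 1024 * 1024


def audio_problems(filename: str, size_bytes: int) -> list:
    """Return a list of human-readable problems; an empty list means the file looks OK."""
    problems = []
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in SUPPORTED_EXTENSIONS:
        allowed = ", ".join(SUPPORTED_EXTENSIONS)
        problems.append(f"unsupported format '{extension}' (expected one of: {allowed})")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")
    return problems
```

For a file on disk, the size can be taken from `os.path.getsize(path)`.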

Authentication

All API requests must include the accountuuid and apikey headers. Get your credentials from the dashboard.


accountuuid: 12345678-1234-1234-1234-123456789abc
apikey: abcdef12-3456-7890-abcd-ef1234567890
Content-Type: application/json
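
To keep credentials out of source code, the headers can be built from environment variables. A minimal sketch: the variable names `MODULATE_ACCOUNT_UUID` and `MODULATE_API_KEY` and the helper `build_auth_headers` are hypothetical, chosen for illustration. Only the header names come from the example above.

```python
import os
from typing import Mapping


def build_auth_headers(env: Mapping[str, str] = os.environ) -> dict:
    """Build the request headers; raises KeyError if a credential is missing."""
    return {
        # Hypothetical environment variable names, not defined by the API.
        "accountuuid": env["MODULATE_ACCOUNT_UUID"],
        "apikey": env["MODULATE_API_KEY"],
        "Content-Type": "application/json",
    }
```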
                

Quick Start

Get started in under 5 minutes with a simple transcription request:

1. Encode your audio
   Convert your audio file to base64 before submission.

2. Submit job
   Send a POST request to /api_service with your encoded audio.

3. Poll for results
   Check the status with GET /api_service/job_status/{job_id} until your transcript is ready.

# 1. Submit audio for transcription
curl -X POST "https://cloud-processing-api.modulate.ai/api_service" \
  -H "accountuuid: 12345678-1234-1234-1234-123456789abc" \
  -H "apikey: abcdef12-3456-7890-abcd-ef1234567890" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_base64": "UklGRiQAAABXQVZFZm10...",
    "file_extension": "wav",
    "flags": {}
  }'

# Response: {"job_id": "abc-123", "status": "processing"}

# 2. Check status (repeat until completed)
curl -X GET "https://cloud-processing-api.modulate.ai/api_service/job_status/abc-123" \
  -H "accountuuid: 12345678-1234-1234-1234-123456789abc" \
  -H "apikey: abcdef12-3456-7890-abcd-ef1234567890"

API Endpoints

POST Submit Audio for Processing

POST /api_service

Request Body
{
  "audio_base64": "base64-encoded-audio-data",
  "file_extension": "mp3|wav|opus",
  "flags": {
    "emotion_annotation": true,    // Optional: +1 API call
    "demographics": true          // Optional: +1 API call
  },
  "filename": "my-audio-file.mp3"  // Optional
}
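
The request body above can be assembled with a small helper. This is a sketch, not SDK code: `build_payload` and its keyword arguments are hypothetical. Disabled features are left out of `flags` rather than sent as `false`, which matches the empty `"flags": {}` used in the Quick Start.

```python
import base64
from typing import Optional


def build_payload(audio_bytes: bytes, file_extension: str,
                  emotion: bool = False, demographics: bool = False,
                  filename: Optional[str] = None) -> dict:
    """Build the JSON body for POST /api_service from raw audio bytes."""
    flags = {}
    if emotion:
        flags["emotion_annotation"] = True
    if demographics:
        flags["demographics"] = True
    payload = {
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        "file_extension": file_extension,
        "flags": flags,
    }
    if filename is not None:  # "filename" is optional in the request body
        payload["filename"] = filename
    return payload
```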
Success Response (202 Accepted)
{
  "job_id": "18cfd8ae-84f7-4f46-9a33-cecc7db432a9",
  "status": "processing",
  "api_calls_charged": 3,
  "estimated_processing_time": "30-60 seconds",
  "message": "Audio uploaded successfully. Use job_id to check status."
}
Error Response (400 Bad Request)
{
  "error": "Invalid file format. Supported: mp3, wav, opus",
  "code": "INVALID_FORMAT",
  "details": {
    "file_extension": "mp4",
    "supported_formats": ["mp3", "wav", "opus"]
  }
}

GET Get Job Status & Results

GET /api_service/job_status/{job_id}

Parameters

Parameter       Type     Description
job_id          string   Job identifier from the submit response
include_chunks  boolean  Include detailed chunk results (optional)
Response (processing)
{
  "job_id": "18cfd8ae-84f7-4f46-9a33-cecc7db432a9",
  "status": "processing",
  "progress": {
    "total_chunks": 12,
    "completed_chunks": 8,
    "percentage": 67
  },
  "estimated_completion": "30 seconds"
}
Response (completed)
{
  "job_id": "18cfd8ae-84f7-4f46-9a33-cecc7db432a9",
  "status": "completed",
  "results": {
    "full_transcription": "Hello world, this is a test audio file...",
    "chunks": [
      {
        "chunk_id": "chunk_001",
        "start_time_seconds": 0,
        "duration_seconds": 15,
        "transcription": "Hello world, this is a test",
        "emotion": "0.95",      // If emotion_annotation enabled
        "shouting": "0.02",     // If emotion_annotation enabled
        "age": "adult",         // If demographics enabled
        "gender": "male"        // If demographics enabled
      }
    ]
  },
  "processing_time_seconds": 23.4,
  "api_calls_charged": 3
}
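
A completed response can be turned into typed records for easier downstream use. A minimal sketch, assuming the `emotion` and `shouting` values are numeric strings as in the example (their scale is not documented here) and that `results.chunks` is present in the response. The `Chunk` class and `parse_chunks` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    chunk_id: str
    start: float              # seconds from the start of the audio
    end: float                # start + duration_seconds
    text: str
    emotion: Optional[float]  # None when emotion_annotation was not enabled
    shouting: Optional[float]


def _score(value) -> Optional[float]:
    """Convert a score such as "0.95" to a float, passing None through."""
    return None if value is None else float(value)


def parse_chunks(status_body: dict) -> list:
    """Extract Chunk records from a completed job-status response."""
    chunks = []
    for raw in status_body.get("results", {}).get("chunks", []):
        start = float(raw["start_time_seconds"])
        chunks.append(Chunk(
            chunk_id=raw["chunk_id"],
            start=start,
            end=start + float(raw["duration_seconds"]),
            text=raw["transcription"],
            emotion=_score(raw.get("emotion")),
            shouting=_score(raw.get("shouting")),
        ))
    return chunks
```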

Features & Pricing

🎙️ Transcription

High-accuracy speech-to-text

Always Included

1 API call per audio file

😡 Emotion Analysis

Anger detection & shouting identification

Optional

+1 API call if enabled

👥 Demographics

Age group & gender classification

Optional

+1 API call if enabled

Feature Flags

{
  "flags": {
    "emotion_annotation": true,    // Enables anger + shouting detection
    "demographics": true          // Enables age + gender classification
  }
}

// API call calculation:
// Base transcription: 1 call
// + emotion_annotation: +1 call
// + demographics: +1 call
// Total: 3 API calls
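
The same calculation can be expressed as a small function for estimating usage before submitting a job. A sketch based only on the pricing described above (1 call per audio file, +1 per enabled feature); `api_calls_for` is a hypothetical helper.

```python
# Optional features that each add one API call when enabled.
BILLABLE_FLAGS = ("emotion_annotation", "demographics")


def api_calls_for(flags: dict) -> int:
    """Return the API calls charged for one audio file with the given flags."""
    calls = 1  # base transcription is always included
    for name in BILLABLE_FLAGS:
        if flags.get(name):
            calls += 1
    return calls
```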

Python SDK

For the best developer experience, use our official Python SDK:

# Download SDK from dashboard or copy the AudioProcessingClient class

SDK Usage

from audio_processing_sdk import AudioProcessingClient, AudioFeature

# Initialize client
client = AudioProcessingClient(
    account_uuid="12345678-1234-1234-1234-123456789abc",
    api_key="abcdef12-3456-7890-abcd-ef1234567890"
)

# Submit audio with features
job_id = client.submit_audio(
    "my-audio.mp3",
    features=[
        AudioFeature.TRANSCRIPTION,    # Always included (1 API call)
        AudioFeature.EMOTION,          # +1 API call
        AudioFeature.DEMOGRAPHICS      # +1 API call
    ]
)

# Wait for completion with progress callback
def show_progress(result):
    progress = result.progress
    print(f"Progress: {progress['percentage']}%")

result = client.wait_for_completion(
    job_id,
    progress_callback=show_progress
)

# Access results
print(f"Transcription: {result.full_transcription}")
print(f"API calls used: {result.api_calls_consumed}")

for chunk in result.chunks:
    print(f"Chunk {chunk.start_time_seconds}s: {chunk.transcription}")
    if chunk.emotion:
        print(f"  Emotion: {chunk.emotion}")

Code Examples

Python (requests)

import base64
import requests
import time

# Configuration
BASE_URL = "https://cloud-processing-api.modulate.ai"
ACCOUNT_UUID = "12345678-1234-1234-1234-123456789abc"
API_KEY = "abcdef12-3456-7890-abcd-ef1234567890"

headers = {
    "accountuuid": ACCOUNT_UUID,
    "apikey": API_KEY,
    "Content-Type": "application/json"
}

# Read and encode audio file
with open("audio.mp3", "rb") as f:
    audio_data = base64.b64encode(f.read()).decode()

# Submit for processing
payload = {
    "audio_base64": audio_data,
    "file_extension": "mp3",
    "flags": {
        "emotion_annotation": True,
        "demographics": True
    }
}

response = requests.post(f"{BASE_URL}/api_service",
                         json=payload, headers=headers, timeout=60)
response.raise_for_status()  # Fail fast on 4xx/5xx instead of a KeyError below
job_data = response.json()
job_id = job_data["job_id"]

print(f"Job submitted: {job_id}")

# Poll for completion
while True:
    status_response = requests.get(
        f"{BASE_URL}/api_service/job_status/{job_id}",
        headers=headers,
        timeout=30
    )
    status_response.raise_for_status()
    result = status_response.json()
    
    if result["status"] == "completed":
        print("Processing complete!")
        print(f"Transcription: {result['results']['full_transcription']}")
        break
    elif result["status"] == "failed":
        print("Processing failed!")
        break
    else:
        progress = result.get("progress", {})
        print(f"Progress: {progress.get('percentage', 0)}%")
        time.sleep(10)

Bash (cURL)

#!/bin/bash

# Configuration
BASE_URL="https://cloud-processing-api.modulate.ai" 
ACCOUNT_UUID="12345678-1234-1234-1234-123456789abc"
API_KEY="abcdef12-3456-7890-abcd-ef1234567890"

# Encode the audio and build the JSON payload with jq.
# tr strips the line wrapping that some base64 implementations add, and the
# payload is streamed to curl on stdin because a base64-encoded file can
# exceed the shell's argument-length limit.
RESPONSE=$(base64 < audio.mp3 | tr -d '\n' \
  | jq -R '{audio_base64: ., file_extension: "mp3",
            flags: {emotion_annotation: true, demographics: true}}' \
  | curl -s -X POST "$BASE_URL/api_service" \
      -H "accountuuid: $ACCOUNT_UUID" \
      -H "apikey: $API_KEY" \
      -H "Content-Type: application/json" \
      --data-binary @-)

JOB_ID=$(echo "$RESPONSE" | jq -r '.job_id')
echo "Job ID: $JOB_ID"

# Poll for completion
while true; do
  STATUS=$(curl -s -X GET "$BASE_URL/api_service/job_status/$JOB_ID" \
    -H "accountuuid: $ACCOUNT_UUID" \
    -H "apikey: $API_KEY")
  
  STATUS_VALUE=$(echo "$STATUS" | jq -r '.status')
  
  if [ "$STATUS_VALUE" = "completed" ]; then
    echo "Processing complete!"
    echo "$STATUS" | jq '.results.full_transcription'
    break
  elif [ "$STATUS_VALUE" = "failed" ]; then
    echo "Processing failed!"
    break
  else
    PERCENTAGE=$(echo "$STATUS" | jq -r '.progress.percentage // 0')
    echo "Progress: $PERCENTAGE%"
    sleep 10
  fi
done

JavaScript (browser)

const BASE_URL = "https://cloud-processing-api.modulate.ai";
const ACCOUNT_UUID = "12345678-1234-1234-1234-123456789abc";
const API_KEY = "abcdef12-3456-7890-abcd-ef1234567890";

const headers = {
    'accountuuid': ACCOUNT_UUID,
    'apikey': API_KEY,
    'Content-Type': 'application/json'
};

async function processAudio(audioFile) {
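    // Convert file to base64 (strip the "data:...;base64," prefix from the data URL)
    const audioBase64 = await new Promise((resolve, reject) => {
        const reader = new FileReader();
        reader.onload = () => resolve(reader.result.split(',')[1]);
        reader.onerror = () => reject(reader.error);
        reader.readAsDataURL(audioFile);
    });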

    // Submit job
    const submitResponse = await fetch(`${BASE_URL}/api_service`, {
        method: 'POST',
        headers: headers,
        body: JSON.stringify({
            audio_base64: audioBase64,
            file_extension: audioFile.name.split('.').pop(),
            flags: {
                emotion_annotation: true,
                demographics: true
            }
        })
    });

    if (!submitResponse.ok) {
        throw new Error(`Submit failed: HTTP ${submitResponse.status}`);
    }
    const submitData = await submitResponse.json();
    const jobId = submitData.job_id;
    console.log(`Job submitted: ${jobId}`);

    // Poll for completion
    while (true) {
        const statusResponse = await fetch(
            `${BASE_URL}/api_service/job_status/${jobId}`,
            { headers }
        );
        const result = await statusResponse.json();

        if (result.status === 'completed') {
            console.log('Processing complete!');
            console.log('Transcription:', result.results.full_transcription);
            return result;
        } else if (result.status === 'failed') {
            throw new Error('Processing failed');
        } else {
            const progress = result.progress?.percentage || 0;
            console.log(`Progress: ${progress}%`);
            await new Promise(resolve => setTimeout(resolve, 10000));
        }
    }
}

// Usage with file input
document.getElementById('audioFile').addEventListener('change', async (e) => {
    const file = e.target.files[0];
    if (file) {
        try {
            const result = await processAudio(file);
            console.log('Results:', result);
        } catch (error) {
            console.error('Error:', error);
        }
    }
});

Error Handling

HTTP Status Codes

Code  Description            Common Causes
200   Success                Job status retrieved successfully
202   Accepted               Audio submitted for processing
400   Bad Request            Invalid file format, malformed JSON
401   Unauthorized           Invalid account UUID or API key
404   Not Found              Job ID not found
413   Payload Too Large      File exceeds 5MB limit
429   Too Many Requests      Quota exceeded or rate limited
500   Internal Server Error  Processing error, contact support
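
One way to act on the table above is to collapse the codes into a few client-side actions. This is a sketch of one possible policy, not documented behaviour: the action names and the choice to treat only 429 as retryable are assumptions.

```python
def classify_http_status(code: int) -> str:
    """Map a status code from the table to a coarse client-side action."""
    if code in (200, 202):
        return "ok"
    if code == 429:
        return "retry_later"      # quota or rate limit: back off, then retry
    if code in (400, 401, 404, 413):
        return "fix_request"      # repeating the same request will not help
    if code == 500:
        return "contact_support"  # the table attributes 500 to a processing error
    return "unexpected"           # a code the table does not list
```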

Error Response Format

{
  "error": "Human-readable error message",
  "code": "ERROR_CODE_CONSTANT",
  "details": {
    "field": "additional context",
    "suggestion": "how to fix the issue"
  },
  "timestamp": "2025-06-15T16:50:55Z"
}
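
The error body maps naturally onto an exception type. A minimal sketch using only the fields shown above; `ModulateAPIError` and `raise_for_api_error` are hypothetical names, not part of the SDK.

```python
class ModulateAPIError(Exception):
    """Carries the fields of the documented error response format."""

    def __init__(self, http_status: int, body: dict):
        self.http_status = http_status
        self.code = body.get("code", "UNKNOWN")
        self.details = body.get("details", {})
        self.timestamp = body.get("timestamp")
        message = body.get("error", "no error message provided")
        super().__init__(f"{self.code} (HTTP {http_status}): {message}")


def raise_for_api_error(http_status: int, body: dict) -> None:
    """Raise ModulateAPIError for 4xx/5xx responses; do nothing otherwise."""
    if http_status >= 400:
        raise ModulateAPIError(http_status, body)
```

With `requests`, this could be called as `raise_for_api_error(response.status_code, response.json())` right after each request, assuming error responses always carry a JSON body.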

Rate Limits & Quotas

📁 File Limits

  • Max file size: 5MB
  • Max duration: ~5 minutes
  • Supported formats: MP3, WAV, Opus
  • Language: English only

⚡ API Quotas

  • Monthly quota: Shown on dashboard
  • Concurrent jobs: 10 per account
  • Request rate: 60 requests/minute
  • Billing: Post-pay monthly
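
A client can stay under the request-rate quota by tracking its own recent requests. A minimal sketch of a sliding-window limiter, assuming the 60 requests/minute limit is counted over a rolling 60-second window; how the service actually measures the rate is not stated here. `RateLimiter` is a hypothetical class.

```python
import time
from collections import deque


class RateLimiter:
    """Client-side sliding-window limiter (default: 60 requests per 60 seconds)."""

    def __init__(self, max_requests: int = 60, period: float = 60.0, clock=time.monotonic):
        self._max = max_requests
        self._period = period
        self._clock = clock       # injectable so the limiter can be tested without sleeping
        self._sent = deque()      # timestamps of requests inside the current window

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self._period:
            self._sent.popleft()
        if len(self._sent) < self._max:
            return 0.0
        return self._period - (now - self._sent[0])

    def record(self) -> None:
        """Call once for every request actually sent."""
        self._sent.append(self._clock())
```

Before each request, sleep for `limiter.wait_time()` seconds, send the request, then call `limiter.record()`.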

💡 Pro Tips

  • Use only the features you need to minimize API calls
  • Implement exponential backoff for status polling
  • Cache results to avoid reprocessing the same audio
  • Monitor your quota usage on the dashboard
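
Two of the tips above, exponential backoff and result caching, can be sketched in a few lines. The starting delay, growth factor, and 30-second cap are illustrative values, not recommendations from the API, and both helper names are hypothetical.

```python
import hashlib
import json


def backoff_delays(initial: float = 2.0, factor: float = 2.0, maximum: float = 30.0):
    """Yield an endless sequence of polling delays that grow until they hit the cap."""
    delay = initial
    while True:
        yield delay
        delay = min(delay * factor, maximum)


def cache_key(audio_bytes: bytes, flags: dict) -> str:
    """Derive a stable key from the audio content and the requested features."""
    digest = hashlib.sha256(audio_bytes)
    # sort_keys makes the key independent of the order the flags were written in.
    digest.update(json.dumps(flags, sort_keys=True).encode("utf-8"))
    return digest.hexdigest()
```

In a polling loop, `for delay in backoff_delays():` followed by `time.sleep(delay)` would replace a fixed 10-second sleep. The cache key can index stored results so that identical audio with identical flags is not submitted twice.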

Need Help?

Visit your dashboard for account management and usage tracking.

For technical support or feature requests, contact our team.